Friday, November 17 • 10:40am - 11:00am
Adaptive Scrooge - Adaptive Thrift Decoding

Deserializing Thrift blobs is a significant cost for our realtime data processing jobs. Many jobs read only a small subset of the fields but pay the price of deserializing the entire payload. AdaptiveScrooge reduces this cost in both CPU and memory. The basic idea is that we should pay less to deserialize data that we don't use. AdaptiveScrooge relies on the fact that we can find out which fields are accessed simply by sampling a few events. Based on those samples, we modify the parsing logic to cheaply skip unaccessed fields, which reduces CPU cost. Because no objects are created for the skipped portions, GC pressure drops as well. For workflows that access only a very small portion of each event, this is an order of magnitude cheaper.
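
The sample-then-skip idea can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the wire format below is a toy length-prefixed layout, not Thrift's actual protocol, and `encode`, `Record`, and `AdaptiveDecoder` are hypothetical names, not part of Scrooge or AdaptiveScrooge. The sketch decodes every field for the first few events while recording which fields the job reads, then jumps over the bytes of every other field without building an object for it.

```python
import struct

# Toy wire format (an assumption, not Thrift's real protocol):
# each field is [field_id: 2 bytes][length: 4 bytes][UTF-8 payload].
HEADER = struct.Struct(">HI")


def encode(fields):
    """Serialize a {field_id: str} dict into the toy format."""
    out = bytearray()
    for field_id, value in fields.items():
        payload = value.encode("utf-8")
        out += HEADER.pack(field_id, len(payload)) + payload
    return bytes(out)


class Record:
    """Decoded record that reports every field read to a shared set."""

    def __init__(self, values, accessed):
        self._values = values
        self._accessed = accessed

    def get(self, field_id):
        self._accessed.add(field_id)
        # Returns None for a field that was skipped during decoding.
        return self._values.get(field_id)


class AdaptiveDecoder:
    """Hypothetical decoder: full decode while sampling, skip afterwards."""

    def __init__(self, sample_size):
        self.sample_size = sample_size
        self.seen = 0
        self.accessed = set()
        self.fields_materialized = 0  # counts payloads turned into objects

    def decode(self, blob):
        sampling = self.seen < self.sample_size
        self.seen += 1
        values = {}
        offset = 0
        while offset < len(blob):
            field_id, length = HEADER.unpack_from(blob, offset)
            offset += HEADER.size
            if sampling or field_id in self.accessed:
                values[field_id] = blob[offset:offset + length].decode("utf-8")
                self.fields_materialized += 1
            # Skipped fields cost only this pointer bump: no object is created.
            offset += length
        return Record(values, self.accessed)


decoder = AdaptiveDecoder(sample_size=2)
blob = encode({1: "alice", 2: "a long biography", 3: "more unused data"})
for _ in range(5):
    name = decoder.decode(blob).get(1)
```

In this sketch, a field first read after the sampling window comes back as `None` once and is decoded from the next event on. A production decoder would presumably need a safer fallback, such as keeping the raw bytes and decoding them lazily; the abstract does not say how AdaptiveScrooge handles that case.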

Pankaj Gupta

Engineering Manager, Twitter

Friday November 17, 2017 10:40am - 11:00am PST