by Thor Olavsrud

Facebook Open Sources Thrift Protocol … Again

Feb 20, 2014, 3 mins
Enterprise Architecture, Open Source

After more than six years of internal development of its branch of the cross-language framework that powers its internal services, Facebook has released that branch to open source and hopes to work with the Apache Thrift community to incorporate the work.

Facebook today “re-open-sourced” the Thrift binary communication protocol with its own internal branch of Thrift, which is designed to provide a new set of core features and crank up performance.

Facebook Software Engineer Dave Watson explains that the company always wants to choose the best tools and implementations for its backend services, regardless of programming language. By choosing programming languages on a case-by-case basis, it can optimize for performance, ease and speed of development, reuse of existing libraries and so on.

Facebook Open Sources Thrift

“To support this practice, in 2006 we created Thrift, a cross-language framework for handling RPC [remote procedure calls], including serialization/deserialization, protocol transport and server creation,” Watson says. “Since then, usage of Thrift at Facebook has continued to grow. Today, it powers more than 100 services used in production, most of which are written in C++, Java, PHP or Python.”

After a year of internal use, Facebook released Thrift to the open source community, where development of Apache Thrift continues. But, as Watson notes, while Apache Thrift gained wide use outside Facebook, IT organizations using it ran into performance concerns and issues separating the serialization and transport logic.

Inside Facebook, IT was running into similar issues as it gained experience running Thrift infrastructure. Watson says the team realized that Thrift was missing a core set of features, and that a lot more could be done for performance.

“For example, one issue we ran into was that internal service owners were constantly reinventing the same features again and again, such as transport compression, authentication and counters to track the health of their servers. Engineers were also spending a lot of time trying to eke more performance from their services.”

“When Thrift was originally conceived, most services were relatively straightforward in design,” Watson adds. “A Web server would make a Thrift request to some backend service, and the service would respond. But as Facebook grew, so did the complexity of the services. Making a Thrift request was no longer so simple. Not only did we have tiers of services (services calling other services), but we also started seeing unique feature demands for each service, such as the various compression or trace/debug needs.”

“Over time,” Watson says, “it became obvious that Thrift was in need of an upgrade for some of our specific use cases. In particular, we sought to improve performance for asynchronous workloads, and we wanted a better way to support per-request features.”

The end result is fbthrift, which Facebook released today on GitHub. Watson says the largest changes are in the new C++ code generator (available as the new target language cpp2), as well as header transport and protocol changes for several languages, including C++, Python and Java. He adds that a number of services that have moved to the new cpp2-generated code have achieved up to a 50 percent decrease in latency and large decreases in memory footprint.
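
To make the idea of per-request features more concrete, here is a minimal Python sketch of a header-style frame. Each request carries a small metadata header ahead of its payload (here a compression flag and a trace ID), so those features can vary from one request to the next. The frame layout, the function names `encode_frame` and `decode_frame`, and the header keys are all hypothetical and chosen for illustration; this is not fbthrift's actual header transport wire format.

```python
import json
import struct
import zlib
from typing import Tuple


def encode_frame(payload: bytes, trace_id: str = "", compress: bool = False) -> bytes:
    """Build one frame: 4-byte header length, 4-byte body length, header, body.

    Hypothetical layout for illustration only, not fbthrift's wire format.
    """
    body = zlib.compress(payload) if compress else payload
    # Per-request metadata travels with the request itself.
    header = json.dumps({"compressed": compress, "trace_id": trace_id}).encode("utf-8")
    return struct.pack(">II", len(header), len(body)) + header + body


def decode_frame(frame: bytes) -> Tuple[dict, bytes]:
    """Split a frame back into its header dict and the original payload."""
    header_len, body_len = struct.unpack(">II", frame[:8])
    header = json.loads(frame[8:8 + header_len].decode("utf-8"))
    body = frame[8 + header_len:8 + header_len + body_len]
    # The receiver reads the header to decide how to treat this one request.
    payload = zlib.decompress(body) if header["compressed"] else body
    return header, payload


if __name__ == "__main__":
    frame = encode_frame(b"hello " * 100, trace_id="req-1", compress=True)
    header, payload = decode_frame(frame)
    print(header, len(payload))
```

Because the compression choice and trace ID live in the frame rather than in the connection setup, a service in the middle of a tier could in principle forward or change them per call, which is the kind of flexibility Watson describes.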

Watson notes that fbthrift doesn’t reflect all Apache Thrift changes, but the team did track the upstream changes closely, and he adds that Facebook hopes to work with the Apache Thrift maintainers to incorporate the work on fbthrift.

Thor Olavsrud covers IT Security, Big Data, Open Source, Microsoft Tools and Servers for CIO.com. Follow Thor on Twitter @ThorOlavsrud. Follow everything from CIO.com on Twitter @CIOonline, Facebook, Google+ and LinkedIn.