RealTime IT News

A Look at Google's Open Source Protocol Buffers

For most organizations, Extensible Markup Language, or XML, is the lingua franca for data interchange. Apparently XML alone isn't fast enough for Google (NASDAQ: GOOG), so the company developed its own data format, called Protocol Buffers.

This effort has been in development at Google since 2001. It's now available as an open source project that Google hopes others will use and contribute to. Protocol Buffers could ultimately replace XML in some cases as a speedier format for data interchange.

"We do know that we will be using it ourselves in some of our upcoming projects," Google developer Kenton Varda said. "This is not a piece of software that is unimportant to the company."

Google's documentation on Protocol Buffers claims numerous advantages over XML. Among them: Protocol Buffers can be 3 to 10 times smaller and 20 to 100 times faster than XML for serializing structured data.

"You define how you want your data to be structured once, then you can use special generated source code to easily write and read your structured data to and from a variety of data streams and using a variety of languages," Google's documentation states.
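To give a rough sense of where the size savings come from, here is a minimal sketch in Python that hand-encodes one small record in the Protocol Buffers binary wire format and compares it with an XML rendering of the same data. It is not the generated code Google's documentation describes: the Person message, its field numbers, and the helper function names are all hypothetical, chosen only for illustration, and a single toy record says nothing reliable about the 3-to-10-times figure.

```python
# Hypothetical schema, shown only for context (not taken from the article):
#
#   message Person {
#     required int32 id = 1;
#     required string name = 2;
#   }

def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)         # high bit clear: last byte
            return bytes(out)

def encode_int_field(field_number: int, value: int) -> bytes:
    """Field tag with wire type 0 (varint), followed by the value."""
    tag = (field_number << 3) | 0
    return encode_varint(tag) + encode_varint(value)

def encode_string_field(field_number: int, text: str) -> bytes:
    """Field tag with wire type 2 (length-delimited), then length, then bytes."""
    data = text.encode("utf-8")
    tag = (field_number << 3) | 2
    return encode_varint(tag) + encode_varint(len(data)) + data

def encode_person(person_id: int, name: str) -> bytes:
    """Hand-rolled encoder for the hypothetical Person message above."""
    return encode_int_field(1, person_id) + encode_string_field(2, name)

binary = encode_person(150, "Ada")
xml = b"<person><id>150</id><name>Ada</name></person>"

print(binary.hex(), len(binary), "bytes")
print(len(xml), "bytes of XML")
```

The binary form carries only numeric field tags and the values themselves, while the XML form repeats each element name twice as text. In practice the encoding and decoding code would be generated from the schema rather than written by hand.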

Currently, Google is using Protocol Buffers for its internal Remote Procedure Call (RPC) protocols and file formats.

According to Google's documentation, protocol buffers were initially developed at Google to deal with an index server request/response protocol.

Chris DiBona, Google's program manager for open source, noted that Google uses the format to encode almost any sort of structured information that needs to be passed across the network or stored on disk.

As to why, after years of in-house development, Google is now making Protocol Buffers open source, Varda said it's just a question of time. "We have wanted to release protocol buffers for a long time," he said. "The only limitation was finding enough engineering time to get it done."

Google will release Protocol Buffers under the Apache 2.0 open source license, and some of the technology involved may well be patented. That shouldn't be a concern for potential users, however.

"There is some patent activity around Protocol Buffers, but I'd like to point out that we use the Apache license, which grants permission to use any applicable patents," DiBona told InternetNews.com.

The potential for Protocol Buffers could well be large. Google is not currently using Protocol Buffers as a replacement for XML-based Web services -- at least not yet. In response to a question from InternetNews.com about whether Protocol Buffers could be leveraged to create some kind of smaller, faster Web services/SOA alternative, Google developer Varda noted, "That sounds like a possibility, but we have no firm plans at this time."

So far, Google has included Protocol Buffers support for C++, Java, and Python, though support for other languages is welcome.

"We would love for there to be PHP support for Protocol Buffers, and we hope that the open source community will take this up," Varda said. "We would be happy to provide whatever assistance we can."

In fact, Varda hopes that outside participation in the continued development of Protocol Buffers will follow now that the technology is open source.

"We welcome participation from the open source community," Varda commented. "Managing broad participation in development of such a critical piece of Google's infrastructure will be tricky, but we're going to try."