Monday, 18 July 2011

Google+ Is Built Using Tools You Can Use Too Like Closure, Java Servlets, JavaScript, BigTable, Colossus, Quick Turnaround

Joseph Smarr, former CTO of Plaxo, reveals the stack used for building Google+ in "I'm a technical lead on the Google+ team. Ask me anything":

Our stack is pretty standard fare for Google apps these days: we use Java servlets for our server code and JavaScript for the browser-side of the UI, largely built with the (open-source) Closure framework, including Closure's JavaScript compiler and template system. A couple nifty tricks we do: we use the HTML5 History API to maintain pretty-looking URLs even though it's an AJAX app (falling back on hash-fragments for older browsers); and we often render our Closure templates server-side so the page renders before any JavaScript is loaded, then the JavaScript finds the right DOM nodes and hooks up event handlers, etc. to make it responsive (as a result, if you're on a slow connection and you click on stuff really fast, you may notice a lag before it does anything, but luckily most people don't run into this in practice). Our backends are built mostly on top of BigTable and Colossus/GFS, and we use a lot of other common Google technologies such as MapReduce (again, like many other Google apps do).
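The History API trick with a hash-fragment fallback that Joseph mentions might look roughly like the sketch below. This is not Google's code. The `navigate` helper and the `HistoryLike` and `LocationLike` interfaces are hypothetical names invented for illustration, and the two interfaces only mimic the small slice of `window.history` and `window.location` the sketch needs so it can also run outside a browser.

```typescript
// Minimal shapes of the browser objects used here (assumed, for illustration).
interface HistoryLike {
  // Missing in older browsers that lack the HTML5 History API.
  pushState?: (state: unknown, title: string, url: string) => void;
}

interface LocationLike {
  hash: string;
}

// Hypothetical helper: record an in-app navigation in the address bar
// without triggering a full page load. Returns which strategy was used.
function navigate(
  path: string,
  history: HistoryLike,
  location: LocationLike
): "pushState" | "hash" {
  if (typeof history.pushState === "function") {
    // Modern browsers: the URL becomes a clean path such as /stream.
    history.pushState({ path }, "", path);
    return "pushState";
  }
  // Older browsers: keep the path in the hash fragment, e.g. #/stream.
  location.hash = "#" + path;
  return "hash";
}
```

In a browser this would be called as `navigate("/stream", window.history, window.location)`. A real app would also listen for `popstate` and `hashchange` events so the back button works, which the sketch leaves out.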


At first I read Clojure, which would have been a real surprise, but it's Closure, a suite of JavaScript tools consisting of a library, a compiler, and templates. The compiler is a true compiler for JavaScript that makes JavaScript download and run faster. The library is a modular, cross-browser JavaScript library. Templates is a templating system, usable on both the server and the client, that helps you dynamically build reusable HTML and UI elements. It's all open source so you can use it too.

While you don't have Google's stack available to you, you do have some open source options. HBase is a replacement for BigTable. Then there's Hadoop MapReduce. Colossus is Google's next generation file system, a replacement for GFS. Since we don't know much about Colossus, it's hard to say what a suitable replacement would be, but there's the Hadoop distributed file system, HDFS. And if you are looking for some of the cloud-like infrastructure glue, there's OpenStack (which also has a storage system).

Google probably uses a custom Java servlet container, but the choice here doesn't matter that much. Most of the work will be spawned in parallel and performed on other servers implemented in C++, Java, or Python.

Whereas communication from Google is usually non-existent, the Google+ development team is noticeably more responsive, turning out visible improvements quickly and consistently. Joseph tells us why: We put extra emphasis on engineering speed/agility--we try to release code updates on a daily basis while still keeping quality/stability/latency as high as you'd expect from google. This helps us move fast and respond quickly to user feedback. We try to do a full push (almost) every day, and we sometimes sneak in patch releases too if needed. But there are humans in the loop, it's not a "auto-push if all tests pass" situation or anything like that.

For Google+'s most innovative feature, video conferencing with Hangouts, GigaOM has a good article on that stack, which is based on "Announcing Google+ Hangouts", written by Tech Lead Justin Uberti. Unlike Skype, which runs on a cost-effective P2P model, Hangouts is completely hosted by Google. This must cost a staggering amount of money. You are on your own here. Nobody can replace the bandwidth being donated by the Google fairy.

That's Google in a box. Then again, an ex-Googler thinks you can do better using MessagePack, JSON, Hadoop, jQuery, and MongoDB. Whether you can do better for a worldwide base of a billion users is a completely different question.