Twitter's Storm Software -- Hadoop of Real-Time Processing -- Available as Open-Source

Storm, a scalable real-time computation system that Twitter got when it acquired analytics platform BackType in July, is now available on GitHub.

Storm eliminates the need to manually process messages from a queue, update a database and then send messages to another queue. Bonus: it can be used with any programming language. For example, Storm makes it possible to stream Twitter trending topics into web browsers. 
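To make the hand-rolled pattern concrete, here is a minimal sketch of the kind of worker that description refers to. It is not Storm code and uses none of Storm's API. The function name `run_worker`, the in-memory queues, and the dict standing in for a database are all assumptions made for illustration.

```python
import queue

def run_worker(inbox, outbox, db):
    """Hypothetical hand-written worker: read, update a store, forward."""
    while True:
        try:
            topic = inbox.get_nowait()      # 1. process a message from a queue
        except queue.Empty:
            break                           # nothing left to read
        db[topic] = db.get(topic, 0) + 1    # 2. update a "database" (a dict here)
        outbox.put((topic, db[topic]))      # 3. send a message to another queue

# Stand-ins for real infrastructure (assumed, not from the article).
inbox, outbox, db = queue.Queue(), queue.Queue(), {}
for topic in ["storm", "hadoop", "storm"]:
    inbox.put(topic)

run_worker(inbox, outbox, db)
```

In a real deployment each of those three steps would also need retries, partitioning and failure handling written by hand, which is the sort of plumbing the article says Storm removes.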

Twitter engineer Nathan Marz, who has called Storm the Hadoop of real-time processing, debuted the open-source version of Storm at the Strange Loop 2011 developers conference. Nathan was previously the lead engineer at BackType.

"Its use cases are so broad that we consider it to be a fundamental new primitive for data processing," Nathan wrote on the BackType blog in June, before Twitter acquired the startup.

We have asked both Nathan and Twitter for comment on Storm, which BackType had long planned to release as open source. Nathan confirmed on the Twitter Engineering blog on Aug. 4 that the open-source version would come out today.

"The approach I take is to build systems using a hybrid of batch processing (Hadoop) and realtime processing (Storm)," Nathan writes on Hacker News today. "With batch processing, you can run idempotent functions even with duplication, which lets you correct what's happening at the realtime layer. In that sense, Hadoop and Storm are extremely complementary."
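One way to read that comment, sketched below under stated assumptions: the realtime layer updates counts incrementally and can overcount when a message is delivered twice, while a batch job recomputes the same view from the full log with an idempotent function, so its result can replace the realtime numbers. The event shape and the names `realtime_update` and `batch_recompute` are hypothetical. Neither Hadoop nor Storm is used here.

```python
def realtime_update(view, event):
    """Incremental update: a redelivered event gets counted again."""
    view[event["topic"]] = view.get(event["topic"], 0) + 1

def batch_recompute(log):
    """Idempotent recomputation: duplicates collapse by event id,
    so running it on the same (or a duplicated) log gives the same view."""
    topic_by_id = {event["id"]: event["topic"] for event in log}
    counts = {}
    for topic in topic_by_id.values():
        counts[topic] = counts.get(topic, 0) + 1
    return counts

# Event 1 is delivered twice (assumed failure mode for illustration).
log = [
    {"id": 1, "topic": "storm"},
    {"id": 2, "topic": "hadoop"},
    {"id": 1, "topic": "storm"},
]

realtime_view = {}
for event in log:
    realtime_update(realtime_view, event)

batch_view = batch_recompute(log)   # later replaces the realtime view
```

Here the realtime view counts "storm" twice, and the batch pass brings it back to one.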

"This is fantastic!" UX hacker and graphic designer Grant Jordan writes. "My mind is spinning with the industries that you could benefit from this, but didn't have the time/resources/focus to roll this sort of (very difficult to scale) system on their own."


1. "The Secrets of Building Realtime Big Data Systems" (Nathan Marz on Slideshare, April, 2011)


Nathan Marz, Storm Lead Engineer 
Twitter: @nathanmarz