Hadoop ages quickly if it isn’t real-time

Over the past two years, we’ve watched a significant chunk of the world fall in love with Hadoop. It has been remarkable to watch Big Data startups grab the spotlight and every major vendor look to warm themselves by the Hadoop fire. From the beginning, we’ve said that Hadoop was the first to offer a way to harness the power of enormous data sets but warned that it lacks the wherewithal to enable real-time decision making. We recognized that any technology that doesn’t change the here and now has limited value.

The growing chorus

Today GigaOM ran the story “5 reasons why the future of Hadoop is real-time (relatively speaking),” and the title alone speaks volumes about where Big Data’s true value is found. The essence of the article is this:

The work being done by companies like Cloudera and Hortonworks at the distribution level is great and important, as is MapReduce as a processing framework for certain types of batch workloads. But not every company can afford to be concerned about managing Hadoop on a day-to-day basis. And not every analytic job pairs well with MapReduce.

The world is waking up from a Big Data bender to discover that speed matters enormously, and more every day. So much so that SQL on Hadoop is the new conversation, as that ‘old school’ query language becomes the perfect way to pair massive data with in-the-moment queries. Look no further than the rise of HBase as proof that not everything is headed for NoSQL.

Just a warm-up

All of this is preamble to the need to manage Big Data as it crosses the threshold in fast-moving streams. We’re only now seeing the conversation about how to manage the fast-approaching Internet of Things and its projected 50 billion sensors (25X the current 2 billion ‘human sensors’). Managing stream data is going to put an enormous burden on the infrastructure of anyone who wants to stay relevant. If Hadoop isn’t in this game, it dies.

Did you hear that noise? That was the sound of infrastructure and integration emerging from the background to be the most critical parts of serving Big Data.



Categories: Data Analytics / Big Data

Author: Chris Taylor

Reimagining the way work is done through big data, analytics, and event processing. There's no end to what we can change and improve. I wear myself out...


4 Comments on “Hadoop ages quickly if it isn’t real-time”

  1. March 14, 2013 at 9:27 am #

    Um, bit of a straw-man argument here, for four reasons.

    1) Low-latency SQL access (ours, Impala and others) is addressing this, and there are standards bodies addressing this as well.

    2) Not every job needs to be real time; what most need is to happen in what I call “customer time,” and Hadoop-based systems can be plenty fast for that.

    3) It isn’t Hadoop or nothing – in the real world you optimize architectures based on the role they play, so if you are slotting Hadoop (as it stands today) into a low-latency environment, you frankly don’t know what you are doing.

    4) You can use streaming engines and Hadoop together today; we do all the time.

  2. March 16, 2013 at 9:27 am #

    SQLstream and Hadoop are the right tools for the job! Real-time streaming analytics based on true SQL-standard continuous queries.


  1. Is Hadoop the right tool for the job? | picnicerror.net - March 9, 2013

    […] Hadoop ages quickly if it isn’t real-time […]

  2. Is Hadoop to slow because it isn’t real-time? No! - #smarterindustry - October 9, 2013

    […] I’ve read an aguement in blog: […]
