The non-hype definition of Big Data

In all the hype, it is easy to get caught up in the limitations of the standard definition of Big Data, usually given as the ‘3 V’s’:

  • Variety – Data that has many sources and structures/formats
  • Velocity – Data coming in very quickly that is defined by its flow rate and/or accumulation
  • Volume – Terabytes, petabytes and zettabytes of data…data increasing at 40% annually

This description gives the concept some clarity. However, a much more pragmatic definition can be found on Wikipedia. (The irony, of course, is that Wikipedia is itself the ‘Big Data version’ of the Encyclopedia Britannica, which was disrupted by the very thing this definition describes.)

In information technology, big data is a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools.

By defining Big Data by the challenge of using on-hand tools, we acknowledge that the term means something different to everyone. If you’re Nielsen and have been managing massive quantities of marketing intelligence for decades, Big Data isn’t a term you even use. If you’re a retailer trying to compete with the likes of Macy’s and Nordstrom, Big Data is a significant risk to your business. This point matters a great deal: Big Data means something different to everyone, and it should.

Tip of the iceberg

And because it means something different to everyone, the applications needed to work through Big Data problems vary by situation. In fact, the biggest ‘tool’ for solving Big Data challenges is the infrastructure that collects, sorts, and serves up data. The best phrase to clarify this point: most Big Data applications are just the ‘tip of the iceberg’ for solving the problem.

And beyond bringing data to the table, that same infrastructure is key to acting on the insights that organizations gain from analyzing data. Without it, event processing, workflow, customer interaction and everything else that makes business ‘work’ are not possible.

This may be a disappointing piece of news, especially to those whose livelihood depends on the hype. The reality is that Big Data solutions come back to the same technology fundamentals that have always mattered.



Categories: Data Analytics / Big Data

Author: Chris Taylor

Reimagining the way work is done through big data, analytics, and event processing. There's no end to what we can change and improve. I wear myself out...


6 Comments on “The non-hype definition of Big Data”

  1. December 12, 2012 at 5:09 am #

    Reminds me of what I wrote last week in my Amazon Web Services user conference trip report:

    “Amazon Web Services are a wide and deep de facto reference implementation of a cloud computing ecosystem. One of the great things about reference implementations is that they lend themselves to “crisp”, rather than “fuzzy”, definitions. The first sentence of the Getting Started Guide: Analyzing Big Data with AWS defines big data simply and elegantly:

    ‘Big data refers to data sets that are too large to be hosted in traditional relational databases and are inefficient to analyze using nondistributed applications.’

    Of course, everything looks like a nail if you have a hammer (Amazon Elastic MapReduce, yes, Amazon has an EMR!). In that sense this is a self-serving definition. But that does not diminish its crispness relative to a multitude of other definitions of big data that do not rely on specific software platforms and tools.”

    Cheers! (and, soon, Merry XMas/Happy New Year)


  2. Dencie
    December 18, 2012 at 6:25 am #

    Great article and clarity on Big Data. Data coming from various sources, specifically external providers, becomes a trust issue, as the “bigger the data” the higher the likelihood that something will go wrong. My experience dealing with large external data vendors is to assume some data will be erroneous and not structured as laid out in their data dictionary. We learned the hard way by being too trusting, and our models were thrown for a loop, leading to huge performance issues and some embarrassing results because of bad data. The term ‘garbage in, garbage out’ really does mean a lot and should be taken seriously at the initial stages of analyzing data. Perform a sizeable amount of testing to understand where data anomalies can occur so that you are able to deal with these issues in your application.
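
The advice in the comment above, to assume that vendor data will not always match its data dictionary, could be sketched roughly as follows. This is only an illustration under stated assumptions: the field names (`customer_id`, `purchase_amount`, `region`), the `FieldSpec` type, and the `validate_row` and `split_rows` helpers are all hypothetical and do not come from the article or from any particular vendor or library.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class FieldSpec:
    """One entry of a (hypothetical) vendor data dictionary."""
    name: str
    parse: Callable[[str], object]  # raises ValueError/TypeError on bad input
    required: bool = True


# Assumed data dictionary; a real one would come from the vendor's documentation.
DATA_DICTIONARY = [
    FieldSpec("customer_id", int),
    FieldSpec("purchase_amount", float),
    FieldSpec("region", str, required=False),
]


def validate_row(row, spec=DATA_DICTIONARY):
    """Return a list of problems found in one raw row (empty list means clean)."""
    problems = []
    for field in spec:
        raw = row.get(field.name)
        if raw is None or raw == "":
            if field.required:
                problems.append(f"{field.name}: missing")
            continue
        try:
            field.parse(raw)
        except (TypeError, ValueError):
            problems.append(f"{field.name}: cannot parse {raw!r}")
    # Flag columns the data dictionary never mentioned.
    known = {field.name for field in spec}
    for name in sorted(set(row) - known):
        problems.append(f"{name}: not in data dictionary")
    return problems


def split_rows(rows):
    """Separate clean rows from rejected ones, keeping the reasons for rejection."""
    clean, rejected = [], []
    for row in rows:
        problems = validate_row(row)
        if problems:
            rejected.append((row, problems))
        else:
            clean.append(row)
    return clean, rejected
```

Keeping the rejected rows together with their reasons, rather than silently dropping them, is one way to learn where anomalies tend to occur before the data ever reaches a model.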


  1. Hadoop: A race car without wheels? | Successful Workplace - December 12, 2012

    […] There was a clear theme that emerged…Hadoop and other Big Data-specific tools are just the tip of the iceberg. By themselves, they solve […]

  2. Big data means everything and nothing (it depends who you ask) | Successful Workplace - April 29, 2013

    […] earning a living with vast amounts of data for decades. There are exceptional examples of this like Nielsen, the company that started off rating the advertising value of media and morphed into consumer […]

  3. Interop Blog | News from the Leading IT Conference and Expo » Blog Archive » Big data means everything and nothing (it depends who you ask) – #Interop - May 2, 2013

    […] earning a living with vast amounts of data for decades. There are exceptional examples of this like Nielsen, the company that started off rating the advertising value of media and morphed into consumer […]

  4. Process is data and data is process | Successful Workplace - September 19, 2013

    […] embracing what’s coming next and becoming data and process powerhouses. Watch the progress of Nielsen and others and you’ll see exactly where the global economy is […]
