Shopping in the Big Data marketplace

The Big Data marketplace is diverse and growing. A host of companies sell products billed as Big Data solutions, but some of that labeling is opportunism and some is deliberate obfuscation of the concept. Let’s take a look at the landscape of products and categories and try to make sense of what’s out there.

Open source

Open source applications are freely distributed and benefit from ‘donated’ code by an entire collaborative community of developers. While the model sounds like nirvana, there are challenges around code quality and support that have kept open source from being as trusted as software supported by a vendor. Big Data open source applications include the following:

  • Hadoop – A great deal of focus has been given to Hadoop, the Apache Software Foundation’s open source implementation of MapReduce. Hadoop allows scalable, distributed computing on commodity hardware and handles work distribution and failover programmatically. The technology traces its roots to some of the biggest companies solving the biggest of the Big Data problems. Hadoop stores data in HDFS (the Hadoop Distributed File System).
  • Hive – This is the data warehouse application for Hadoop that facilitates easy data summarization, queries and analysis of data stored in HDFS. Hive provides structure for data and has a query language known as HiveQL (Hive Query Language).
  • Cassandra – Cassandra is Apache’s distributed NoSQL database. In much the same way Hadoop spreads computing and storage across a cluster, Cassandra partitions and replicates data across clustered servers, with no single point of failure, for especially fast reads and writes at scale.
  • MongoDB – A significant part of Big Data’s story is unstructured data. MongoDB, a document-oriented database, is one answer to that challenge, built for data that can’t be filed away in relational tables because of its unstructured nature. Integration happens faster and more easily when structure can’t be known and planned for ahead of time, because the definition of the data, typically known as a ‘schema’, is dynamic rather than fixed.
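
To make the MapReduce idea behind Hadoop a little more concrete, here is a minimal sketch in plain Python. It is not Hadoop’s API and it runs on a single machine; the function names (map_phase, shuffle_phase, reduce_phase, word_count) are our own, chosen purely for illustration. It only shows the three conceptual steps: map each input record to key/value pairs, group the pairs by key, and reduce each group to one result.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence over an iterable of lines."""
    mapped = chain.from_iterable(map_phase(line) for line in lines)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    sample = ["big data is big", "data is everywhere"]
    print(word_count(sample))
```

In a real cluster, the map and reduce steps would run in parallel on many machines and the framework would take care of moving data between them and recovering from failed nodes. This sketch leaves all of that out.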

Point solutions

Besides open source applications, there are vendors who sell solutions that are often built on open source. These are not free, but they come with support models and much tighter quality control and versioning. Here are three well-known examples:

  • Platfora: Platfora is based on Apache Hadoop and claims to provide sub-second interactive, exploratory business intelligence and analytics. Its target is business analysts who need to manipulate large data sets but don’t have the technology skills to program for themselves.
  • DataMeer: Similar to Platfora, DataMeer claims to have an application that requires no extract, transform and load (ETL) step, doesn’t require static schemas, and gives analytics and visualization to business users.
  • Cloudera: Cloudera’s claim to fame is the ability to bring unstructured and structured data together into a single view in real time. As business users think of new ways of querying data, Cloudera allows for very rapid slicing and dicing.

Big vendors

Not surprisingly, the big software vendors won’t be outdone by open source or point solutions. The following are the major players in the Big Data space that you already know quite well:

  • IBM – Big Blue is an enormous company that often has multiple overlapping solutions for anything involving data. The challenge with IBM is figuring out what you really need and getting to the right amount of software and consulting. IBM is enormous and they expect enormous deals.
  • Oracle – Like IBM, Oracle is a giant company that started in relational databases and has grown through acquisition to include just about every kind of software under the sun. Oh yeah, they also bought Sun Microsystems. Getting to a solution involves negotiating a myriad of products.
  • SAP – These are the clear winners of the enterprise resource planning (ERP) wars of the past ten years. SAP emerged as the largest and most successful vendor of software applications so large they can’t be described in a single blog post. Their Big Data play is to say that everything you need affects enterprise resources. In for a penny, in for a few million dollars, though, and many aren’t looking for that kind of disruption or investment.
  • TIBCO – A much smaller company than the ones above, TIBCO is known for being the premier data integration company. If you stop to think about it, the biggest challenge of Big Data is getting the information, analyzing it, understanding it, anticipating what’s coming next, and acting on it when it happens. Small but powerful, TIBCO has a single stack of products that solve Big Data problems without complexity or redundancy.

Which solution is a good fit depends entirely on the problem at hand. Looking at the most impactful projects we’ve witnessed, the following are some areas that we feel are leading the way:

Customer experience

The retail and services marketplace has long operated on a “best guess” basis, with only outdated business intelligence (BI) reports to show what happened in the past. Big Data is about having enough information to be able to find key insights that predict what will happen. Predictive analytics are more and more in the hands of business users who can plan for how to delight customers before the moment arrives.


Supply chain

Supply chains are increasingly global and more complex every day. Big Data technology allows organizations to sort through the noise to find the signals that indicate problems in the making and make decisions before the costs go up and contract terms are missed.

Customer loyalty

Gaining customers is a challenge and keeping them is all about loyalty. Big Data provides a way to create profiles and to know your customers’ preferences and current location so that history, inventory and the current situation can all be considered in making real-time offers. This is the secret to turning customers into fans.

Energy

Big Data about energy forecasts, generation and consumption stands to lower the cost and increase the efficiency of our businesses, homes and transportation. With so much attention on making our lives more ‘green’, Big Data offers the chance to solve energy problems in all-new ways.

Beyond the hype, Big Data offers an enormous opportunity to know our world in great detail and to make decisions based on much more than a hunch. Let us know your Big Data stories and we’ll be happy to share them.



Categories: Data Analytics / Big Data

Author: Chris Taylor

Reimagining the way work is done through big data, analytics, and event processing. There's no end to what we can change and improve. I wear myself out...
