The canary in the goldmine

Informatica’s (INFA) stock took a serious nosedive, falling nearly 28% last Friday after the company warned that its revenue would come in well below expectations. This doesn’t come as a surprise. The market for traditional databases and their associated applications should be expected to shrink in the age of the Internet, Big Data, Cloud and real-time processing. It would be more surprising if there weren’t big shifts.

If you don’t know Informatica, it is a leading maker of ETL (Extract, Transform, Load) tools and for years made a tidy business of moving data between applications in some of the largest enterprises in the world.

Why now?

So what makes this event the ‘canary in the goldmine’? Isn’t it supposed to be a ‘coal mine’? Yes and no. Information is an increasingly valuable asset. It is your enterprise goldmine. So what’s up with the canary? I’ll tell you.

SQL (Structured Query Language) has for forty years been the foundational language of the database. The language was a hot skill when the only way to store data was in relational databases, the type that made Oracle an enormous company.

While relational databases dominated and are still valuable, they aren’t the best fit for solving every problem…especially more recent challenges. Enter the NoSQL (I’ll choose it to mean ‘Not only SQL’) era.

To get the full context of that statement, let’s take a look at what’s changed:

  • Unstructured information: Unstructured information can be of different types and lengths (numbers, text, dates, etc.) and can be found in places that can’t be accurately predicted when systems are designed. The storage methods of a relational database require a schema (a map, essentially) to sort out the elements being stored. Relational databases and structure don’t always match up well in an unstructured or fast-changing world. 
  • Cloud: NoSQL databases were designed to work across a ‘Cloud’ of servers rather than on a single, large server that ‘answers a call’ and uses replication to ensure reliability. When we moved to multiple servers to solve size and speed issues in the past, we still needed indexes to sort out where to find things. The move toward big, high velocity data is well suited to a system where the data can be anywhere and in any format.
  • Speed: A relational database is ‘stored’ somewhere and information is ‘called’ when needed (using SQL). With in-memory computing, we keep information ‘in the room’ rather than ‘in the data closet’. It is obviously faster to keep all the info you need readily accessible instead of going to find it. SQL was built with flexibility to store, find and update information in many ways (through structured queries). That flexibility is an unnecessary distraction in the world of high-speed, high-volume computing.
  • Web: Today’s Internet access isn’t about high transaction rates, like processing accounting records for an enterprise. It is about many people looking at the same information but few being able to modify it. That kind of ‘concurrency’ is well-suited to NoSQL databases. A relational database, built to lock records during transactions (with the associated overhead), provide integrity and ‘roll back’ if a transaction fails, is excessive firepower for that job. Today, I want just one person to update and an unlimited number to be able to view information.
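
As a rough illustration of the schema and ‘roll back’ points above, here is a minimal Python sketch. It uses SQLite (from Python’s standard library) to stand in for a relational database and a plain dictionary to stand in for a document-style NoSQL store. The table, field names and values are invented for the example, and a real NoSQL database does far more than a dictionary.

```python
import sqlite3


def relational_demo():
    """Relational side: a fixed schema plus an all-or-nothing transaction."""
    conn = sqlite3.connect(":memory:")
    # The schema must be declared before any data can be stored.
    conn.execute(
        "CREATE TABLE accounts ("
        "id INTEGER PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.execute("INSERT INTO accounts VALUES (1, 100)")
    conn.execute("INSERT INTO accounts VALUES (2, 50)")
    conn.commit()

    try:
        # 'with conn' commits on success and rolls back on an exception.
        with conn:
            conn.execute("UPDATE accounts SET balance = balance + 500 WHERE id = 2")
            # This would make the balance negative, so the CHECK constraint fails.
            conn.execute("UPDATE accounts SET balance = balance - 500 WHERE id = 1")
    except sqlite3.IntegrityError:
        pass  # the first UPDATE has been rolled back along with the second

    return conn.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall()


def document_demo():
    """Schema-less side: each record carries whatever fields it happens to have."""
    store = {}  # hypothetical stand-in for a document store
    store["msg-1"] = {"type": "email", "subject": "Hello", "body": "Free-form text"}
    store["call-7"] = {"type": "call", "duration_s": 42}
    # No schema was declared, so the set of fields is only known after the fact.
    return sorted({field for doc in store.values() for field in doc})


if __name__ == "__main__":
    print(relational_demo())
    print(document_demo())
```

The relational half pays for its guarantees with up-front structure and transaction bookkeeping. The dictionary half accepts anything, and leaves it to the reader of the data to work out what is there.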

These changes don’t mean SQL is dead. Their significance is more that SQL and relational databases are going to share time with other ways of storing, finding and changing information. They do mean, though, that moving data from one place to another, Informatica-style, isn’t as useful as it once was. Data needs to be ‘live’ and ready.
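
To make the difference between batch movement and ‘live’ data a little more concrete, here is a small Python sketch. It is not how Informatica or any other product works; the record fields and function names are made up for illustration. The same transformation is applied either to a whole batch at once or to each record as it arrives.

```python
from typing import Iterable, Iterator


def transform(record: dict) -> dict:
    """Clean up one raw record (hypothetical fields: 'name' and 'amount')."""
    return {
        "customer": record["name"].strip().title(),
        "amount_cents": round(float(record["amount"]) * 100),
    }


def batch_etl(source: list[dict]) -> list[dict]:
    """Batch style: extract everything, transform everything, then load."""
    extracted = list(source)                         # extract the full set
    transformed = [transform(r) for r in extracted]  # transform in one pass
    return transformed                               # 'load' is returning the result


def streaming_etl(source: Iterable[dict]) -> Iterator[dict]:
    """On-the-fly style: each record is usable as soon as it arrives."""
    for record in source:
        yield transform(record)


if __name__ == "__main__":
    rows = [
        {"name": " ada lovelace ", "amount": "19.99"},
        {"name": "alan turing", "amount": "5"},
    ]
    print(batch_etl(rows))
    for row in streaming_etl(iter(rows)):
        print(row)
```

Both paths produce the same records. What changes is when they become available: after the whole batch finishes, or one at a time as events flow through.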

The bottom line

What’s really changed is we’ve moved past the era of information ‘at rest’, waiting for someone to need it, and are in the age of ‘data in motion’. The opportunities to put information to work solving problems through in-the-moment analytics, visualization and actionable intelligence are fast coming into focus.

It creates a whole new world of opportunities for solving health, energy, food supply, commerce and other challenges.

There will be more disruption in the technology landscape in the coming months as this trend continues and investors move toward the software that enables this new world. 

Categories: Data Analytics / Big Data, Featured, Information Technology, Real-time

Author: Chris Taylor

Reimagining the way work is done through big data, analytics, and event processing. There's no end to what we can change and improve. I wear myself out...

3 Comments on “The canary in the goldmine”

  1. Mark Eastwood
    July 9, 2012 at 6:40 am #

    Chris,

    I find a couple of your points calling me to comment. You say “The move toward big, high velocity data is well suited to a system where the data can be anywhere and in any format.” I’ll be the first to admit that I’m not a DBA, I’ve focused my career on other aspects of decision automation and decision support, but I also have a background in real-time software such as the operating systems and applications on telephone switches. In my mind hard real-time systems fail if any component of the system is unable to process within an allotted time slot. High-speed telephony is synchronous even between switches.

    Perhaps it’s the terminology here, but I take “high velocity data” to be potentially large amounts of data that are generated fast and potentially from multiple sources. In my past, the only way to deal with “high velocity data” like the call-detail records of a switch that might have millions of call attempts in a short time was to impose some structure on the data. If I think about the classic “unstructured data” of, say, an email, the overhead of parsing the message to make some sense of it before taking action is significant. That seems incompatible with efficient processing of “high velocity data” unless you add significant parallelism, which would be cost-prohibitive.

    Later in your post you appear to agree with me when you say “SQL was built with flexibility to store, find and update information in many ways (through structured queries). That flexibility is a[n] unnecessary distraction in the world of high-speed, high-volume computing.” I would argue that being overly flexible with the source of data – unstructured data – is also an unnecessary distraction if you need very high performance.

    As you came back around to the example you had in mind at the beginning, I agree with you that the internet isn’t about hard, real-time computing or even high-speed computing; it’s about very large numbers of people constantly accessing read-only data with relatively modest performance expectations. I do agree that SQL shares the internet with NoSQL applications, each having its place in the architecture of the internet, but I think I disagree with the statement that suggests ETL isn’t as useful. Each technology has its place, and ETL isn’t going away any more than COBOL. Rather, use the right technology for the job – live and let live. We’re definitely moving more and more towards in-the-moment analytics and decision automation, and that’s a good thing.

    Mark

  2. July 9, 2012 at 8:35 am #

    Thanks for the typo correction…fixed. I think ETL remains useful, but it isn’t the critical capability it was when data was being moved in batches between mostly siloed systems. Part of the challenge to Informatica is that other systems perform ETL quite well, including workflow, eventing and rule engines, and they do it on the fly. We’re less of a batch world than we used to be.

    Thanks for your comments!
