We used to know the purpose of information before we stored it. In fact, the mere act of defining the storage and retrieval patterns was what made it ‘data’. Going one step further, anything we didn’t store was by definition not usable.
Those days are coming to an end. Unstructured information is a fact of life, the Cloud is a place to put nearly everything, the high volume and velocity of data keep us from planning too far ahead, and the ‘Internet of things’ means we can see plenty of information that isn’t ‘ours’ to create, modify or delete.
We’ve entered the era where we can easily grab large amounts of unstructured information before we know what to do with it. As Alistair Croll puts it in “Big data is our generation’s civil rights issue, and we don’t know it”: “…collect first and ask questions later.” He makes great points about the thin line between personalization and discrimination.
As Croll points out, simple things like our music tastes can be used to infer our gender, religion and race. The inferences can go even further, as in the now-famous story by Forbes’ Kashmir Hill about Target’s study of what women buy when pregnant, “How Target Figured Out A Teen Girl Was Pregnant Before Her Father Did”. We know humans are fairly predictable, and as Hill points out, enough data makes our predictability scarily accurate.
When we’re highly predictable, we can be segmented according to how predictable we are. We can be targeted for our likelihood to behave a certain way, and this comes with some risk, depending on who is calling the shots and what is on the agenda.
De-creeping Big Data
So how do we limit this problem? How do we de-creep the use of Big Data? The simplest and most effective way isn’t anything new. It is to build in safeguards for who can see and analyze data, and to make the process for its use transparent and managed. This means log files for access and analytics (easily done), rules that watch those logs, and business process management for the way work is done.
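To make the “log files plus rules that watch them” idea a little more concrete, here is a minimal sketch in Python. Everything in it is hypothetical and for illustration only: the names AccessEvent, AccessLog and flag_violations, the idea of an approved-purpose list per dataset, and the per-user read limit are my assumptions, not a description of any particular product or standard.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AccessEvent:
    """One log entry: who touched which dataset, and for what stated purpose."""
    user: str
    dataset: str
    purpose: str
    timestamp: datetime


class AccessLog:
    """Hypothetical in-memory access log; a real one would be append-only storage."""

    def __init__(self):
        self._events = []

    def record(self, user, dataset, purpose):
        # Every read or analysis run is logged before the data is handed over.
        event = AccessEvent(user, dataset, purpose, datetime.now(timezone.utc))
        self._events.append(event)
        return event

    def events(self):
        # Return a copy so callers cannot alter the log itself.
        return list(self._events)


def flag_violations(events, approved_purposes, max_reads_per_user):
    """Rule that watches the log (illustrative assumption, not a standard).

    Flags any access whose stated purpose is not approved for the dataset,
    and any user whose read count on a dataset exceeds the limit.
    """
    alerts = []
    reads = Counter()
    for e in events:
        if e.purpose not in approved_purposes.get(e.dataset, set()):
            alerts.append(f"{e.user}: unapproved purpose '{e.purpose}' on {e.dataset}")
        reads[(e.user, e.dataset)] += 1
    for (user, dataset), count in reads.items():
        if count > max_reads_per_user:
            alerts.append(
                f"{user}: {count} reads of {dataset} exceeds limit {max_reads_per_user}"
            )
    return alerts
```

In practice the alerts would feed whatever managed process the business already uses for review and sign-off. The point of the sketch is only that the log and the rule are separate pieces: the log records everything, and the rules can change as policy changes.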
I’ll hazard a guess that if industry doesn’t come to this conclusion on its own, high-profile misuse of Big Data will bring on regulation and compliance that will take the fun out of something powerful and exciting.