"Data, Data, everywhere, Nor any information to think"\n-Paraphrasing Samuel Taylor Coleridge's famous lines from the Rime of the Ancient Mariner.\nOften at time it does feel like we are in a "paradox of plenty" kind of situation, somewhat akin to a resource curse where historic corporations with an abundance of data are finding themselves losing the race of market competitiveness to newer players who have much less data. \u00a0\nWhy?\nMy initial thoughts were that till recently the focus of most corporations had been on mining their "historical" data.\nHowever, with the world of today generating a steady and ever-growing stream of "real-time" or "near real-time" data, corporations need to wake up to the new reality that much of their historical data is not as relevant or valuable as they think it is.\nIn the absence of real-time data, historical data is often used as a proxy to make some predictions. But with real-time data being available now, that proxy is no longer needed or is no longer as relevant.\nThis has a big benefit \u2013 corporations that feel that they had fallen behind in the race to mine historical data do not necessarily need to play catch-up. They can make up for the lost opportunity by creating a framework to leverage real-time data streams.\nEssentially, corporations can leapfrog and catch up with or even move beyond other players without getting caught up in what I'll call the legacy data trap \u2013 ditch it, since most of it may not be as relevant as you think. Food for thought?\nWhat does the Data Doc think?\nI bounced this idea off Tom Redman,\u00a0"the Data Doc." He was skeptical. While, he agrees that companies need to wake up, he had two reasons for his skepticism. \u00a0\nFirst, real-time and historical data support different sorts of analyses and opportunities. He did not see one as a surrogate for the other.\nSecond, the biggest "gap" is the ability to analyze data and sort out what to do with those analyses. 
Real-time data does not address that gap.
Tom made some great points.
My response
Until now, most of the energy and resources of corporations were devoted to "historical" data, since the capabilities to harness real-time or near real-time data did not exist. Now there has been a sudden explosion in both the volume of real-time data and the tools to manage it.
As a result, attention and resources will shift from historical to real-time data, since both are fixed and limited. Also, in many areas, an effective handle on real-time data may be all that is needed.
For example, we drive using only the real-time data presented on the dashboard (speed, rpm, engine temperature), with no need for historical data to meet the immediate goal of getting from point A to point B.
What do you think?
This could be an interesting survey question to ask CIOs and CDOs:
Of your total data management spend, how much will you allocate to mining historical data vs. managing real-time data, and why?
The answers may offer some interesting insights into how this entire area is evolving.
What implications does all this have on data strategy?

Exact vs. Roughly Right: For historical data, the emphasis on getting all data in the right formats, with the right definitions, and in common data stores needs to go. Such an approach has created a mental and execution block: the belief that no meaningful insights are possible until considerable time and resources are spent on getting it all "right."


Consolidation vs. Federation: Approaches where data is pulled from various sources into a single repository need to be replaced by approaches where data stays in its parent repositories but gets "pulled" as needed. A federated data application framework? IBM Watson Discovery Service appears to do something like that, but seemingly only for unstructured data.
Fraxses seems to do it for both structured and unstructured data. With the kind of capabilities available now, physically moving data into a distinct data store (lake) may not be required. The lake may be virtual. This may be a quicker approach too.


Internal vs. External: In most corporations, data strategies have been inward looking. That is, they have focused on internal data. In today's world, any meaningful data strategy has to focus on internal as well as external data. How to combine internally available data with publicly available or acquired external data to deliver business-focused insights is a question the strategy needs to answer.


Defense vs. Offense: Data strategy should support both "exact" reporting (e.g., for finance and accounting purposes) and "directional" reporting (e.g., for strategy and business development purposes). Until now the focus has been on exact reporting, which has meant that not all available data has been used effectively. There is always a significant amount of data that is not "exact" but can still provide meaningful insights when weighted appropriately (e.g., Watson, when playing Jeopardy, did not come up with just one answer but with several, each appropriately weighted). A recent Harvard Business Review article, "What's your data strategy?", described it as defense vs. offense: companies make considered trade-offs between defensive and offensive uses of data and between control and flexibility in its use. Leandro DalleMule and Thomas H. Davenport summed it up well in that article:

There is no avoiding the implications: Companies that have not yet built a data strategy and a strong data-management function need to catch up very fast or start planning for their exit.