by Rohan Light

Choosing a smarter metaphor for your data governance strategy

Jan 16, 2018 · 8 mins
Data Management, IT Governance, IT Leadership

Organizations want to be “data-driven,” the gist of which is that they want people to use data to make decisions. Leaders know that too many people make stuff up. And every leader has done it themselves, so it’s not as if it’s a secret or anything. So, we want our data to drive our organization. But what does that even mean?

Is being data-driven like being chauffeur-driven, where we sit back and get to our destination on what we trust is the best route? Or is it more like driving cattle, where enough of us form a herd and move along a path not of our choosing?

Where do we as people fit into these scenarios? In neither are we making decisions, which, in the world of data, is our primary responsibility.

Being “data-driven” is lazy business speak for evidence-based decision-making, which is about having a sound, transparent and reproducible decision-making process and governance strategy. But as often happens, we lose a lot of rigor when we carry scientific methods into the business world.

Data isn’t like other organizational assets

Generally speaking, we like the idea of basing our decisions on evidence. Well, some of us at any rate. But when it comes to the management and use of data, things aren’t so straightforward. Anyone can throw a bunch of numbers onto a slide deck and get a group of people nodding. The problem isn’t getting enough data. It’s getting data that fits the decision at hand. The emphasis here is on how we make the data fit, because few of us are well-suited to analyzing and interpreting data. And if “the root problem is that we know very little about how people analyze and process information,” how are boards to set effective data strategy?

Data is a unique beast and doesn’t behave in the same way as other organizational assets. How we choose to view it will be a primary determinant of how well we can realize its value, balance its risk and observe its regulatory constraints. This means one of the most important decisions a board will make about its enterprise data strategy is what metaphors to use.

For example, consider this quote from an article by Alexis Madrigal for Fusion TV:

“We’ve deceived ourselves into thinking data is a camera, but it’s really an engine. Capturing data about something changes the way that something works. Even the mere collection of stats is not a neutral act, but a way of reshaping the thing itself.”

If we were a business making cameras and one of our people said that we produce engines, we might laugh them off. Unless, of course, they were correct, in which case we have a serious strategy problem to deal with. How would we know if we’re making cameras or engines? And what if there’s a market for camera engines or engine cameras? Considering that autonomous vehicles are supercomputers on wheels, the difference between cameras and engines is getting very thin.

Metaphors introduce framing effects

This isn’t an empty intellectual exercise. Much of the “disruption” ethic in business is about changing the way we look at our business and industry. For example, what is Uber? In the U.S. it’s a “transport network company,” while in the EU it’s just a taxi company.

The metaphors we use introduce powerful framing effects. For instance, is an organization a tribe, an army or a family? Depending on the answer, we get a different set of strategic opportunities and obstacles. When it comes to governing our data analytics, we will do well to use a few different metaphors. J. Edward Russo and Paul J. H. Schoemaker wrote about the impact of metaphors in Winning Decisions:

“Experienced decision-makers choose metaphors carefully to highlight important facets of the situation at hand, helping them think about the current situation in terms of another one that they understand better. Amateur decision-makers, on the other hand, automatically use one or two metaphors to frame almost everything. In doing so, they limit the options they can see, sometimes excluding the best ones from consideration.”

We need to get smarter about how we conceive of our journey to being data-driven. Is it by chauffeur, as a herd, or something more befitting our vision? Choosing more effective strategic metaphors is an important decision for boards to consider.

Water is a good metaphor for data

A useful place to start is to substitute the word ‘water’ for ‘data’. Data is abstract and ephemeral, while water is concrete and a persistent presence in our lives. We might not care about data, but we definitely care about water. Swapping data for water helps us question some of the terms that have crept into our vocabulary. What is “big water”? Do we want our organizations to be “water-driven”? How do we really feel about “monetizing our water assets”?

Do we want our water to flow freely and irrigate our organizational networks, or to sit stagnant in ponds? If you’ve ever drunk stagnant water, you know what the answer is. The data-water analogy also helps us get our heads around the massive scale of our data world. While the global water budget is fixed until we can figure out how to make water at scale, we can use water models to visualize exchanges of data within our organizations and networks.

The data-water analogy helps boards remember that data is both “something” and “about something.” Boards need to govern both qualities simultaneously. With this in mind, we can get more insight on what a common data metaphor really means for our organizations.

A “data lake” is a huge amount of raw data in its native format that floats about doing not much until it’s needed. As my fellow IDG contributor Paul Barth wrote:

“…90% of Hadoop data lakes never make it to production scale because they were designed as analytical sandboxes for a small number of data scientists. Lakes that do make it into production often take an elite team of programmers years to build and a small number of specialists remain a bottleneck to agility.”

Every metaphor has its limits

Paul points out that data lakes are heterogeneous. The data lake term tells us that data is ‘something’, but doesn’t cover the ‘about something’ part well. Each molecule of water in a terrestrial lake is fairly similar to the other molecules. But the information encoded in a data lake might consist of a set like this: “copper,” “ill-fated,” “neat,” “branch,” “tenuous,” “decisive,” “explain,” “truck,” “unite,” “educated,” “hum” and “notice.” For most organizations, their data lakes are actually more like the Madagascar mega-dump. It’s all data, but everything is thrown together.

Data as a mega-dump isn’t a metaphor most organizations aspire to. But current management doctrine results in throwing more data onto the enterprise data mega-dump. Data is valuable, right? We want to be data-driven, don’t we? We want this big data thing working for us, right? Then surely what we want to do is get more data. So, we hook up so many data feeds that we just make stuff up and call it evidence-based.

More data complexity is just around the corner

A final metaphor that will help boards get their collective heads around the complexity of data is light. Wave-particle duality isn’t the sort of thing many people get excited about, but if you’re serious about understanding how to set rules for the use and management of data, you’re going to have to read a few books on physics. And this isn’t about sounding clever. Quantum computing and a quantum internet will be upon us before we’re ready for them. If we’re having trouble getting our heads around the blockchain, then we’re going to explode when it comes to thinking about qubits.

Wave-particle duality captures the dual identity of data well. Thinking about wave-particle duality helps us get used to the concept of data as something and data being about something. With the move towards sending information over electromagnetic waves in the visible light spectrum, we will get more used to associating data with light. The main idea here is that the concepts of data as something (i.e., light as a wave) and data about something (i.e., light as a particle) are each insufficient without the other. To help boards understand how to set the rules for the management and use of data, we need to think of these concepts in tandem.

If your brain hurts at this point, remember that all of us are to some extent cognitively lazy. We handwave complexity away, encourage convergence in our group decision-making and fill our business reports with evidence that confirms our view. But good governance of data demands that we make an effort to embrace complexity, uncertainty and the dynamic. Boards have to make the effort to understand the value, risk and constraint qualities of data and make reasoned choices about how to manage it all.

Boards need to take care with the data metaphors they are presented with. Thinking in terms of data as water, a mega-dump or light will help expand our frame and in so doing give us a chance to improve our management doctrine and practices.