by Mark MacCarthy

Big data is not a barrier to entry

Opinion
Jan 29, 2018

Just as big data is not the “new oil” or “information infrastructure,” neither is it a barrier to entry. Companies have vast quantities of new and free data available to them.

As I argued in “Data is not the new oil or the infrastructure of the digital economy,” data is not a finite, exhaustible resource like oil, and it is not an information infrastructure. It is a valuable input that allows companies to improve their products and services to make our lives richer and more productive.

But in some cases, data could act as a barrier to entry. As The Economist puts it, “Vast pools of data can…act as protective moats.” Some competition policy authorities have suggested that “…the collection of data may result in entry barriers when new entrants are unable either to collect the data or to buy access to the same kind of data, in terms of volume and/or variety, as established companies.”

In theory, one company’s control of data might in certain circumstances leave so limited a supply for others that competition has a hard time thriving.

But in the real world, data does not play the role of a barrier to entry in tech. As the Centre on Regulation in Europe says, “The first principle is that data are one input, which is important but not unique, to develop successful applications and algorithms.  Other inputs are also important such as skilled and creative labour force (in particular computer scientists and engineers), capital and distribution channels.” 

Data is important in a special way – processing data provides insights that help improve the quality of a service.  And the good news is that business-critical data is easily available to tech companies and to the many other industries that rely on data. The Internet itself is the source of large amounts of commercially important information freely available for anyone who wants to use it.  Service providers make enormous amounts of additional information commercially available for business uses. The flood of data is so large that companies such as Oracle have begun to set up data marketplaces to help companies find and buy what they need.

Moreover, data is not destiny. For one thing, more is not necessarily better. At some point, having additional data adds no additional value. For many applications, the value of data also declines quickly; it is transient. Applications live off new data, not stockpiles of historical data. For instance, 15 percent of the queries submitted to Google’s search engine each day have never been seen before.

In addition, it is really the skills and creativity of the company’s employees that will allow data to be turned into successful applications. The start-up dating service Tinder overcame the data lead of established providers Match.com, eHarmony, and OkCupid with innovative features such as the “double opt-in.” As economist Hal Varian puts it, it is the recipe, not the data, that matters. And the recipe – the algorithm, the idea – comes from high-quality data scientists and creative entrepreneurs.

Finally, it is not the amount of data that is crucial.  Data matters for competition analysis only if it is scarce and cannot be replicated.  The philosopher John Locke said it would be legitimate to acquire property in the state of nature as long as “there was still enough, and as good left” for others. In the data economy, the question isn’t whether data is valuable but whether there’s enough good data available for competitors to use.

When competition authorities have faced this context-dependent factual question in specific merger cases in recent years, they have determined time after time that competitors would have post-merger access to enough data.

In its 2008 review of Google’s acquisition of DoubleClick, the European Commission found that “the combination of (Google’s) data about searches with (Doubleclick’s) data about users’ web surfing behaviour is already available to a number of Google’s competitors today.”

In its 2014 consideration of the merger of Facebook and WhatsApp, the European Commission determined that “…regardless of whether the merged entity will start using WhatsApp user data to improve targeted advertising on Facebook’s social network, there will continue to be a large amount of Internet user data that are valuable for advertising purposes and that are not within Facebook’s exclusive control.”

In its 2016 consideration of the Microsoft merger with LinkedIn, the European Commission ruled that “the combination of their respective datasets does not appear to result in raising the barriers to entry/expansion for other players in this space, as there will continue to be a large amount of internet user data that are valuable for advertising purposes and that are not within Microsoft’s exclusive control.”

Finally, in its 2017 study of the online video streaming marketplace, the Netherlands Authority for Consumers and Markets found that “large quantities of data are not an insurmountable barrier for being able to enter the market.”

It is legitimate for competition policy authorities to examine possible harmful effects on the interests of consumers from the collection or combination of data sets. But the abundant supply of data in the Internet ecosystem and the ease with which valuable data can be replicated by competing firms suggest that the examination will routinely reveal a lack of harm.  In the real world, as opposed to the world of speculation and theory, data is not a barrier to entry.