A gap exists between those that create data and those that can benefit from that data. A company you may never have heard of, Knoema, is angling to centralize the many disconnected sources of data between the public and private sector.

There is a lot of talk about big data and how important it is to big business. Far less attention has been paid to how inaccessible data is, or how disconnected its sources are. There are many publicly accessible repositories of legitimate, raw data, but that data is often useless until it is normalized or visualized to the point that an average person can gain applicable insights without knowledge of SQL or R.

We are living in an era where data is being produced at an exponential rate and no one can stop it. With the implementation of the internet of things (IoT), advancements in sensor technology and the mass adoption of both, new categories of data are being created. Whether the data covers air quality, water quality, supply chains, food production, energy, transportation or finance, its sources or creators typically follow an outdated information-system model that leaves the data locked in a silo and generally inaccessible to other parties.

Stitching sources together

The inefficiencies inherent in the life cycle of data are the root cause of this data accessibility dilemma. A gap exists between those that create the data and those that can benefit from that data. Knoema, the business-to-business play that is working with the African Development Bank, IHS and the International Monetary Fund, is angling to centralize the many disconnected sources of data across the public and private sectors. Because data is ubiquitous, there are several hundred publicly available datasets provided by various organizations: governments, academic institutions, multinational corporations (MNCs) and NGOs. Some major names that provide free and public access to their datasets include Facebook, Google, UNICEF, Amazon, the CIA, the WHO, the US Census Bureau, and Canada.

It is a milestone that some organizations have opened their data to outsiders, but availability doesn’t translate to accessibility. The end-user looking for data faces high acquisition costs, namely finding the data relevant to a project, as well as the cost of enablement, or the ability to functionally use it.

Knoema CEO Vladimir Bougay notes that “Today’s businesses depend on data. KPMG estimated that 92 percent of C-level executives rely on data and analytics to gain greater insights into their markets and customers. And it’s completely clear why they do that: customer insights can reduce costs, generate sales growth and improve overall productivity of the company.” He added: “Getting that data isn’t simple, and hours of time wasted every week by information workers on searching, scraping, normalizing and visualizing data they need for preparing reports, doing analysis and decision making. A typical company may have dozens of internal databases and external data subscriptions available through a diverse crowd of the applications and websites. As a result, just in the US businesses waste billions every year.” He’s right. One survey from IDG estimated that enterprises invested an average of $13.8 million each in data analytics.

Many of the largest organizations have naturally built up an internal knowledge base over time from all of the data they have collected. More likely than not, that internal data is not sufficient to make a complete, informed decision. These companies often end up employing a team of information workers to supplement their internal datasets by researching, collecting and integrating outside data from sources like UNICEF or the WHO. Sometimes the data simply does not exist, and productivity suffers as hours are lost to menial tasks that could be automated via integrations.

Data can be ugly

Just because you’ve found what seems to be a relevant dataset doesn’t mean the insights or knowledge you seek will be evident. Sometimes datasets can look like something you would see on a computer screen in The Matrix, so unless you are like Tank, who can understand those rows and columns of numbers, the data often needs to be spruced up to make it usable. Half of the battle is finding the data you need; the other half is making sense of the data to uncover the insights and knowledge buried deep within the dataset. Knoema, through its collaborations, is attempting to address both major issues by not only pulling together the various data silos, but also by offering the tools necessary to perform manipulations and visualizations that allow for coding-free insight discovery.

The issues revolving around data transcend all borders and are felt by most organizations in every industry. The Knoema platform attempts to address many of the pain points surrounding finding and using information to make data-driven decisions in business and in life. Have we solved one of the core problems around Big Data? Not yet. But Knoema might just be onto something.

