With the expansion of the digital gold rush, data is moving into the spotlight and becoming a valuable source of information. Estimates are that the digital universe will continue to double at least every two years, reaching 44 zettabytes by 2020, a 50-fold growth compared to 2010. The sheer size of the data lake is staggering, but the million-dollar question remains: how to make sense of the data tsunami and capitalize on it.

Storage costs keep plummeting

The phenomenon referred to as Moore’s law has been observed for decades, and with the emergence of new technology (SSD, software-defined storage, object storage, etc.) as well as consolidation within the storage industry, prices keep heading south, with double-digit declines year-on-year.

In the digital age, the real cost of data storage no longer lies in purchasing hardware but in the effort and knowledge required to diligently manage digital assets. Across many industries and geographies, this is made even more challenging by increasingly restrictive requirements for data life-cycle management, country-specific privacy laws, bolstered compliance regulations for data-retention periods, greater utilization of encryption technology, and so on.

Despite increasing spend on big data, technology companies are still groping in the dark

According to IDC’s research, the market for big data analytics will soar from $130 billion in 2016 to more than $203 billion in 2020, a compound annual growth rate (CAGR) of 11.7 percent.

Although enterprises spend a fortune on collecting, storing and managing data, only a few excel at converting raw data into actionable information.
This is particularly true for unstructured content, which still accounts for approximately 80 to 90 percent of all corporate data.

A recent report by Veritas concluded that 52 percent of all data currently stored and processed by enterprises around the globe is considered “dark,” meaning its value is unknown. As much as 33 percent of the data is considered redundant, obsolete or trivial, and is known to be useless. On average, only 15 percent of all stored data is considered business critical. Unless enterprises take corrective action and become more deliberate, estimates are that this “data hoarding” culture will cumulatively lead to $3.3 trillion in avoidable costs by 2020, spent managing a digital cemetery.

Cloudification: friend or foe?

In light of the fierce price battles, especially in the public cloud domain, and the ability to store shiploads of data at a low cost per unit, many enterprises feel strongly tempted to take advantage of this and swiftly move corporate data into the cloud. While there are plenty of legitimate reasons for doing so, and use cases in abundance, the decision to touch petabytes of data should be considered thoroughly.

First of all, enterprises need to properly understand their data’s composition in terms of content type, age, relevance and so on, and classify it accordingly. For instance, offloading dark data into the cloud is nothing but a waste of time and money. Moreover, data develops gravity once transitioned, so the move merely relocates the problem. As the digital universe expands exponentially, moving a growing data estate back will be at best a challenging undertaking, if not a real nightmare.
Thus, enterprises should carefully assess, visualize and classify their data before embarking on a cloud journey.

Data governance

Dealing with structured data may not be a big deal, but governing unstructured data is a much greater challenge than it might initially appear. With the lion’s share of all data being unstructured, assessing its value and identifying duplicative, confidential and sensitive information are key components of implementing data-centric business models.

Whether ownership sits with the Chief Information Officer (CIO) or a dedicated Chief Data Officer (CDO), the data and analytics leader should work closely with their business-unit peers to establish a data governance framework that forms the foundation for all use cases. This typically covers how data is classified, captured, refined, analyzed, managed, monetized, retained and erased, taking into account compliance and other regulatory requirements that may apply.

Enterprises that have developed proprietary algorithms enabling them to derive business value are well-advised to consider filing a patent to safeguard their intellectual property rights.

Takeaways

Despite increasing spend, a lot of groundwork must still be done. Enterprises should avoid getting trapped in an opportunistic “data hoarding” culture and be aware that there is a tipping point at which creating ever greater data silos won’t necessarily lead to bigger returns, especially considering how much of the content is “dark” or “ancient.” As a matter of fact, the outcome of any big data analytics project is only as good as the quality of the data being utilized.
To a great extent, this comes down to a well-implemented governance model that sets “good data” apart from “big data.”

Cloudification can make great economic sense and enable ample use cases, but it needs solid planning in order not to be led up the garden path.

While it might at first be perceived, incorrectly, as a rather boring housekeeping exercise, putting a solid data governance model in place is tightly correlated with the success of the data-savvy enterprise and follows two basic principles, both directly related to the firm’s balance sheet: gaining strategic insights to produce new digital revenue streams, and eliminating unnecessary costs for managing vast amounts of useless data.
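As a quick sanity check on the figures cited above, the IDC growth projection and the Veritas storage breakdown can be reproduced in a few lines of Python. The 1,000 TB estate size is an arbitrary assumption for illustration, not a figure from the report:

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values over a span of years."""
    return (end_value / start_value) ** (1.0 / years) - 1.0

# IDC projection: big data analytics market, $130B (2016) -> $203B (2020).
growth = cagr(130.0, 203.0, 4)
print(f"CAGR: {growth:.1%}")  # ~11.8% with these rounded inputs, in line with IDC's 11.7 percent

# Veritas breakdown of a typical corporate data estate,
# applied here to a hypothetical 1,000 TB estate.
breakdown = {
    "dark": 0.52,                        # value unknown
    "redundant_obsolete_trivial": 0.33,  # known to be useless
    "business_critical": 0.15,
}
estate_tb = 1000.0
by_category_tb = {name: share * estate_tb for name, share in breakdown.items()}
print(by_category_tb)  # 520 TB dark, 330 TB ROT, 150 TB business critical
```

On these numbers, roughly 850 of every 1,000 TB stored is dark or ROT data, which is the arithmetic behind the “digital cemetery” cost the article warns about.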