A data lake is a massive storage repository that can hold all types of data until it is needed for business analytics or data mining. But it’s not a panacea for big-data projects.

1. The concept is still quite new.

The term data lake, credited to Pentaho CTO James Dixon, has been bandied about for several years. But the idea of data lakes as corporate resources is still in its infancy, according to IDC analyst Ashish Nadkarni. A data lake is defined as a massive, relatively cheap storage repository, such as Hadoop, that can hold all types of data until it is needed for business analytics or data mining. A data lake holds data in its rawest form, unprocessed and ungoverned.

2. You can’t buy a ready-to-use data lake.

Vendors are marketing data lakes as a panacea for big-data projects, but that’s a fallacy, according to Gartner. “Like data warehouses, data lakes are a concept, not a technology,” says Gartner analyst Nick Heudecker. “You can use several technologies to build a data lake. At its core, a data lake is a data storage strategy.”

3. Lakes have big appetites for data.

Data lakes are designed for data ingestion, the procedure that involves gathering, importing and processing data for storage or later use. “Where the storage cost model of a data warehouse may not lend itself to wholesale data ingestion, a data lake does,” Heudecker says. “Also, a data lake doesn’t require the users to create a schema before data is available for use. Data can simply be ingested and the schema created and applied when the data is read.”

4. You must involve multiple facets of the business.

Data lakes are resources for the entire organization, not just IT. Therefore, all interested parties should be involved in planning data lake projects. “It is central to the firm’s big-data architecture, and therefore cannot be implemented in isolation,” Nadkarni says. In addition to IT managers, a data lake project should involve business leaders and users.
Storage experts also need to play a key role. “At the end of the day,” Nadkarni says, “it is a storage platform, and therefore [companies] should involve the storage team in its design and implementation.”

5. The biggest benefits don’t come from technology.

The business value of a data lake has very little to do with the underlying technologies chosen, Heudecker says. “Instead, the business value is derived from the data-science skills you can apply to the lake,” he says. “Data lakes aren’t a replacement for existing analytical platforms or infrastructure. Instead, they complement existing efforts and support the discovery of new questions.” Once those questions are discovered, he says, you then “optimize” for the answers. “Optimizing may mean moving out of the lake and into data marts or data warehouses,” Heudecker says.
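For readers who want to see what Heudecker’s point about ingesting data first and applying a schema only when it is read might look like in practice, here is a minimal Python sketch. It is an illustration, not a description of any vendor’s product: the in-memory list `raw_store`, the helper names `ingest` and `read_with_schema`, and the sample records are all hypothetical stand-ins for a real storage layer such as Hadoop.

```python
import json

# Hypothetical in-memory stand-in for a data lake: raw records are kept
# exactly as they arrive, with no schema enforced at write time.
raw_store = []


def ingest(raw_line):
    """Store a raw record without parsing or validating it."""
    raw_store.append(raw_line)


def read_with_schema(schema):
    """Apply a schema at read time.

    `schema` maps a field name to a callable that converts the raw value.
    Records that are not valid JSON, lack a required field, or hold a value
    that cannot be converted are skipped rather than rejected at ingestion.
    """
    rows = []
    for line in raw_store:
        try:
            record = json.loads(line)
            row = {field: cast(record[field]) for field, cast in schema.items()}
        except (KeyError, TypeError, ValueError):
            continue
        rows.append(row)
    return rows


# Ingest records of different shapes; nothing is checked on the way in.
ingest('{"user": "ana", "amount": "19.90", "channel": "web"}')
ingest('{"user": "raj", "amount": 5, "note": "refund pending"}')
ingest('{"sensor": "t-17", "reading": 21.4}')

# The schema is chosen only when a question is asked of the data.
sales_schema = {"user": str, "amount": float}
sales = read_with_schema(sales_schema)
```

The same raw records can later be read with a different schema, for example `{"sensor": str, "reading": float}`, without re-ingesting anything. A data warehouse, by contrast, would typically require the table structure to be defined before the first record is loaded.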