Enterprises should abandon traditional ETL approaches and adopt future state architectures that distribute the data transformation burden.

As enterprises move ahead in their digital transformation journeys, the trail of data left by digital transactions keeps growing, yet for many organizations extracting intelligence from that data remains a pipe dream. According to analyst firm IDC's Worldwide Global DataSphere Forecast, 2021–2025, business and consumer data is growing at a compound annual growth rate (CAGR) of about 23% over the forecast period, with a 28% CAGR attributed to enterprises, and is expected to reach 180 zettabytes by 2025. Data created in the cloud is growing at 36% annually, while data collected at the edge through IoT and sensing devices is growing at 33% annually and will make up 22% of the total global datasphere by 2025.

For enterprises, making data compute-ready becomes more complex as the volume of data grows, yet companies spend little time and effort developing the effective data management processes and platforms needed to make that data easily actionable. For example, many companies collect massive amounts of digital transaction data pertaining to their customers, orders, product use, install base, service tickets, crash logs, and market intelligence, but have no good way of creating a 360-degree view of each customer or their business, despite having more technology choices available to them than ever for extracting intelligence from data.

Many enterprises have reached a state where it is clear that the data they possess neither provides a lasting competitive advantage nor allows them to easily unlock value from it. At the same time, this expanded data ownership raises confidentiality concerns, increases enforcement costs, and adds to the complexity of their environments.

Toward a better data management strategy

Current state architectures are the result of amassing data without first developing a strategy for using it effectively and intelligently, of implementing a complex mix of technologies and fragmented processes, and of relying on data engineering practices built on a weak data foundation. For the most part, these foundations are based on the extract, transform, and load (ETL) method: extracting data from a number of sources, transforming it into a specific format on an ETL server, and then loading it into a data warehouse where it can be analyzed and, hopefully, presented as business intelligence.

However, the transformation step can be complex and compute intensive, because the data must be translated into a format that line-of-business databases can recognize and use. It can also take significant time, since the process involves a great deal of I/O activity, string processing, and data parsing.

A better data management strategy starts with shuffling the letters "ETL" a bit, employing a process that begins with extracting the data and then loading it into specific data repositories that individually transform it into a more useful and relevant form. This ELT methodology loads the data into the target system before transforming it, shifting transformation duties to individual cloud-based data warehouses. Instead of using a single ETL engine or server to transform all the structured and unstructured raw data, an ELT approach channels segments of the data to specific cloud data warehouses, where each portion is transformed individually. The result is less I/O time and faster parsing.
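To make the distinction concrete, the following is a minimal ELT sketch in Python. It uses the standard library's sqlite3 module as a stand-in for a cloud data warehouse such as Snowflake or BigQuery, and the source data, table names, and columns are illustrative assumptions rather than a reference implementation. In a traditional ETL pipeline, the transform step would run on a dedicated ETL server before the load; here the raw rows land first and the target system performs the transformation in SQL.

```python
# Minimal ELT sketch: extract raw records, load them unchanged into the
# target store, then transform *inside* the store with SQL.
# sqlite3 stands in for a cloud data warehouse; the CSV payload, table
# names, and columns are hypothetical.
import csv
import io
import sqlite3

RAW_ORDERS_CSV = """order_id,customer,amount,currency
1001,Acme Corp,1200.50,USD
1002,Globex,980.00,EUR
"""

def extract() -> list[dict]:
    """Extract: pull raw rows from a source (CSV export, API dump, etc.)."""
    return list(csv.DictReader(io.StringIO(RAW_ORDERS_CSV)))

def load(conn: sqlite3.Connection, rows: list[dict]) -> None:
    """Load: land the rows as-is in a staging table -- no transformation yet."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS staging_orders "
        "(order_id TEXT, customer TEXT, amount TEXT, currency TEXT)"
    )
    conn.executemany(
        "INSERT INTO staging_orders VALUES "
        "(:order_id, :customer, :amount, :currency)",
        rows,
    )

def transform(conn: sqlite3.Connection) -> None:
    """Transform: the warehouse itself casts, cleans, and reshapes the data."""
    conn.execute(
        "CREATE TABLE orders AS "
        "SELECT CAST(order_id AS INTEGER) AS order_id, "
        "       TRIM(customer) AS customer, "
        "       CAST(amount AS REAL) AS amount "
        "FROM staging_orders WHERE currency = 'USD'"
    )

conn = sqlite3.connect(":memory:")
load(conn, extract())      # E and L happen before any T
transform(conn)            # T runs where the data already lives
print(conn.execute("SELECT * FROM orders").fetchall())
```

In production, the transform step would typically be SQL executed by the warehouse engine itself, often orchestrated by a tool such as dbt, so the compute-heavy parsing and casting happen where the data lives rather than on a central ETL server.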
Less chaos, more intelligence

Future state data architectures, based on an ELT structure, will focus on building a strong data foundation layer and a platform-based approach that provides an all-encompassing data management solution for the entire organization. Whether the data is IoT telemetry, clickstreams, sales and marketing intelligence, business metrics, or user analytics, future architectures will rely on a cohesive platform to close the gap between acquiring data and unlocking its value.

Some of the key considerations for the future state architecture are:

- Implementation of foundation layer capabilities, including connectors, event streaming, source writebacks, and MapReduce.
- A next layer comprising the data management lifecycle, data modeling, schema enforcement, data privacy, governance, consents, security, data projects, and stewardship.
- At the heart of the architecture, a discovery and self-learning engine that can crawl and retrieve data from various sources in the ecosystem, constantly adapting to changing business needs and ingesting the right amount of compute-ready data.
- Abstraction of data structure and persistence to provide solutions for data residency, meeting the realities of complying with data privacy regulations.

The end goal of future state architectures is to eliminate long-running queries and joins over business data by acquiring data elements that are already compute-ready, leading to optimal use of data storage and processing resources. This will not only reduce the amount of data stored to a fraction of what we store today, but will also increase the speed at which businesses can unlock useful and actionable business intelligence.
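To close with a concrete illustration of that goal, below is a hedged sketch of what a compute-ready data element might look like, again using sqlite3 as a stand-in for a warehouse; the customer_360 table, its schema, and the sample rows are hypothetical. The pre-joined summary is maintained at ingest time, so business intelligence queries read one narrow table instead of repeating long-running joins.

```python
# Sketch of a "compute-ready" data element: instead of running a long
# join across transactional tables at query time, the pipeline maintains
# a pre-joined, pre-aggregated summary at ingest time.
# sqlite3 stands in for a warehouse; all names and rows are hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders    (order_id INTEGER, customer_id INTEGER, amount REAL);
CREATE TABLE tickets   (ticket_id INTEGER, customer_id INTEGER, status TEXT);

INSERT INTO customers VALUES (1, 'Acme Corp'), (2, 'Globex');
INSERT INTO orders    VALUES (10, 1, 1200.5), (11, 1, 300.0), (12, 2, 980.0);
INSERT INTO tickets   VALUES (7, 1, 'open'), (8, 2, 'closed');

-- The compute-ready element: a 360-degree customer summary refreshed at
-- ingest time, so BI queries never repeat this join.
CREATE TABLE customer_360 AS
SELECT c.customer_id,
       c.name,
       (SELECT COALESCE(SUM(amount), 0) FROM orders o
         WHERE o.customer_id = c.customer_id) AS lifetime_value,
       (SELECT COUNT(*) FROM tickets t
         WHERE t.customer_id = c.customer_id
           AND t.status = 'open') AS open_tickets
FROM customers c;
""")

# Downstream consumers read one narrow table -- no joins, no parsing.
for row in conn.execute("SELECT * FROM customer_360 ORDER BY customer_id"):
    print(row)   # (1, 'Acme Corp', 1500.5, 1) then (2, 'Globex', 980.0, 0)
```

Because the join cost is paid once per ingest rather than once per query, both the data scanned and the query latency shrink, which is exactly the resource optimization the future state architecture is after.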