5 steps to building a modern data foundation

BrandPost By Joanie Wexler
Aug 17, 2020
Data Management, Security

Using data as a springboard for innovation requires the ability to see, analyze, and act on data across the enterprise. Here’s how you can get there.


Organizations that once stored gigabytes of data now find themselves having to manage petabytes or even exabytes across their IT infrastructures. The borderless Internet, which allows applications to reach nearly anywhere, is one reason for the unprecedented data growth. Perhaps an even bigger driver is the increasing use of the public cloud for highly accessible and cost-effective compute and storage services.

To harness the value of their massive data volumes, companies are building modern, cloud-based data infrastructures that help them create a universal version of “truth.” These foundations unify siloed pockets of data for a holistic view, empowering everyone in an organization to make better-informed decisions and act with confidence.

“Exponential data growth has been happening for a while, but now there’s so much more that can be done with it,” says Herain Oberoi, Director of Product Marketing, Databases, Analytics, and Blockchain at Amazon Web Services (AWS). “Cloud economics have removed the constraints of having to decide what data to store and what to discard. Now, you can keep and process it all in real time and take immediate action on it.”

Creating a ‘flywheel’ framework

AWS outlines five fundamental steps to building a modern, cloud-based data foundation. The steps are intended to help you get the most from your data by guiding you toward better decisions about which products to develop, how to find new revenue streams, where you might automate manual processes, and ways to win and retain customers. The framework uses a flywheel concept popularized by author Jim Collins, whereby each component feeds the others to continually drive momentum in capturing maximum value from your data.

The five steps are not linear, which gives organizations flexibility depending on their current level of data maturity. “You can start anywhere, and they build on each other,” says Oberoi.

  1. Break free from legacy databases. This step represents the “low-hanging fruit,” Oberoi says. Many organizations still have legacy, proprietary databases, which are expensive, create lock-in, and carry punitive licensing terms, he says. These issues can be resolved by moving to open-source databases. Oberoi cautions, however, that getting the same performance as with commercial-grade databases isn’t guaranteed. He advises making sure your open-source database delivers the cost efficiencies you seek without causing a performance or availability hit.
  2. Move to managed services in the cloud. As open source and other database platforms begin to scale, IT time and administrative costs can grow as well. Many organizations still self-manage their databases, focusing on operational tasks such as hardware and software installation, configuration, patching, backups, performance tuning, and configuring for high availability, security, and compliance. “All that time spent administering means less time analyzing data or innovating on an application,” Oberoi says. Cloud-based, managed database services reduce time spent on this “undifferentiated heavy lifting” so teams can focus on higher-value activities, he says.
  3. Modernize your data warehouse. Traditional data warehouses don’t have the ability to effectively store and analyze the growing volume and variety of data, which leads to data being stored in multiple silos. Giving your data flywheel the push it needs for self-sustaining momentum requires a modern data warehouse approach, including a data lake, which can store unlimited volumes of data in various structured and unstructured formats. This “lake house” approach makes it much easier to catalog data, make it accessible, and analyze it across the business.
  4. Build modern apps with purpose-built databases. The days of developers building a monolithic application with a single relational database are fading quickly. Instead, developers are breaking complex applications into smaller pieces with a microservices architecture, then picking the best purpose-built database to solve each problem. This method frees the application from having to use a single, overburdened database for every use case and “delivers the high performance, scale, and agility that allows organizations to innovate faster,” says Shawn Bice, Vice President of Databases, AWS.
  5. Turn data into insights. A data lake provides a central repository for storing all types of data, as-is, at scale. Oberoi advises creating and maintaining an online data catalog to avoid the data lake devolving into the dreaded “data swamp.” “You can analyze real-time streaming data, determine operational health, and quickly diagnose and fix problems. You can also predict what might be coming instead of analyzing only what’s happened in the past,” he says.
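
To make the data catalog idea in the last step more concrete, here is a minimal sketch in Python of what a catalog does at its core: it records where each dataset lives, what format it is in, and who owns it, so that people can find data instead of guessing. The class names, fields, and storage paths below are hypothetical and chosen only for illustration. They are not an AWS API, and a production catalog would be a managed, shared service, not an in-memory dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetEntry:
    """Metadata for one dataset in the lake (hypothetical schema)."""
    name: str
    location: str      # illustrative storage path, not a real bucket
    data_format: str   # e.g. "parquet", "json", "csv"
    owner: str
    tags: set = field(default_factory=set)


class DataCatalog:
    """Toy in-memory catalog: register datasets, then look them up."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        # Refuse duplicate names so every dataset has exactly one description.
        if entry.name in self._entries:
            raise ValueError(f"dataset already registered: {entry.name}")
        self._entries[entry.name] = entry

    def lookup(self, name):
        # Raises KeyError if the dataset was never cataloged.
        return self._entries[name]

    def find_by_tag(self, tag):
        # Return the names of all datasets carrying the tag, sorted for stable output.
        return sorted(e.name for e in self._entries.values() if tag in e.tags)


catalog = DataCatalog()
catalog.register(DatasetEntry("orders", "lake://example/orders/", "parquet",
                              "sales-team", {"sales", "finance"}))
catalog.register(DatasetEntry("returns", "lake://example/returns/", "csv",
                              "sales-team", {"sales"}))
catalog.register(DatasetEntry("clickstream", "lake://example/clicks/", "json",
                              "web-team", {"marketing"}))

print(catalog.find_by_tag("sales"))  # ['orders', 'returns']
```

Even in this toy form, the design choice is visible: the catalog stores descriptions of the data, not the data itself, which is what lets a lake hold files in many formats while staying searchable.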

A data lake has been a game-changer for Amazon.com. “Five years ago, we were limited in our ability to grow and analyze our business,” says Jeff Carter, Amazon’s VP of Data, Analytics, and Machine Learning. Amazon made the strategic decision to move all its data off a traditional Oracle database and into an AWS S3 data lake. “By migrating to the [cloud], we have been able to scale to meet our business needs” while lowering the cost to maintain the architecture by 30% to 50%, Carter says.

Data is one of the most valuable assets of any organization. Unlocking its value is a catalyst to positive business outcomes, from improving operational efficiencies to delighting customers. A modern, cloud-based data infrastructure provides a foundation for smarter, data-driven decisions.

Learn more about ways to reinvent your business with data.

For more data and analytics insights from Herain Oberoi, Shawn Bice, and other experts, check out the new Ahead of the Pack podcast.