The world’s data has been growing exponentially for the past several years. Mobile computing, IoT machine data, social media, log files, and clickstreams generate huge volumes of data in a variety of formats. The size, complexity, and varied sources of the data mean the same technology and approaches that worked in the past don’t work anymore.
Older IT infrastructures have become costly to maintain, update, scale, and secure. They tend to contain data silos that are complex and time-consuming to integrate. As a result, information is often inaccessible to users outside a given department, causing reporting gaps or bottlenecks.
These environments obscure the end-to-end visibility businesses require to derive accurate insights from their data and use it strategically. Running analytics against data that’s incomplete or stale can cause faulty decision-making or missed opportunities.
The global pandemic underscored what many IT leaders have been working toward for some time: Businesses need IT infrastructure that is flexible, agile, and adaptable to rapid change. Adopting a highly flexible data architecture, leveraging the cloud for storage and compute needed to manage massive data volumes, will put organizations in a better position to address the next disruption.
Attributes of a modern, cloud-based data infrastructure
Adopting a modern, cloud-based data infrastructure means moving from antiquated, monolithic apps that run on one-size-fits-all relational databases to highly distributed, microservices-based systems running on multiple, purpose-built databases. It also means moving from on-premises and old-guard legacy data warehouses to open and flexible data lakes and “lake-house” architectures. These systems should all offer visibility into unified data, high levels of automation, and company-wide accessibility balanced with governance.
A data lake, for example, helps solve the fragmented data issue by storing all types of data in a central repository that allows organizations to store all their data as-is, without having to first structure it. IT or business users can run different types of analytics on that data—from dashboards and visualizations to big data processing, real-time analytics, and machine learning—to improve decision-making. A properly constructed data lake supports unlimited amounts of data as well as open, standards-based formats that let data scientists, app developers, business analysts, and others use a variety of different analytics tools.
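The “store as-is, structure later” idea can be sketched in a few lines. This is an illustrative toy only: a local directory stands in for a cloud object store such as Amazon S3, and all paths, partition names, and field names are hypothetical.

```python
import json
from pathlib import Path

# Illustrative sketch: a local directory stands in for a cloud object store.
# Records land exactly as produced (no upfront schema), partitioned by
# source and date; schema is applied later, at read time.
LAKE = Path("lake/raw/events")

def ingest(record, source, day):
    """Land a record as-is under a source/date partition."""
    part = LAKE / f"source={source}" / f"dt={day}"
    part.mkdir(parents=True, exist_ok=True)
    out = part / "events.jsonl"
    with out.open("a") as f:
        f.write(json.dumps(record) + "\n")
    return out

def scan(source):
    """A later analytics job reads the raw files and applies schema on read."""
    records = []
    for path in (LAKE / f"source={source}").rglob("*.jsonl"):
        with path.open() as f:
            records.extend(json.loads(line) for line in f)
    return records

# Heterogeneous sources keep their own shapes until query time.
ingest({"user": "a1", "clicks": 3}, source="web", day="2020-11-01")
ingest({"device": "bike-7", "rpm": 92}, source="iot", day="2020-11-01")
```

The point of the sketch is the ingestion side: nothing forced the web clickstream and the IoT reading into one schema before they were stored, which is what lets different teams run different analytics tools over the same repository later.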
Specifically, a modern, cloud-based data infrastructure has the following characteristics, says Herain Oberoi, Director of Product Marketing, Databases, Analytics, and Blockchain at Amazon Web Services (AWS):
- Storage that’s low cost, elastic, and reliable—traits typically achieved by deploying storage in the cloud
- Security and governance that adhere to the government regulations for your industry and geography, including where you’re allowed to store data
- A holistic, integrated view of your operational databases, data warehouses, and data lakes
- A broad set of analytics capabilities, including machine learning and the ability to process streaming data in real time
- Dynamic detection of, and response to, changes
- Automation, including continuous learning and adaptability
- A catalog of all your organization’s data that makes it easily discoverable; governs access to that data; and tracks where it’s from, who owns it, and the workflows associated with it
- Fully managed services for an organization’s most common open-source databases or analytics deployments
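One capability from the list above—processing streaming data in real time—can be sketched as a simple windowed aggregation. This is a minimal in-memory illustration, not a production design; in a cloud deployment a managed streaming service would perform this role at scale, and all names here are hypothetical.

```python
from collections import defaultdict

# Minimal sketch of real-time stream aggregation: each arriving event
# carries a timestamp and is rolled up into a fixed 60-second window,
# so dashboards can read live per-window counts at any moment.
WINDOW = 60  # window size in seconds

def window_start(ts):
    """Map a timestamp to the start of its 60-second window."""
    return ts - (ts % WINDOW)

class StreamAggregator:
    def __init__(self):
        self.counts = defaultdict(int)

    def on_event(self, event):
        """Called once per event as it streams in; updates the live window."""
        self.counts[window_start(event["ts"])] += 1

    def snapshot(self):
        """Current per-window counts, readable while data keeps arriving."""
        return dict(self.counts)

agg = StreamAggregator()
for ts in (3, 45, 61, 62, 130):
    agg.on_event({"ts": ts})
print(agg.snapshot())  # {0: 2, 60: 2, 120: 1}
```

The contrast with batch analytics is that `snapshot()` is meaningful at every moment, not only after a nightly job completes—the property the “real time” bullet is pointing at.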
In addition to forming an intelligent analytics platform, Oberoi says, these traits slash costs associated with software licensing fees, hardware infrastructure and maintenance, application development, and administrative overhead. Importantly, capabilities like automation and a managed services model free up in-house teams to focus on high-value activities.
Staying fit with the cloud
A modern, cloud-based data infrastructure has helped fitness brand Equinox quickly scale its business and launch new products and services. Earlier this year, the rollout of a new consumer fitness platform and app called Variis was fast-tracked when COVID-19 hit, as the company anticipated a surge in demand for at-home fitness programs, says Elliott Cordo, VP of Technology Insights at Equinox Media.
The company had previously migrated from a Teradata/SQL Server environment to a highly scalable Amazon Redshift data warehouse and Amazon S3 data lake. The learnings from deploying this cloud-native infrastructure helped Equinox Media quickly build out and scale up Variis.
Leveraging a highly scalable, serverless data platform with services such as AWS Lambda, Amazon DynamoDB, and Amazon S3, “we ramped from 100 beta users to hundreds of thousands in a matter of weeks,” says Cordo. The ability to handle explosive growth clearly demonstrates the advantages of modern data engineering and cloud-native design, he says.
Cordo says Equinox built Variis “at a very predictable cost” and explains that the platform aggregates partner data, account data, and individuals’ workout metrics from exercise cycles and, soon, wearable monitors. That data can be made available to Equinox members in “well under a second of events occurring,” he says.
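The event-driven shape Cordo describes can be sketched as a serverless-style handler. This is a hypothetical illustration, not Equinox’s actual code: the handler mirrors the `(event, context)` signature of an AWS Lambda function, and an in-memory dict stands in for a fast key-value table such as Amazon DynamoDB.

```python
# Hypothetical sketch of an event-driven metrics pipeline: a Lambda-style
# handler receives each workout event and upserts the member's latest
# metrics into a key-value table, so reads can happen well under a second
# after the event occurs. The dict stands in for a managed table.
metrics_table = {}

def handler(event, context=None):
    """Invoked once per incoming event, serverless-function style."""
    member = event["member_id"]
    row = metrics_table.setdefault(member, {})
    row.update(event["metrics"])  # merge the newest readings into the row
    return {"status": "ok", "member_id": member}

handler({"member_id": "m42", "metrics": {"rpm": 95, "watts": 180}})
handler({"member_id": "m42", "metrics": {"rpm": 101}})
print(metrics_table["m42"])  # {'rpm': 101, 'watts': 180}
```

Because each event is handled independently, this shape scales out horizontally—more events simply mean more concurrent invocations, which is what allows the kind of ramp from hundreds to hundreds of thousands of users described above.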
Necessary to compete
The static nature of traditional data architectures makes it tough for businesses to compete with newer companies that started as cloud-native organizations, notes Oberoi. “Those [cloud-native] businesses took advantage of cloud economics for storage, rapid provisioning, and near-infinite scalability from the get-go,” he says.
Enterprises running older platforms need to achieve the same agility so that they, too, can rapidly access the data that empowers them to act decisively and address whatever opportunity or disruption may come next.
Learn more about ways to reinvent your business with data.
For more data and analytics insights from Herain Oberoi, Elliott Cordo, and other experts, check out the new Ahead of the Pack podcast.