Modernizing the data warehouse

BrandPost
By Keith Shaw
Aug 17, 2021

Growing data volumes and use cases are straining the traditional data warehouse. A cloud-based approach can provide much-needed scale and flexibility.

Years of exponential data growth, evolving business needs, and rising maintenance costs have put a strain on existing data infrastructure. The traditional data warehouse, which struggles to handle data from new sources or to support newer techniques such as machine learning and predictive analytics, requires a makeover. What’s needed is not just a new coat of paint or larger data center space – the fundamental architecture needs to change to give organizations the flexibility to support current workloads and prepare for future data-driven innovations.

Modernizing the data warehouse begins with moving to the cloud. This can involve migrating an existing data warehouse, but a better approach may be to start with a use case that the current infrastructure does not serve well. “For example, many organizations decide they want to do something with machine learning to reduce their customer churn, so they run a proof-of-concept project,” says Rahul Pathak, Vice President for Analytics at AWS. “That helps them understand how to work with the cloud and how to manage data in the cloud. Then that success leads to more momentum, which may then bring in legacy processes.”

One of the biggest reasons companies decide to modernize their data warehouse is performance. When data volumes grow, or when organizations make analytics and reports available to more users, they end up having to choose between slow query performance and expensive, time-consuming upgrades. Some IT teams may even discourage adding data, users, or queries in order to protect existing service-level agreements.

A cloud-based data warehouse makes it easier to scale without performance trade-offs, as companies can provision the resources they actually need and then scale capacity up or down as business needs change.

One common misconception, Pathak says, is that a cloud-based service will be slower than an on-premises data warehouse due to network performance and latency in the Internet backbone. “Given the advances in network performance and network latency, and the backbone investments we’ve made, network performance to AWS is great,” says Pathak. “We’ve also made dramatic improvements in silicon, in our compute layers, and in our software. These innovations often outpace what customers can do in their own data centers, because in our case we refresh things continuously.”

To further boost performance, AWS recently released AQUA (Advanced Query Accelerator), a distributed, hardware-accelerated cache that, according to Pathak, allows Amazon Redshift queries to run up to 10 times faster than other cloud data warehouses by automatically boosting certain operations. AQUA accelerates scan, filtering, and aggregation operations today, he adds, with more operations coming soon.

A cloud-based data warehouse, deployed as part of a modern data strategy that includes data lakes, purpose-built data stores, and a unified governance approach, will better position organizations to adapt to changing markets and customer needs. “This approach will endure because it’s the right way to build things,” says Pathak. “It’s decoupled. It lets you scale. It gives you option value for the future, and lets you bring in new technology choices.”
