Data Warehousing in the Cloud

Data warehousing is moving from its traditional home in the data center to the increased capacity and flexibility of cloud platforms. Make sure your organization has the training and certifications to support it.


It should come as no surprise that traditional enterprise data storage is undergoing the same digital transformation as the rest of enterprise IT. Data warehousing technologies and techniques have traditionally centered on on-premises data centers and massive physical storage servers like Oracle Exadata or Teradata Extreme Performance, as well as data warehousing appliances like IBM Netezza or Smart Analytics and HP Neoview. But data warehousing is moving to the cloud, as many other enterprise functions already have. This requires new skill sets covering new technologies from cloud providers like Amazon, Microsoft, and Google, as well as business solution providers like Oracle and SAP.

Building and deploying a data warehouse involves integrating structured and unstructured data from numerous disparate sources. This aggregated data drives an enterprise’s reporting and data analysis efforts, and is a core component of business intelligence.

Organizations of all sizes across all industries continue to amass data at a staggering rate. Research firm Statista forecasts the total amount of data created will reach 175 zettabytes in 2025. And with that growth in data comes growth in value. The global market for the data and business analytics industry was valued at $168.8 billion in 2018 and is forecast to grow to $274.3 billion by 2022.
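To put the market forecast in annual terms, the two figures above can be converted into a compound annual growth rate. The following is a minimal sketch, not part of the original forecast: the `cagr` helper is a hypothetical name introduced for illustration, and it assumes steady compounding over the four years from 2018 to 2022.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Return the compound annual growth rate as a fraction (0.10 means 10% a year)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the article, in billions of US dollars.
market_2018 = 168.8
market_2022 = 274.3

# Assumption: growth compounds evenly across the four-year span.
rate = cagr(market_2018, market_2022, 2022 - 2018)
print(f"Implied annual growth: {rate:.1%}")
```

Under that assumption, the forecast works out to roughly 12.9 percent growth per year.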

“The most common mistake is throwing out any data at this point, so make sure you have a data lake. That’s priority number one,” says Myles Brown, Senior Cloud and DevOps Advisor for ExitCertified. “After that, you have to prevent the data lake from becoming a data swamp with all kinds of data sitting there, unregulated. You have to keep track of where the data came from, where it should go, and who’s allowed to access what, so the initial setup of a data lake is a big hurdle for a lot of people.”
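The bookkeeping Brown describes (where data came from, where it should go, and who may access it) can be pictured as a simple catalog. The sketch below is purely illustrative and does not represent any vendor's tooling: `CatalogEntry`, `DataLakeCatalog`, and the dataset and role names are all hypothetical, and a real data lake would rely on a managed catalog and access-control service.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """Hypothetical metadata record for one dataset in the lake."""
    dataset: str
    source: str        # where the data came from
    destination: str   # where it should go
    allowed_roles: set[str] = field(default_factory=set)  # who may access it


class DataLakeCatalog:
    """Hypothetical in-memory catalog; illustrative only."""

    def __init__(self) -> None:
        self._entries: dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        self._entries[entry.dataset] = entry

    def can_access(self, dataset: str, role: str) -> bool:
        # Unknown datasets are denied by default.
        entry = self._entries.get(dataset)
        return entry is not None and role in entry.allowed_roles

    def untracked(self, datasets_in_lake: list[str]) -> list[str]:
        # Datasets present in the lake but missing from the catalog:
        # the "swamp" candidates.
        return sorted(set(datasets_in_lake) - set(self._entries))


catalog = DataLakeCatalog()
catalog.register(CatalogEntry(
    dataset="web_clickstream",
    source="cdn-logs",
    destination="marketing_warehouse",
    allowed_roles={"analyst"},
))

print(catalog.can_access("web_clickstream", "analyst"))
print(catalog.can_access("web_clickstream", "intern"))
print(catalog.untracked(["web_clickstream", "legacy_export"]))
```

Here the analyst role is granted access, the intern role is not, and `legacy_export` is flagged as sitting in the lake without any recorded origin, destination, or access policy.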

These days, data warehousing efforts are more focused on cloud platforms. Each of the major cloud providers (AWS, Microsoft Azure, and Google) has its own data warehousing tools that work on its platform. Tools like Amazon Redshift, Azure SQL Data Warehouse, and Google BigQuery, each built for the cloud platform on which it runs, can help IT staff run data analytics workloads on data stored within AWS, Azure, and Google Cloud, respectively. Less pervasive cloud platforms like Oracle and IBM also have data warehousing tools, such as Oracle Autonomous Data Warehouse Cloud and IBM Db2® Warehouse on Cloud.

Moving data warehousing functions to the cloud not only helps with storage capacity and access, but can also help with security and compliance. Cloud platforms, the supporting security technologies, and the shared security model make the public cloud quite secure. “Once you learn more about cloud, you start realizing the extent of its security,” says Brown. “Think of Amazon: they’ve got just as much at stake as we do to ensure the cloud is secure. These days, we find security is the number one reason people go to the cloud.”

IT staff responsible for managing corporate data stores increasingly need training and certification in a variety of cloud-based data warehousing solutions.

Copyright © 2020 IDG Communications, Inc.