If you can’t make sense of your business data, you’re effectively flying blind. Insights hidden in your data are essential for optimizing business operations, fine-tuning your customer experience, and developing new products — or new lines of business, like predictive maintenance. Analytics is the means for discovering those insights, and doing it well requires the right tools for ingesting and preparing data, enriching and tagging it, building and sharing reports, and managing and protecting your data and insights. And as businesses contend with increasingly large amounts of data, the cloud is fast becoming the logical place where analytics work gets done.
For many enterprises, Microsoft Azure has become a central hub for analytics. Taking the broadest possible interpretation of data analytics, Azure offers more than a dozen services — and that’s before you include Power BI, with its AI-powered analysis and new datamart option, or governance-oriented approaches such as Microsoft Purview. Leaving aside more specialized options such as ingesting telemetry, sharing data externally, or building machine learning models to deliver specific analyses, there are still enough Azure analytics services that you might wonder which one is best suited for any given job.
The truth is, Microsoft aims to provide CIOs a full stack of analytics services on Azure designed to work together, rather than a piecemeal approach, says Amir Netz, CTO of Microsoft Analytics. It is geared toward IT chiefs who want to be chief data officers, not chief integration officers, he suggests.
Although there is overlap between the various services, Netz explains that Azure’s analytics services broadly correspond to the layers an organization would build in creating an analytics architecture framework, “from creating the data lake and storing the data, processing the data in the lake and doing the data engineering, the ability to build data warehouses on top of that, to run machine learning algorithms, to do data science, to serve the data to business users,” he says.
Here we take a look at Microsoft Azure’s essential analytics services, what they are used for, and how they come together to make a comprehensive stack for your analytics strategy in the cloud.
1. Azure Analysis Services
If you’re used to SQL Server Analysis Services for business intelligence, Azure Analysis Services offers that enterprise-grade analytics engine as a cloud service that you can also connect to Power BI. But Power BI Premium now offers more powerful features than Azure Analysis Services does, so while the service isn’t going away, Microsoft will offer an automated migration tool in the second half of this year for customers who want to move their data models into Power BI instead.
2. Azure Data Factory
Data Factory is a service for code-free data movement and data transformation pipelines to make it easier to integrate data from various sources into data warehouses: Think ETL (extract, transform, load) and ELT (extract, load, transform) as a service with built-in connectors, but with the emphasis on transforming and enriching data rather than just moving it into the right place (although you can also use it to move data into the cloud). Data Factory includes features such as “code by example” to help users build queries but also has options to use languages such as Python, Java, and .NET with Git and CI/CD support, making it particularly useful for migrating SQL Server Integration Services to Azure.
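The ETL pattern Data Factory automates can be sketched in plain Python. This is a conceptual illustration only, not the Data Factory API; the source records, field names, and `tier` enrichment rule are invented for the example.

```python
# Conceptual ETL sketch: extract raw records, transform (clean and enrich),
# then load into a destination store. Data Factory builds pipelines like
# this visually, with built-in connectors in place of these hand-written steps.

def extract():
    # Stand-in for a source connector (e.g., a CSV file or a REST API)
    return [
        {"customer": " Alice ", "amount": "120.50"},
        {"customer": "Bob", "amount": "80"},
    ]

def transform(rows):
    # Clean and enrich: trim whitespace, cast types, add a derived field
    return [
        {
            "customer": r["customer"].strip(),
            "amount": float(r["amount"]),
            "tier": "gold" if float(r["amount"]) > 100 else "standard",
        }
        for r in rows
    ]

def load(rows, warehouse):
    # Stand-in for a sink connector (e.g., a data warehouse table)
    warehouse.extend(rows)

warehouse = []
load(transform(extract()), warehouse)
print(warehouse[0])  # {'customer': 'Alice', 'amount': 120.5, 'tier': 'gold'}
```

In an ELT variant, the `load` step would run before `transform`, pushing raw data into the destination and transforming it there — which is why Data Factory supports both orderings.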
3. Azure Data Explorer
Data Explorer is a big data analytics platform that you can use, as the name suggests, for exploring data using KQL, the Kusto Query Language. (The project’s codename may or may not be a reference to exploring your ocean of data as if you were Jacques Cousteau.) Azure Data Explorer is used to store and query data in services such as Microsoft Purview, Microsoft Defender for Endpoint, Microsoft Sentinel, and Log Analytics in Azure Monitor.
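To give a flavour of the filter-then-summarize style of KQL, here is an illustrative query alongside the same semantics in plain Python. The table name, columns, and sample events are all invented for the example.

```python
# KQL (illustrative only):
#   AppEvents
#   | where Level == "Error"
#   | summarize count() by Source
#
# The same filter-then-summarize semantics in plain Python:
from collections import Counter

app_events = [  # invented sample data
    {"Level": "Error", "Source": "web"},
    {"Level": "Info",  "Source": "web"},
    {"Level": "Error", "Source": "api"},
    {"Level": "Error", "Source": "web"},
]

errors_by_source = Counter(
    e["Source"] for e in app_events if e["Level"] == "Error"
)
print(errors_by_source)  # Counter({'web': 2, 'api': 1})
```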
4. Azure Data Lake Analytics
Data warehouses are designed for questions you already know you want to ask about your data, again and again. Data lakes, on the other hand, enable you to store structured and unstructured data to explore with new questions that you haven’t asked before. Azure Data Lake Analytics helps you extract, clean, and prepare data from Azure Data Lake using R, Python, .NET, or U-SQL (which combines SQL and C#) to write queries, with key technologies from Azure Cognitive Services included as functions for processing text, speech, and images using machine learning. This is a serverless analytics job service that can handle petabyte-scale data transformation, so you pay for the job rather than needing to manage infrastructure.
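The data lake idea — keep the raw data untyped and impose structure only when you ask a question — is sometimes called schema-on-read, and can be sketched in plain Python. This is a conceptual illustration, not U-SQL; the log format and field names are invented.

```python
# Schema-on-read sketch: raw text sits in the lake untyped; structure is
# imposed only at query time, keeping just the fields the question needs.
raw_lake = [  # invented raw event lines
    "2024-05-01 login alice",
    "2024-05-01 purchase bob",
    "2024-05-02 login carol",
]

def read_events(lines, event_type):
    # Parse on demand: a new question just means a new parser over raw data
    for line in lines:
        date, kind, user = line.split()
        if kind == event_type:
            yield {"date": date, "user": user}

logins = list(read_events(raw_lake, "login"))
print([e["user"] for e in logins])  # ['alice', 'carol']
```

A warehouse, by contrast, would have decided the schema up front — fine for the questions you ask again and again, less so for the ones you haven’t thought of yet.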
5. Azure Synapse Analytics
Synapse Analytics is the alternative to building your own analytics framework — one that multiple teams in your organization then use to extract data from data lakes and build data warehouses that business users have to access and work with separately. It gives you the capabilities of cloud data warehouse and data lake services but lets you run your preferred analytics engine — whether that’s SQL or Spark — over all your data, structured and unstructured, without waiting for ETL processes or worrying about where the data is stored or how to connect to it. Synapse Analytics data flows are powered by Azure Data Factory, and if you use Cosmos DB, transactions in your operational database will be mirrored and available for analytics seconds after they’re recorded, so you can explore big data and relational data together. If the questions are useful enough to ask again and again, you can formalize them with traditional analytics techniques.
6. Azure Databricks
If you want to spin up Spark clusters on demand for transforming, cleaning, and enriching your data, Azure Databricks is an Apache Spark-based big data analytics service optimised for Azure, with data adapters for various data types and an interactive workspace for building Spark dataflows. You can work in Python, Scala, R, Java, or SQL, but it’s particularly suitable for building AI systems: you can use common data science frameworks such as TensorFlow, PyTorch, and scikit-learn, and there’s integration with Azure Machine Learning.
7. Datamarts in Power BI
Think of datamarts as relational databases designed for analytics at the business unit level rather than the enterprise data warehouse level, frequently driven by business users who need to collect data from multiple sources and integrate it in a lightweight way. These users don’t have the skills or the budget to provision a full relational data warehouse in the Azure portal, and they don’t need petabyte or even terabyte scale. Today they’re using technologies such as SharePoint lists or Excel for this, making it an underserved market with less governance than CIOs might prefer.
Datamarts in Power BI Premium are a fully managed, self-service, no-code option for up to 100GB of data with workloads automatically optimised for performance and a user interface that looks like Power Query (although advanced users can write DAX or SQL queries). Datamart discovers relationships between tables and generates the dataset, combining the semantic model of Power BI with the relational database model.
“You don’t need to know anything about how to be a DBA. We don’t ask you about partitioning scheme, or how to create an index,” Netz explains. “You don’t need to know how to write SQL and to import data or to query. Everything is visual. Everything is easy to use. Everything is designed for the user who knows how to create Power BI reports.”
Bonus: Azure Stream Analytics and Azure Time Series Insights
There are new ways to use analytics for which the cloud is particularly well-suited. Traditional analytics focuses on data in databases, but with sensors and IoT devices you have transient, time-sensitive data that you want to process and act on in near real-time. The same is true for the clickstream from web and mobile apps. Azure Stream Analytics enables you to look at data as it streams in and process it immediately to decide whether you want to take action. This processing is best done close to where the data is ingested, so you’d use Event Hubs to collect the data and pass it to Stream Analytics. You can also aggregate data to reduce the amount being stored and query it later to analyze trends or forecast demand, so that you’re storing, say, a moving average of the last second rather than recording the temperature every millisecond.
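That last downsampling idea can be sketched directly in plain Python — a hedged illustration of windowed aggregation with invented sensor readings, not the Stream Analytics query language, which expresses the same thing declaratively with windowing functions.

```python
# Downsample a high-frequency temperature stream into per-window averages,
# storing one value per window instead of every raw reading.
from statistics import mean

def windowed_averages(readings, window_size):
    # readings: raw samples in arrival order; window_size: samples per window
    return [
        mean(readings[i:i + window_size])
        for i in range(0, len(readings), window_size)
    ]

raw = [20.0, 21.0, 22.0, 30.0, 31.0, 32.0]  # invented sensor samples
print(windowed_averages(raw, 3))  # [21.0, 31.0]
```

Six readings collapse to two stored values here; at millisecond sampling rates the storage saving is what makes long-term trend analysis affordable.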
If you don’t want to build your own stack for that kind of analytics, Azure Time Series Insights is an end-to-end platform that takes data from IoT devices for you to monitor, analyze, visualize, and act on. You can use it to spot trends, highlight anomalies, and dig into root causes, and because Azure offers a full stack of analytics services, you can feed that data into other services such as Azure Databricks or use it to build models with Azure Machine Learning. But the market is moving away from single-purpose services, and Time Series Insights is being deprecated in favour of Azure Data Explorer.