Data analysis is widely acknowledged as a key technology behind digital transformation. The desire to create new digital services and to react faster to information often drives new analytics and AI initiatives.
Data analysts tend to focus on data and how it can be used to solve business problems. We always hear requests for more data and higher-quality data. This data-centric focus has led to centralized architectures for analytics, especially since most data historically originated on premises.
Over time, more data has originated off premises, driven by cloud computing and edge devices. Our natural tendency is to stream that data back to a central location for analysis. This tendency drove the development of data streaming architectures like Lambda and Kappa, and calls for ‘fast data’ in addition to ‘big data.’
Meanwhile, research firm Gartner estimates that, by 2025, 75% of all data will be created or processed at the edge, outside traditional data centers. Our centralized analytics architectures cannot evolve fast enough to meet this challenge, especially with the expected data volumes and speeds of 5G technology.
One of the pillars of digital transformation is a radical rethinking of the way we view and use technology in business. A data-centric view of analytics leads us to think about data being ‘created’, especially at the endpoints and edge of the network. An alternative view is to look at what is behind the data creation.
The reality is that the data is just an artifact. Something occurred to generate that data ― a sensor sampled motor speed, a customer used an ATM, or a product was added to a shopping cart. These examples are all events, and a recent blog series from Confluent highlights the impact of an event-first perspective on systems architecture.
The same architectural challenges raised in that series of posts also apply to analytics. Analytics architectures need to evolve faster to meet agile business needs. We also need analytics to react faster to events, especially at the edge.
There are emerging architectures for analytics that look at things from an event perspective, and these architectures are beginning to have a fundamental impact on the way we deploy and scale analytics. What do these new analytics architectures look like? Here are some examples of emerging patterns:
- Machine learning models are looking at and reacting to events, not data. These events are being streamed and buffered in message queues, available to any model that is interested in them.
- Machine learning models are deployed at the edge, often as microservices, sometimes embedded in devices. They can react quickly, locally, and in a distributed fashion, and can generate new events for other models to process.
- Models execute and evolve independently of each other. The loose coupling of the event model reduces model complexity and enables cloud native development and deployment.
- Model training can be centralized, distributed around the edge, or a combination of these two. Edge training can reduce data transfer and reduce turnaround times, while centralized training can provide more accuracy on compute-intensive clusters with larger data sets.
- The overall development and deployment flow looks more like cloud-native architectures than classic monolithic architectures. We are seeing the beginnings of a distributed core for analytics.
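The first three patterns above can be sketched in a few lines of code. The example below is a minimal, illustrative sketch only: it uses Python's in-memory `queue` module as a stand-in for a real message broker (such as Kafka), and the event fields, model logic, and threshold value are all assumptions, not part of any Dell EMC or Confluent product.

```python
import queue
import statistics

events = queue.Queue()   # stand-in for a broker topic of sensor events
alerts = queue.Queue()   # derived events, available to any downstream model

def motor_speed_model(window_size=5, threshold=1200.0):
    """Consume motor-speed events, keep a sliding window of readings,
    and emit a new 'anomaly' event when the rolling mean exceeds the
    threshold. The model reacts to events rather than querying a
    central data store."""
    window = []
    while not events.empty():
        event = events.get()
        window.append(event["rpm"])
        window = window[-window_size:]   # keep only the latest readings
        if statistics.mean(window) > threshold:
            # Loose coupling: the model publishes a new event instead of
            # calling another model directly, so consumers evolve
            # independently.
            alerts.put({"type": "anomaly",
                        "mean_rpm": statistics.mean(window)})

# Simulate a sensor publishing speed events to the topic.
for rpm in [1000, 1100, 1150, 1300, 1400, 1500]:
    events.put({"type": "motor_speed", "rpm": rpm})

motor_speed_model()
print(alerts.qsize())
```

In a production deployment, the two queues would be broker topics, and the model function would run as a microservice at the edge, subscribed to one topic and publishing to another.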
To get started on a journey to event-based analytics, the Dell EMC Architecture Guide for Real-Time Data Streaming provides guidance on implementing a full Confluent Enterprise stack, while our AI Ready Solutions hub is a good starting point for machine learning, deep learning, and artificial intelligence deployments.
To learn more about unlocking the value of data with data analytics solutions, explore Dell EMC Ready Solutions for Data Analytics.
Michael Pittaro is a distinguished engineer for Cloud and Big Data Solutions at Dell EMC.