By Ed Anuff, Chief Product Officer, DataStax
Enterprises across industries have been obsessed with real-time analytics for some time. The market for the technology behind it, which aims to help businesses make critical decisions quickly, is expected to reach $50.1 billion by 2026.
It’s no surprise. The insights provided by analytics “in the moment” can uncover valuable information in customer interactions and alert users or trigger responses as events happen. And these real-time responses are a critical part of building the kind of experiences your customers expect.
But this glittering prize might cause some organizations to overlook something significantly more important: constructing the kind of event-driven data architecture that supports robust real-time analytics.
Learn more about DataStax Astra Streaming, which is now generally available
An enterprise that focuses on building an event-based architecture for real-time applications will be in a much better position to build a real-time analytics platform. Why? Because when your application architecture is closely mapped to your business activities (so-called “events”), you produce the kind of real-time data you need to run real-time analytics, and you do so in a more flexible and scalable way than traditional software architectures allow.
Let’s take a closer look at what real-time events mean in a digital business, and how building an open architecture to make the most of the data these events generate can create better customer experiences and drive revenue.
All interactions are digital interactions
It’s helpful to begin by thinking about what an event is. In a business context, this is defined as an interaction. Interactions with customers, partners, suppliers – your entire value chain – are what drive business.
For a digitally transformed business, all of the interactions are digitally mediated. This is true even when an interaction happens offline, in the physical world. Think about a courier company delivering a package, or an airliner touching down 30 minutes behind schedule: these are digitally mediated offline activities.
These interactions are represented, in a technological sense, as “events,” with a certain amount of importance attributed to when they happen. We can, in the semantics of the software world, refer to digitally mediated business activities as real-time events.
How do businesses manage and take advantage of real-time events? With an event-driven architecture: a software programming approach built around the capture, communication, processing, and persistence of these events – mouse clicks, sensor outputs, and the like. All in real-time, of course.
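To make the four stages concrete, here is a minimal sketch in Python of capture, communication, processing, and persistence happening inside a single process. Everything in it is an assumption made for illustration: the `FlightLanded` event, the `EventBus` class, the handler functions, and the flight number are hypothetical, and a production system would hand these jobs to a streaming platform and a database rather than an in-memory list.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, DefaultDict, List


@dataclass(frozen=True)
class FlightLanded:
    """Hypothetical event: an airliner has touched down."""
    flight: str
    minutes_late: int


class EventBus:
    """Toy in-process dispatcher, standing in for a streaming platform."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[type, List[Callable]] = defaultdict(list)
        self.log: List[object] = []  # stands in for durable persistence

    def subscribe(self, event_type: type, handler: Callable) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event: object) -> List[str]:
        # Persist the captured event, then communicate it to every handler.
        self.log.append(event)
        return [handler(event) for handler in self._handlers[type(event)]]


def check_gate(event: FlightLanded) -> str:
    return f"{event.flight}: check gate availability"


def rebook_connections(event: FlightLanded) -> str:
    # The 30-minute threshold is an arbitrary value chosen for this sketch.
    if event.minutes_late >= 30:
        return f"{event.flight}: rebook passengers with missed connections"
    return f"{event.flight}: connections unaffected"


bus = EventBus()
bus.subscribe(FlightLanded, check_gate)
bus.subscribe(FlightLanded, rebook_connections)

# One business event fans out to every interested handler.
actions = bus.publish(FlightLanded(flight="DS123", minutes_late=30))
for action in actions:
    print(action)
```

The point of the design is that the code publishing the event knows nothing about who reacts to it, so new reactions can be added without touching the producer.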
Processing streams of data in the moment involves taking actions on a series of data originating from a system that continuously creates events. When an airliner lands behind schedule, a wide range of real-time data could trigger actions: gate availability, fuel truck location, missed connections.
Querying a non-stop data stream, recognizing that something important has happened or spotting anomalies, and acting on them quickly and in a meaningful way (like booking a new flight for a passenger who has missed a connection) requires a specific technology stack.
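As a rough illustration of what querying a stream for anomalies can look like, the Python sketch below flags any reading that is far above the recent average. The function name `detect_anomalies`, the window size, the multiplier, and the sample delay figures are all assumptions chosen for the example, not a description of how any particular streaming product works.

```python
from collections import deque
from statistics import mean
from typing import Deque, Iterable, Iterator, Tuple


def detect_anomalies(
    readings: Iterable[float], window: int = 3, factor: float = 2.0
) -> Iterator[Tuple[int, float]]:
    """Yield (index, value) when a value exceeds `factor` times the
    mean of the previous `window` values."""
    recent: Deque[float] = deque(maxlen=window)
    for index, value in enumerate(readings):
        # Only judge a reading once a full window of history exists.
        if len(recent) == window and value > factor * mean(recent):
            yield index, value
        recent.append(value)


# Hypothetical arrival delays, in minutes, for successive flights.
delays = [5, 6, 4, 30, 5, 6]
alerts = list(detect_anomalies(delays))
print(alerts)
```

Because the function is a generator, it consumes one reading at a time and never needs the whole stream in memory, which is the property that matters when the stream does not end.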
The foundation of an event-driven architecture
Many organizations understand the importance of event-driven architectures. Pretty much every aspect of our technological lives has been affected by the move toward event-driven, real-time data processing – the way we communicate, the way we work, the way we order food. The way businesses are run has evolved too: the availability of real-time inventory, sales, and demand data is driving real-time optimization of supply chains across industries.
Returning to the package delivery company example, every interaction – a driver scanning a package, a user looking at a mobile app, a lost package – is an operational event that a software engineer needs to think about.
It’s no surprise that the event-based paradigm has had a big impact on what today’s software architectures look like. Organizations need a stack of technologies that makes real-time data – whether it’s “in motion” and streaming from IoT devices or within an enterprise data ecosystem, or “at rest” and captured in a database – available to be used in the moment.
A real-time data stack has some core components. They should include the ability to scale out fast, and an elastic datastore capable of ingesting and distributing data as it streams in. Organizations dealing with real-time data streams have long leaned toward Apache Cassandra as the database of choice, thanks to its high throughput and scalability and its ability to ingest and distribute data very fast.
High-scale streaming technology, such as Apache Kafka or Apache Pulsar, is another key part of an event-driven architecture. Modern data apps require streaming technologies that can deliver the reactive engagement at the point of interaction that end users have come to expect.
The open data stack
At DataStax, our goal has been to build an open data stack that enables enterprises to mobilize real-time data to build high-scale data apps. It is also a foundational set of technologies that integrates with a host of other products and toolsets (including analytics platforms).
Three important components make up the stack we offer: Astra DB, a database-as-a-service built on Cassandra; Astra Streaming, built on the advanced streaming technology of open source Pulsar; and Change Data Capture (CDC) for Astra DB, which enables the streaming of real-time operational data across an organization’s data ecosystem.
A key part of our stack is the word “open” – and this brings us back to the analytics discussion. Many enterprises find that there’s an impedance mismatch between software systems that aren’t event-based and the kind of real-time analytics that produce the most valuable insights. Companies are left to struggle with stale data that can only represent a view that’s hours or even days old. As the demand skyrockets for up-to-the-moment accuracy to drive smarter, instantaneous decisions and customer experiences, the need to correct this misalignment becomes increasingly urgent.
With an open, real-time data stack, not only does that impedance mismatch problem go away, but organizations are also free to integrate their platform and connect their data to any number of other technologies, platforms, and toolsets – including real-time analytics engines and data stores like Apache Flink, Apache Pinot, and Apache Druid, to name just a few.
Flexibility is built in with an open data stack. Let’s say an organization’s data science team needs to ask a specific business question (an “ad hoc query,” in analytics parlance) of the operational data store – one that isn’t answered by predefined or predetermined datasets. Ad hoc queries are often difficult to solve, particularly on large datasets.
Yet when a stack is built with openness and real-time data driven by events in mind, it becomes relatively simple to pipe data from an operational backend into all manner of data analytics platforms. In the case of DataStax’s offerings, our recent introduction of CDC for Astra DB has essentially enabled us to embed a high-throughput, scale-out streaming capability into the database. This dramatically simplifies piping any data, with millisecond response times, from an operational backend (in our case, Cassandra) into Snowflake or Amazon Athena. It also makes it far easier to move data generated by analytical systems into edge datastores to help improve application performance.
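The general idea behind change data capture can be sketched in a few lines of Python. This is a toy model only, and it is not the CDC for Astra DB API: the `ChangeEvent` record, the `CapturedTable` class, the sink functions, and the package identifiers are hypothetical names invented for the illustration. Every write to the operational table emits a change record, and any number of downstream consumers apply those records to their own copy.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """Hypothetical change record: which key changed, and how."""
    op: str  # "insert", "update", or "delete"
    key: str
    value: Optional[Any] = None


class CapturedTable:
    """Toy key-value table that emits a ChangeEvent for every write."""

    def __init__(self) -> None:
        self._rows: Dict[str, Any] = {}
        self._sinks: List[Callable[[ChangeEvent], None]] = []

    def add_sink(self, sink: Callable[[ChangeEvent], None]) -> None:
        self._sinks.append(sink)

    def upsert(self, key: str, value: Any) -> None:
        op = "update" if key in self._rows else "insert"
        self._rows[key] = value
        self._emit(ChangeEvent(op, key, value))

    def delete(self, key: str) -> None:
        if key in self._rows:
            del self._rows[key]
            self._emit(ChangeEvent("delete", key))

    def _emit(self, event: ChangeEvent) -> None:
        for sink in self._sinks:
            sink(event)


# A downstream analytics copy kept current by applying change events.
analytics_copy: Dict[str, Any] = {}


def apply_change(event: ChangeEvent) -> None:
    if event.op == "delete":
        analytics_copy.pop(event.key, None)
    else:
        analytics_copy[event.key] = event.value


packages = CapturedTable()
change_feed: List[ChangeEvent] = []
packages.add_sink(change_feed.append)  # raw feed, e.g. for auditing
packages.add_sink(apply_change)        # materialized downstream copy

packages.upsert("pkg-1", "scanned")
packages.upsert("pkg-1", "delivered")
packages.upsert("pkg-2", "lost")
packages.delete("pkg-2")
print(analytics_copy)
```

The application only ever writes to the operational table; the analytics copy stays in step because it subscribes to the changes, not because the application was written with analytics in mind.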
In essence, an application developer doesn’t have to worry that the database they’ve chosen to power real-time user interactions is going to impede the types of analytics that are necessary to drive the business forward.
Meeting new expectations
Real-time analytics is just one example of the kind of powerful tools an enterprise has at its fingertips when it builds an architecture that can take full advantage of the data generated by business events. An event-based, real-time data architecture is precisely how businesses today create the experiences that consumers expect.
About Ed Anuff:
Ed is chief product officer at DataStax. He has over 25 years of experience as a product and technology leader at companies such as Google, Apigee, Six Apart, Vignette, Epicentric, and Wired.