Cloud Agility and Autonomous Operations: How AIOps Works From Edge to Cloud

BrandPost By Sandeep Singh
Nov 01, 2021
Artificial Intelligence, Cloud Computing, Edge Computing

Transportation systems are significantly safer and less apt to break down than they were just a few years ago, and AI is driving that change. Vehicle technology has evolved from basic controls to advanced features such as adaptive cruise control, collision mitigation braking, and sophisticated navigation that have greatly improved both traffic safety and user experience. Today’s cars are equipped with sensors and cameras that feed data to an on-board processor that leverages the cloud for software updates and feature upgrades, keeping vehicles out of repair shops and safely on the road.

Modern cars are basically mobile computers: edge systems that combine sensors and distributed processing capabilities at the network periphery, where they gather intelligence locally. Relevant data is brought to the cloud for analysis and to train machine learning (ML) models. Refined models are then deployed back to the vehicle to enable actions based on real-time sensor data and embedded AI. This self-training loop, an end-to-end AI pipeline from edge to cloud, will someday enable fully autonomous vehicles.

Over in IT, it’s time for infrastructure vendors to deploy the same strategy and unlock cloud operational agility with fully autonomous operations. As I highlighted in a previous article, edge-to-cloud AI is a critical component of an AI-driven infrastructure. I briefly alluded to an architecture that perfects this same process: collecting data from storage infrastructure devices, using this data to train ML models in the cloud, and deploying the final ML model back to the devices. In essence, it’s an enterprise storage system with embedded AI together with machine learning in the cloud—forming an autonomous AIOps framework that stretches from edge to cloud.

How does AIOps work from edge to cloud?

Traditional ITOps platforms are either confined to storage systems with siloed intelligence or delivered as SaaS portals anchored in the cloud with no context about your infrastructure environment. The siloed systems lack global context, and the SaaS portals are simply monitoring dashboards with no embedded AI to predict disruptions based on real-time telemetry data. Point solutions such as these fail to look at the big picture.

An AIOps framework integrates IT elements and automates operations, providing an AI-driven infrastructure with the agility of the cloud. Let’s map the essential ingredients back to the transportation analogy to clearly demonstrate how such an end-to-end framework might work for infrastructure.

  1. Learning at the edge. Analogous to collecting data from on-board sensors in a car, the enterprise system collects telemetry data from the IT stack: storage, servers, virtualization software, and applications. Most IT vendors provide access to performance logs that can be used for troubleshooting problems after they occur, but this is akin to taking your car to an auto repair shop after the engine fails. If, instead, your system could learn the local workload patterns in your enterprise and detect any anomalous behavior, it could predict possible failures and help you avoid unplanned disruptions rather than sending you off to the “repair shop” after the fact.

  2. Versatility of the cloud. To say that the cloud has enabled use cases never thought possible would be an understatement. Just as the cloud has played a pivotal role in products as sophisticated as self-driving vehicles, it plays an equally critical role in bringing an AI-driven experience to enterprise IT. Besides scale and elasticity, the cloud offers vendors several unique advantages: a single, non-siloed repository for exploratory data analysis, the convenience of training and iterating on different ML models, and the ability to run simulations and conduct multivariate analysis. These are all essential factors in best-in-class AI.

  3. Decision-making at the edge. In recent years, edge devices have evolved from rather simple data collectors to more engaged “decision-making” devices. Edge processing in cars automates changes to vehicle operation (acceleration, turning, braking) based on real-time sensor data. Similarly, based on the global learning built into embedded AI models, enterprise IT systems continuously watch out for impending failure events in real time and ensure minimal-to-no disruption to your applications 24/7. This intelligence and continual decision-making allow systems such as HPE Alletra, for example, to guarantee 100% availability.
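
To make the three steps above concrete, here is a minimal Python sketch of the loop: telemetry is gathered per system, a baseline is learned from the pooled fleet data, and that baseline is then used locally to flag anomalies. Everything in it is an illustrative assumption rather than how any HPE product works: the `Model` class, the function names, the system names, the latency samples, and the choice of a simple z-score test as a stand-in for a real ML model.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Model:
    """Hypothetical 'global model': a latency baseline learned in the cloud."""
    mean_ms: float
    std_ms: float
    threshold: float = 3.0  # z-score above which a sample counts as anomalous


def train_in_cloud(fleet_telemetry: dict[str, list[float]]) -> Model:
    """Step 2: pool latency samples from every system and fit one baseline."""
    pooled = [s for samples in fleet_telemetry.values() for s in samples]
    return Model(mean_ms=mean(pooled), std_ms=pstdev(pooled))


def is_anomalous(model: Model, latency_ms: float) -> bool:
    """Step 3: edge-side check applied to each new real-time sample."""
    if model.std_ms == 0:
        return latency_ms != model.mean_ms
    return abs(latency_ms - model.mean_ms) / model.std_ms > model.threshold


# Step 1: telemetry collected at the edge (made-up latency samples, in ms).
fleet = {
    "array-a": [1.0, 1.2, 0.9, 1.1],
    "array-b": [1.3, 1.0, 1.1, 0.8],
}
model = train_in_cloud(fleet)       # learn from the whole fleet
print(is_anomalous(model, 1.05))    # a typical sample
print(is_anomalous(model, 9.0))     # a latency spike
```

The point of the sketch is the division of labor: training sees data from every system, while the check that runs next to the workload needs only the small, already-trained model.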

The business impact of an end-to-end AIOps framework

As with any AI, an AIOps framework of this kind works behind the scenes. But the benefits specifically suited to your environment are quite visible.

  • Improved uptime – Downtime is never good. But the only way a storage system can avoid downtime is through visibility, learning, and applying that knowledge in a way that can anticipate future episodes. By bringing the relevant telemetry data to the cloud and correlating across these datasets, an AIOps platform such as HPE InfoSight can deliver recommendations that are contextual to your environment. In this way, an AIOps framework truly improves uptime for enterprise systems.

  • Business agility – There’s no substitute for agility in the modern enterprise. Every data-driven business needs IT to support its ever-evolving needs. Constantly changing workload patterns and an endless stream of new applications can result in performance bottlenecks and slow response times, unless your data center is equipped to handle whatever is thrown at it. By capturing performance-related telemetry data, an AIOps framework ensures optimal infrastructure performance at all times, which keeps your business in business.

  • Autonomous operations – Arguably, the hardest part of AI is not merely finding the problem but closing the loop and automating the recommended action. In most AI-driven products, this final step is left to humans. Automating the action is the single most important difference between a truly autonomous operation and one that simply incorporates AI. An automated action can only be allowed if there is confidence in the AI and precision in the action itself. Such confidence and precision are outcomes of collecting real-world data and simulating actions in multiple iterations, spread over years. For instance, a storage system with billions of TBs of data gathered over many years can reliably predict infrastructure operations and perform local, real-time actions related to system saturation and performance based on workload profile predictions. It can also adjust the priorities of background tasks to minimize impacts to actual customer workloads.
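
The idea that an action should be automated only when there is enough confidence in the AI can be sketched as a simple gate. This is a hypothetical illustration: the `Recommendation` type, the action name, and the 0.95 threshold are assumptions chosen for the example, not values taken from any product.

```python
from dataclasses import dataclass


@dataclass
class Recommendation:
    action: str        # hypothetical action name, e.g. "throttle_background_tasks"
    confidence: float  # model's confidence in the prediction, from 0.0 to 1.0


def decide(rec: Recommendation, auto_threshold: float = 0.95) -> str:
    """Automate only high-confidence actions; hand the rest to an operator."""
    if not 0.0 <= rec.confidence <= 1.0:
        raise ValueError("confidence must be between 0.0 and 1.0")
    if rec.confidence >= auto_threshold:
        return f"auto-apply: {rec.action}"
    return f"notify operator: {rec.action}"


print(decide(Recommendation("throttle_background_tasks", 0.99)))
print(decide(Recommendation("rebalance_volumes", 0.70)))
```

In a sketch like this, the threshold is where years of real-world data and simulation would pay off: the better calibrated the confidence score, the more actions can safely cross the line into full automation.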

Some AIOps frameworks can automatically identify the best system for deploying an app workload, accelerating app deployment. HPE offers intent-based provisioning, which determines where data should be stored across your entire fleet without the need for storage expertise – with real-time context provided by HPE InfoSight to identify resource headroom and app-specific SLAs.
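
As a rough illustration of choosing where a workload should land, the sketch below filters a fleet by capacity headroom and a latency SLA, then picks the system with the most free capacity. It is not HPE’s intent-based provisioning algorithm: the `StorageSystem` fields, the system names, the numbers, and the “most headroom wins” rule are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StorageSystem:
    name: str
    free_tb: float         # capacity headroom currently available
    p99_latency_ms: float  # latency the system currently delivers


def place_workload(
    systems: list[StorageSystem], needed_tb: float, max_latency_ms: float
) -> Optional[str]:
    """Return the name of the best-fitting system, or None if none qualifies."""
    candidates = [
        s for s in systems
        if s.free_tb >= needed_tb and s.p99_latency_ms <= max_latency_ms
    ]
    if not candidates:
        return None
    # Assumed tie-break rule: prefer the system left with the most headroom.
    return max(candidates, key=lambda s: s.free_tb - needed_tb).name


fleet = [
    StorageSystem("array-a", free_tb=50, p99_latency_ms=0.8),
    StorageSystem("array-b", free_tb=120, p99_latency_ms=1.5),
    StorageSystem("array-c", free_tb=300, p99_latency_ms=4.0),
]
print(place_workload(fleet, needed_tb=40, max_latency_ms=2.0))
print(place_workload(fleet, needed_tb=200, max_latency_ms=2.0))
```

The caller states only what the application needs (capacity and a latency target), and the selection logic works out which system can meet it, which is the sense in which no storage expertise is required.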

The industry’s leading AIOps platform

You wouldn’t compromise on safety in cars. Why should you compromise mission-critical workloads in your data center? An end-to-end AIOps framework is now a necessity for enterprise IT. HPE delivers an industry-leading cloud operational experience with the HPE GreenLake edge-to-cloud platform to unlock business agility for customers. HPE InfoSight, the industry’s leading AIOps platform, powers that experience. With more than a decade spent paving the way to this new era of autonomous operations, HPE InfoSight has saved more than 1.5 million hours of downtime for our customers. That’s a lot of “auto repair” time saved, so you can truly enjoy the ride!


About Sandeep Singh

Sandeep is Vice President of Storage Marketing at HPE. He is a 15-year veteran of the storage industry with first-hand experience in driving innovation in data storage. Sandeep joined HPE from Pure Storage, where he led product marketing from pre-IPO $100M run rate to a public company with greater than $1B in revenue. Prior to Pure, Sandeep led product management & strategy for 3PAR from pre-revenue to greater than $1B in revenue, including a four-year tenure at HP post-3PAR acquisition. Sandeep holds a bachelor’s degree in Computer Engineering from UC San Diego and an MBA from Haas School of Business at UC Berkeley.