How to make your new cloud-first strategy work with full-stack observability

Aug 09, 2022
Cloud Native

By Antoine Le Tard, Global Vice President for APJC at Cisco AppDynamics

Over the past decade, an ever-growing number of organisations have taken their infrastructure and applications to the cloud, delivering noticeable results impacting the bottom line and several other business metrics. This is why today, a cloud-first strategy is rightly recognised even by many non-IT corporate leaders as the catalyst for rapid digital transformation and a key enabler for businesses to respond to constantly evolving customer and employee needs.

By rethinking their approach to applications, in either cloud-only or hybrid environments, organisations can introduce greater flexibility and freedom to their application development processes, unleashing innovation at greater scale and speed. However, as anybody who has worked in an IT department over the past year or two knows, managing availability and performance across cloud-native applications and technology stacks is a huge challenge.

Traditional approaches to availability and performance were often based on long-lived physical or virtualised infrastructures. Ten years ago, IT departments operated a fixed number of servers and network cables, and they worked with static dashboards for each layer of the IT stack. The introduction of cloud computing has added a new level of complexity, and organisations now find themselves continually scaling their use of IT resources up and down based on real-time business needs. Monitoring solutions have adapted to accommodate deployments of cloud-based applications alongside traditional on-premises environments. The reality, however, is that most of these solutions are not passing the stress test, as they were not designed to efficiently handle the dynamic and highly volatile cloud-native environments that we increasingly see today.

These highly distributed cloud and hybrid systems rely on thousands of containers and generate a massive volume of metrics, logs and traces (MLT) telemetry every second. Currently, most IT departments don’t have a monitoring solution that can cut through this crippling volume of data and noise when troubleshooting application availability and performance problems caused by infrastructure-related issues that span cloud and hybrid environments.

Cloud-native observability solutions are necessary

In response to this spiralling complexity, IT departments need visibility at the application level, down into the supporting digital services (such as Kubernetes), and into the underlying infrastructure-as-a-service (IaaS) resources (such as compute, server, database, and network) that they’re leveraging from all their cloud providers. They also need visibility into the user and business impact of each resource to prioritise their actions. This is essential for IT teams to truly understand how their applications are performing and where they need to focus their time.
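
As a rough illustration of how visibility across layers can feed prioritisation, the sketch below walks a dependency map from an alerting infrastructure resource up to the applications it supports, then ranks alerts by a business weight. Everything in it is assumed for the example: the component names, the `DEPENDS_ON` map, the `BUSINESS_IMPACT` figures and the helper functions are hypothetical and do not describe any particular product.

```python
from collections import defaultdict

# Hypothetical dependency map: each component lists what it depends on.
DEPENDS_ON = {
    "checkout-app": ["payments-service", "cart-service"],
    "search-app": ["catalog-service"],
    "payments-service": ["k8s-node-1", "orders-db"],
    "cart-service": ["k8s-node-2"],
    "catalog-service": ["k8s-node-2"],
}

# Hypothetical business weight per application (for example, revenue per minute).
BUSINESS_IMPACT = {"checkout-app": 900.0, "search-app": 150.0}


def affected_apps(resource, depends_on, apps):
    """Return the applications that depend, directly or transitively, on resource."""
    # Invert the map so we can walk upwards from infrastructure to applications.
    dependents = defaultdict(set)
    for component, deps in depends_on.items():
        for dep in deps:
            dependents[dep].add(component)

    seen, stack = set(), [resource]
    while stack:
        node = stack.pop()
        for parent in dependents[node]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    # Keep only the components that are known applications.
    return sorted(seen & set(apps))


def rank_alerts(alerting_resources, depends_on, business_impact):
    """Order alerting resources by the summed business weight of the apps they affect."""
    scored = []
    for resource in alerting_resources:
        apps = affected_apps(resource, depends_on, business_impact)
        score = sum(business_impact[app] for app in apps)
        scored.append((resource, score, apps))
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for resource, score, apps in rank_alerts(
        ["orders-db", "k8s-node-2"], DEPENDS_ON, BUSINESS_IMPACT
    ):
        print(resource, score, apps)
```

In this toy model, an alert on a shared node outranks an alert on a single database because more business-critical applications sit above it. A real environment would derive the dependency map from telemetry rather than a hand-written dictionary.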

Technologists are increasingly recognising the need to gain full-stack insights and to map relationships and dependencies across siloed domains and teams. This explains why, according to an AppDynamics report, The Journey to Observability, more than half of global businesses (54%) have now started the transition to full-stack observability, and a further 36% plan to do so during 2022.

IT teams need new cloud-native observability solutions to manage the complexity of cloud-native applications and IT environments. They require a way to get visibility into applications and underlying infrastructure for large, managed Kubernetes environments running on one or several public clouds. 

From a technology perspective, there are several key criteria that IT leaders and their teams should consider when evaluating cloud-native observability solutions to ensure they are future-proofed. They should seek out a solution that can observe distributed and dynamic cloud-native applications at scale; that embraces open standards, particularly OpenTelemetry; and that leverages AIOps and business intelligence to speed up identification and resolution of issues and enable technologists to prioritise actions based on business outcomes.

Organisations must have a new cloud-native mindset

Besides choosing the best cloud observability solution for the enterprise overall, IT managers must also make sure their solution delivers value to the emerging cloud specialists in their team, such as site reliability engineers (SREs), DevOps and CloudOps. Not only do these technologists have new and highly specialised skill sets, but they also have very different needs, priorities, mindsets, and ways of working.

Traditionally, ITOps teams have always been focused on minimising the risks brought about by change. Their mission has been to maximise up-time and unify technology choices, and they tend to take a rigid, centralised approach to digital transformation. 

But when it comes to SREs, DevOps or CloudOps teams, it’s a very different story. These new teams value agility over control and focus on giving each team the freedom to choose the best approach. They accept that there will always be massive complexity with cloud-native applications, but they see that giving up some level of control gives them speed and innovation. They can find peace in the chaos by adopting new solutions that allow them to cut through complexity and data noise and pinpoint what matters.

Similarly, when considering digital transformation initiatives, these teams aren’t unnerved by the scale and complexity involved in these programmes. They don’t feel held back by legacy technology or scarred by previous attempts to innovate. They embrace change rather than resisting it and see transformation as an exciting and welcome part of business as usual.

These new cloud-native technologists are unwilling to accept vendor lock-in; they believe they can deliver the most value within dynamic technology ecosystems, with all teams having the freedom to select and work with best-in-class solutions for each project.

Finally, cloud-native technologists (be they SREs, DevOps or CloudOps) will evolve to have a very business-focused mindset. They will increasingly strive to view IT performance and availability through a business lens and to understand how their actions and decisions can have the most significant impact on the business. 

The important thing for business leaders is to recognise the new mindsets and drivers of their cloud-native teams and empower these technologists with the culture, support, and solutions they need to deliver value. That means developing a strategy that enables these teams to operate in entirely new ways, while also ensuring their existing teams can continue the vital work of monitoring large parts of their IT infrastructure.

IT leaders should consider these cultural factors when selecting a cloud-native observability solution to ensure their SREs, DevOps and CloudOps teams have a solution that offers them the scalability, flexibility and business metrics they need to perform to their full potential. 

By taking a holistic approach, considering both the technical and cultural needs of their IT teams, organisations can empower their technologists to cut through the complexity of cloud-native environments and deliver on the promise of this exciting new approach to application development.