3 key enablers to modernizing your data-centric application estate

BrandPost By Matthew Hausmann
Jun 14, 2021
Containers, IT Leadership

Speed and agility continue to drive how companies differentiate. In today’s highly competitive environment, rapid application development and deployment are essential to helping businesses react and pivot. The ability to build apps quickly allows businesses to zig and zag with ease to respond to customer needs and capture revenue opportunities from major market changes – whether that’s a global pandemic, new disruptive competitors, or supply chain interruption.

Containers to the rescue

If you’ve been asked (or forced) to develop and deploy applications faster or to modernize your application estate, you’ve probably already heard of containers or Docker or Kubernetes orchestration. Containers have taken the world by storm and are key to enabling application modernization for several reasons. They’re portable from edge to cloud, they add agility to DevOps processes, and they bring simplicity and speed to application development and deployment. 

The good news is that if you’re just getting around to modernizing your data and application estate, you’re not too late. ‘Application modernization’ was once reserved for cloud-native and microservices-based apps, but now there’s a revolution underway to take those same techniques and apply them to your entire stateful, data-centric estate. Data scientists and IT admins have caught on to the trend: containers help them keep up with their rapidly changing toolkits, and it often makes sense to develop in one location and deploy in another. In a recent study, 451 Research found that 94% [1] of AI workloads delivered by 2022 will be deployed via self-service containers for these data- and analytics-centric workloads.

Today’s enterprise application modernization is multi-cloud for data analytics

451 Research estimates that nearly two-thirds [2] of today’s application estate still needs to be modernized. In general, that means the easy ‘lift and shift’ work has already been done – so now it’s time to tackle the difficult clustered applications, data-centric applications, and analytic-heavy solutions.

Another trend I’m seeing is that single-location deployments, be it on-premises or in a particular public cloud, limit application agility. The new reality is a multi-cloud-first architecture, and 451 Research findings show that workloads still to be modernized will be split 46% on-premises and 54% [3] in the cloud. This means you need to architect for multi-cloud.

ISVs and open source are key to modernizing the remaining 63% of workloads

So what happens when the easy ‘lift and shift’ work is done and the piecemeal parts have been modernized? Unfortunately, unless enterprises want to completely re-architect their apps on their own, the majority of the remaining 63% of workloads are dependent on external modernization efforts. They require innovation by Independent Software Vendors (ISVs) and the open-source community, the two other key enablers of app modernization, to move them from monolithic to loosely coupled architectures optimized for Kubernetes deployments.

This trend is in full swing. I’ve witnessed ISVs like Dataiku, H2O.ai, and others spending the last few years optimizing solutions for the public cloud. They’ve recently realized that a big chunk of their customers’ workloads are actually still on-premises, so they’ve changed their focus – bringing this innovation to solution providers who can deliver on-premises performance and security and provide the bridge for consistent multi-cloud deployments.

Even big names like Splunk and SAS have taken this path with their monolithic software to take advantage of modern, agile, and efficient K8s-based deployments. Splunk stands out to me because I was part of a major validation and benchmark with their beta Splunk operator for Kubernetes last year. We worked with a major US bank that couldn’t physically scale its infrastructure to keep up with data growth rates. Running the traditional deployment of a single indexer or search head per server resulted in extremely low CPU utilization, requiring massive overdeployment of infrastructure to keep pace with data growth. It took weeks to deploy new workloads, and as the bank’s data centers approached maximum capacity, indexing fell behind – creating security blind spots. As a result, the bank was forced to find ways to optimize its delivery and consumption of Splunk.

Leveraging Splunk’s new operator for K8s, we were able to modernize the delivery of Splunk using containers to take full advantage of the optimized infrastructure to drive utilization and throughput. In minutes, we were able to independently scale from 1 to 6, and up to 12, indexers per host, delivering an astounding 17X indexer throughput improvement (8.7 TB/day per host) and driving CPU utilization up to 70%.
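To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It only uses the two figures quoted above (17X and 8.7 TB/day per host). It assumes the 17X gain is measured against the per-host throughput of the traditional deployment, and the 100 TB/day ingest volume is a hypothetical value chosen purely for illustration, not a number from the benchmark.

```python
import math

# Figures quoted in the article.
MODERNIZED_TB_PER_DAY_PER_HOST = 8.7
THROUGHPUT_GAIN = 17.0

# Hypothetical daily ingest volume, used only for illustration.
EXAMPLE_DAILY_INGEST_TB = 100.0


def implied_baseline(modernized: float, gain: float) -> float:
    """Per-host throughput before modernization, assuming the gain
    is relative to that same per-host baseline."""
    return modernized / gain


def hosts_needed(daily_ingest_tb: float, tb_per_day_per_host: float) -> int:
    """Minimum number of hosts whose combined indexing throughput
    covers the daily ingest volume (indexing only, nothing else)."""
    return math.ceil(daily_ingest_tb / tb_per_day_per_host)


baseline_tb_per_day = implied_baseline(
    MODERNIZED_TB_PER_DAY_PER_HOST, THROUGHPUT_GAIN
)
hosts_before = hosts_needed(EXAMPLE_DAILY_INGEST_TB, baseline_tb_per_day)
hosts_after = hosts_needed(EXAMPLE_DAILY_INGEST_TB, MODERNIZED_TB_PER_DAY_PER_HOST)

print(f"Implied baseline: {baseline_tb_per_day:.2f} TB/day per host")
print(f"Hosts for {EXAMPLE_DAILY_INGEST_TB:.0f} TB/day: "
      f"{hosts_before} before, {hosts_after} after")
```

This counts indexing throughput only. Real sizing also depends on search load, storage, and redundancy, so treat the host counts as an illustration of the scaling effect rather than a sizing guide.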

I skipped over a lot of the details that required tight collaboration between HPE, Intel, Scality, and Splunk – but the point is that the customer needed the ISV to do the work to modernize the application. This in turn allowed the bank to fundamentally change the way the solution is deployed. As a result, we eliminated the security data blind spot with up to 17X higher data ingestion and indexing per host, flipped the TCO model by shrinking the infrastructure footprint by up to 10X, quickly addressed new use cases by deploying new indexers and search heads in minutes, and balanced hot cache and S3 object storage for exabyte scale. And we even delivered the solution as a service!

As open-source software has become a key driver for modernizing existing workloads, enterprises have really warmed to open-source components over the past decade – especially in the data science and data engineering communities. While most of the more recent tools are developed as cloud native out of the gate, many of the established and most widely deployed tools are still in the process of modernizing.

Apache Spark is the most obvious example of this. Begun in 2009, it is now a key component of most open data analytics platforms for ETL, data science, and data engineering. As companies move away from Hadoop, Spark deployments have evolved from Spark on YARN to Spark on Kubernetes.
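For readers who want to see what that shift looks like in practice, here is a minimal Python sketch that builds a `spark-submit` command line for either cluster manager. It uses Spark's native Kubernetes support through `spark-submit`, not a Kubernetes operator. The helper function, the API server URL, the container image name, and the jar paths are all hypothetical placeholders. The `--master`, `--deploy-mode`, `--class`, and `--conf` options and the two configuration keys are standard Spark settings.

```python
from typing import List, Optional


def spark_submit_args(
    app_jar: str,
    main_class: str,
    executors: int,
    k8s_api_server: Optional[str] = None,
    container_image: Optional[str] = None,
) -> List[str]:
    """Build a spark-submit argument list (hypothetical helper).

    With no k8s_api_server the job targets YARN; otherwise it targets
    Kubernetes, which also needs a container image for the driver
    and executors.
    """
    args = [
        "spark-submit",
        "--deploy-mode", "cluster",
        "--class", main_class,
        "--conf", f"spark.executor.instances={executors}",
    ]
    if k8s_api_server is None:
        args += ["--master", "yarn"]
    else:
        if container_image is None:
            raise ValueError("Spark on Kubernetes requires a container image")
        args += [
            "--master", f"k8s://{k8s_api_server}",
            "--conf", f"spark.kubernetes.container.image={container_image}",
        ]
    args.append(app_jar)
    return args


# Placeholder values for illustration only.
yarn_cmd = spark_submit_args(
    app_jar="/opt/jobs/spark-examples.jar",
    main_class="org.apache.spark.examples.SparkPi",
    executors=4,
)
k8s_cmd = spark_submit_args(
    app_jar="local:///opt/spark/examples/jars/spark-examples.jar",
    main_class="org.apache.spark.examples.SparkPi",
    executors=4,
    k8s_api_server="https://k8s.example.com:6443",
    container_image="registry.example.com/spark:3.1.1",
)

print(" ".join(yarn_cmd))
print(" ".join(k8s_cmd))
```

The application code stays the same in both cases. What changes is the cluster manager and the fact that on Kubernetes the runtime ships as a container image, which is what makes the workload portable between on-premises and cloud clusters.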

Fast forward to 2021 and Spark’s modernization efforts have paid off. The new Spark 3.x operator is now ready for prime time, delivering native GPU acceleration capabilities, S3 integration, and significant resiliency improvements. This type of open-source modernization will go a long way toward modernizing the 63% of workloads that remain – and we’re already seeing container platforms and cloud offerings integrate the new Spark operator and provide the surrounding components to make it enterprise grade.

To learn more

With organizations still needing to modernize two-thirds of their current application estate, assembling the right key enablers such as containers, ISVs, and open-source software is essential and now within reach. Want to explore more about how to modernize your stateful, data-centric applications? Here are a few options to allow you to go deeper on this concept.  

Read the Pathfinder research paper from 451 Research, Application Modernization and the Age of Insight, and attend my session at HPE’s Discover 2021 on how Containers Are Driving Digital Transformation.  If you prefer pictures, you’ll enjoy this eBook, Fuel Edge-to-Cloud Digital Transformations, for real stories on how companies are successfully collaborating with HPE Ezmeral, ISV partners, and open source to modernize their data-centric applications.   

[1], [2], [3] Pathfinder paper by 451 Research, Application Modernization and the Age of Insight, published June 2, 2021


About Matthew Hausmann

Matt’s passion is figuring out how to leverage data, analytics, and technology to deliver transformative solutions that improve business outcomes. Over the past decades, he has worked for innovative start-ups and information technology giants with roles spanning business analytics consulting, product marketing, and application engineering. Matt has been privileged to collaborate with hundreds of companies and experts on ways to constantly improve how we turn data into insights.