In 2012, Geoffrey Moore tweeted, "Without big data analytics, companies are blind and deaf, wandering out onto the Web like deer on a freeway."

Fast forward a decade, and a lot happened in the 2010s to deliver that sight and sound. The storage industry brought innovation to solve the petabyte-plus data challenge, the analytics software/toolkit ecosystem rapidly matured, and chip manufacturers delivered accelerated compute to glean insights from the ever-growing troves of data.

But the quest for better insights is never over. In fact, the constantly increasing volume of data is forcing us to take analytics into hyperdrive. To stay competitive in 2021, enterprises must continue to innovate. Below I describe four big data analytics trends I'm seeing, along with some suggested solution features to look for.

Apache Spark will continue to dominate the big data world

The classic data scientist is known as a badass; give her Apache Spark with a Jupyter notebook and get out of her way. Apache Spark, a unified analytics engine for large-scale data processing, is now the Kleenex of big data analytics and data engineering. It's ubiquitous: universities offer classes on it, every Hadoop deployment leverages it, and the new Spark 3 operator brings native GPU capabilities plus S3 integration. Everyone needs to gear up for the Spark tsunami.

However, a fair amount of thrash in this space causes confusion. Major vendors are pushing businesses to shift to the cloud and dump the Hadoop Distributed File System (HDFS) for object storage, and a host of other dedicated solutions are sprouting up to deliver engineered Spark offerings.

The real challenge is figuring out how to easily bridge from Spark on YARN to a next-generation Spark on Kubernetes implementation -- without major disruptions to the existing environment.
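To make that YARN-to-Kubernetes bridge concrete, here is a minimal, hypothetical sketch of submitting a Spark 3 job directly to a Kubernetes cluster with S3-compatible object storage instead of HDFS. The API server address, container image, namespace, and S3 endpoint are all placeholders you would substitute for your own environment; this is an illustration of the configuration surface, not a drop-in command.

```shell
# Hypothetical sketch: running Spark 3 on Kubernetes instead of YARN.
# <k8s-apiserver>, <registry>, and <s3-endpoint> are placeholders.
spark-submit \
  --master k8s://https://<k8s-apiserver>:6443 \
  --deploy-mode cluster \
  --name example-analytics-job \
  --conf spark.executor.instances=4 \
  --conf spark.kubernetes.namespace=analytics \
  --conf spark.kubernetes.container.image=<registry>/spark:3.1.1 \
  --conf spark.hadoop.fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem \
  --conf spark.hadoop.fs.s3a.endpoint=<s3-endpoint> \
  local:///opt/spark/examples/src/main/python/pi.py
```

The point of the sketch: the same application code runs unchanged, while the scheduler (YARN vs. Kubernetes) and storage layer (HDFS vs. S3A object storage) are swapped out entirely through configuration.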
Businesses must also take into account that Spark is just one of many applications they need to support in their analytics pipeline.

What to look for? The goal is a solution that simultaneously improves efficiency, agility, and elasticity while cutting costs and improving data exploitation capabilities. Ideally, this solution will let data scientists tap into existing data stores without having to move to the cloud or re-platform the data. On the application front, businesses will look to avoid vendor lock-in with multi-version, open-source Kubernetes support free of dependencies on Hadoop or YARN.

Stateful application modernization

App modernization is still red hot, and people's minds usually go straight to microservices-based, cloud-native apps. But over the past 18 months, I've seen a radical shift in the open-source, ISV, and even monolithic analytics vendor space (think Splunk, Cloudera, and SAS). Businesses are now choosing to modernize these applications for deployment on container-native infrastructure. These traditionally stateful, data-centric workloads are looking to become more cloud-like by improving the efficiency of at-scale deployments and by gaining the elasticity and agility needed to deploy anywhere -- in minutes.

The challenge is figuring out the right modern home for these stateful applications. Data science and analytics are a team sport, so these applications will need to share data and models while orchestrating hand-offs across the analytics lifecycle.

What to look for? Businesses will quickly need staff who can do more than just spell Kubernetes, but there are 'no-coding' answers to this problem. They should look for a container platform that can support (and ideally is validated with) all these applications and can deliver data at petabyte scale.
Businesses will also need to make sure their solution is based on open-source Kubernetes with proven hybrid-cloud capabilities, so they can quickly move these workloads between on-premises infrastructure and the public cloud.

Solving for app dev and data-intensive workloads

When I go camping, my Swiss Army knife is always on my belt, but as the adage goes, a jack of all trades is a master of none. So I also pack a hammer and a hatchet for when a specialty need arises. I'm noticing the same thing with container offerings. You may have already invested in a technology that is particularly good from the app developer's perspective and are now trying to stretch that tool into new spaces.

The challenge is that we all want to minimize the number of solution providers, so we optimistically believe our vendors when they advocate using their tools for things those tools weren't natively designed to do. Stateful apps are a different beast -- running petabyte-scale analytics is very different from running microservices for web search. The scale of hundreds or thousands of clusters, and/or hosts per cluster, imposes fundamentally different requirements.

What to look for? Use the right tool for the right job. Don't be afraid to run multiple platforms side by side to complement your existing solutions and address your varied use cases for scale, performance, and data gravity. On the data side, validated CSI drivers are a great start, but you may also need a dedicated or integrated high-performance, scale-out data store.

The edge is here, and you need to solve for both data AND security

We've been reading about billions of edge devices and IoT trends for years now, and I'm seeing more solutions that have actually operationalized data analytics from edge to cloud.
In its simplest form, organizations are bridging their data centers with the public cloud; others have brought tens of geographic locations together; and still others are collecting data from millions of streaming devices -- even in orbit. Following this trend, analytics are becoming ever more automated and distributed as they move toward the edge points of data creation. This creates a complex matrix of analytic edges that are themselves composed of interconnected workloads that come and go, interacting with each other across physical and logical boundaries...much like today's web interactions.

Businesses face two inherent challenges in edge analytics. First, how do organizations seamlessly bring together data from the many edges, multiple clouds, and on-premises systems -- while still providing a single, no-silo view of all the data? Second, how do businesses liberate analytics to exploit the data across a secure matrix that has no intrinsic attested identity?

What to look for?

Data: A solution that delivers a common data fabric for all the enterprise's data on a global scale means faster time to value, better governance, and lower cost. Look for data platforms with proven petabyte scale, a hardened enterprise feature set, and proven capabilities (like a global namespace and automatic data tiering) to deliver data from edge to cloud.

Security: A solution that can establish trust in this fluid, interconnected data landscape. Yesterday's strategies for developing trust among workloads, such as perimeter-based secrets management, are just a band-aid that works in the near term but won't scale.
That strategy leaves the business vulnerable to attacks on an application estate that spans beyond the four walls of the data center. Instead, businesses should look for technologies that employ Zero Trust security to fully unlock their analytics over the next decade.

Take analytics to hyperdrive in the 2020s

Data will continue to be nothing without insights. Businesses can't stand still -- they will look to the 2020s as the decade to take their analytics to hyperdrive.

If you're looking to learn more on this topic, please check out HPE's on-demand videos from our popular event, HPE Ezmeral Analytics Unleashed. Numerous insightful videos from the event are now available, including interviews with analysts, live demos, and a discussion with three of our clients about their analytics journeys. They reveal solutions such as a virtual wallet program, a robotic drive for ADAS (advanced driver-assistance systems), and data science as a service.

@geoffreyamoore. Twitter, 12 Aug. 2012, 7:29 p.m., https://twitter.com/geoffreyamoore/status/234839087566163968?s=20

____________________________________

About Matthew Hausmann

Matt's passion is figuring out how to leverage data, analytics, and technology to deliver transformative solutions that improve business outcomes. Over the past decades, he has worked for innovative start-ups and information technology giants in roles spanning business analytics consulting, product marketing, and application engineering. Matt has been privileged to collaborate with hundreds of companies and experts on ways to constantly improve how we turn data into insights.