In this two-part series, I explore the two phases of digital transformation that many organizations are undergoing. In part one, I dig into what organizations have done in the first phase of transformation and why they must think differently as they embark on the second phase. In part two, I describe how organizations should approach the second phase of transformation in order to successfully transform their data and analytics estates – with Spark as the foundation of those changes.

Conquering the last frontier of digital transformation

As I talk to my clients in organizations of every size and industry, I sense a generational shift in both their technology and business strategy in the area of advanced analytics. I define advanced analytics as the exploitation of an organization's data assets through sophisticated data science tools and techniques performed by data scientists. Digging further, we can see that this isn't traditional business intelligence and reporting using legacy and modern reporting tools (such as QlikView, Tableau, and Power BI). No, this sort of analytics is often ad hoc, using bespoke combinations of tools, libraries, and analytical techniques against many data types and sources.

Many organizations are using advanced analytics now because they have completed the first few phases of their digital transformation projects and are moving on to the last frontier – tackling the data and analytics systems and processes to fully transform.

What exactly does that mean, though? It means that the analog-to-digital transformations are complete. It also means that the traditional IT environments have been transformed to be more efficient and services-driven, and applications are now using cloud technologies and operating models. That leaves the data and analytics components, where value can still be extracted and exploited.
The question we must ask is, "How do we bring those learnings from application modernization, tooling from the DevOps processes, and operating models from the cloud to the data and analytics estate?" The answer lies in lessons learned from existing application modernization efforts.

Lessons learned from the first phase of transformation

Application modernization
Application modernization includes new software development methodologies, tools, and processes coupled with a change in organizational structures and processes to be software driven. New programming languages have emerged to make writing, testing, and deploying code more accessible to software teams. That has allowed the lines of business within organizations to better understand software development and align more closely with it; this enables better integration with traditional IT, letting them become technology-driven business units. Those changes didn't happen overnight – but when completed, I have seen improvements that are orders of magnitude more efficient and impactful than previous technology deployments.

DevOps processes
Writing better code using public cloud tooling is only part of what has made the recent digital transformations effective. DevOps has accelerated these transformations and has been instrumental in breaking down the barriers between application development and IT operations. With that problem solved, organizations were able to truly start using IT as a force multiplier for their digital transformation. Organizations that have a "DevOps mentality" are poised for success in the next phase of their transformation.

Operating models
The public cloud has transformed the operating models of many organizations in many ways.
From the way IT departments extend their own capabilities through hybrid-cloud initiatives to the way application developers use cloud-native services and functions – organizations have continued to increase their business velocity by embracing cloud principles and operating models. OpEx vs. CapEx, self-service, on-demand provisioning, elastic scaling, micro-charging, and bespoke provisioning of resources are all game-changing practices that have transformed the way organizations treat technology.

Data and analytics require a different way of thinking

The second phase of digital transformation for most organizations will be focused on data and analytics. Best-of-breed organizations will apply the best practices from their application modernization transformations to this phase. Data and analytics is different enough, however, that it requires slightly different thinking, tooling, and approaches – while keeping those patterns in mind – to be truly successful. Let's look at why the data and analytics space is different so that we can understand what must be done differently than what was done for application modernization.

In the data and analytics space, software development generally falls into two categories: data engineering and data analytics/data science. Until recently, these developers worked on tooling and systems that are 10+ years old, using languages and environments that are sometimes even older. That is because these systems are part of critical business reporting and intelligence functions that are slow to change because the business doesn't need them to change. Therefore, these systems are treated with a light touch and are changed only with the utmost care.
Writing software within the organization against these systems is almost always done using highly controlled development and operational processes that are slow to change, with few iterations.

Looking past those traditional reporting systems, big data systems have evolved to integrate with more modern software development and languages, but the deployment of code and applications against them has still been rigid because these systems are often deployed as monoliths (i.e., the individual components of the system are tightly coupled and have to be updated and deployed at the same time). That means DevOps tooling and processes are incompatible with these systems, so they are unable to benefit from agile techniques and continuous integration and delivery tooling.

In most organizations I speak to, these data and analytics systems range anywhere from hundreds of terabytes to exabyte scale. And in these systems, the data is usually tightly coupled to the applications, forming enormous monoliths that make cloud deployments impractical at best and, in most cases, impossible – ruled out by cost concerns, network latency issues, and legal/regulatory policies that forbid those deployments. Being unable to deploy these systems into the public cloud means that the benefits mentioned above are out of reach, and therefore we can't just apply those principles and operating models without rethinking our approach.

Apply patterns learned – along with new tools, techniques, and processes

In reading this article, my hope is that you've gained insight into the challenges that organizations face as they embark on the next phase of digital transformation. As you have read, there are lessons we have learned from the recent application modernization transformations.
Although we can't take all of the tooling, principles, and processes and apply them to the data and analytics estate, we can take those patterns, apply new software development tools, techniques, and processes, and then apply cloud principles and operating models to drive that transformation.

In the next article, I will go into the best practices for the digital transformation of data and analytics systems and organizations and provide advice on how to accelerate that journey. The focus will be on how open-source Spark is the common technical component enabling and accelerating the data and analytics digital transformation.

____________________________________

About Matt Maccaux

As Global Field CTO for HPE Ezmeral software, Matt brings deep subject-matter expertise in big data analytics and data science, machine learning, application development and modernization, and IoT, as well as cloud, virtualization, and containerization technologies.