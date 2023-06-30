Luke Roquet recently spoke to a customer who recounted the shock of getting a $700,000 bill for a single data science workload running in the cloud. When Roquet, who is senior vice president of product marketing at Cloudera, related the story to another customer, he learned that that company had received a $400,000 tab for a similar job just the week before.

Such stories should dispel the common myth that cloud computing is always about saving money. In fact, “most executives I’ve talked to say that moving an equivalent workload from on-premises to the cloud results in about a 30% cost increase,” said Roquet.

This doesn’t mean the cloud is a poor option for data analytics projects. In many scenarios, the scalability and variety of tooling options make the cloud an ideal target environment. But the choice of where to locate data-related workloads should take multiple factors into account, of which only one is cost.

Data analytics workloads can be especially unpredictable because of the large data volumes involved and the extensive time required to train machine learning (ML) models. These models often “have unique characteristics that can cause their costs to explode,” Roquet said.

What’s more, local applications often need to be refactored or rebuilt for a specific cloud platform, said David Dichmann, senior director of product management at Cloudera. “There’s no guarantee that the workload is going to be improved and you can end up being locked into one cloud or another,” he said.

Cloud march is on

That doesn’t seem to be slowing the ongoing cloudward migration of workloads.
Foundry’s 2022 Data & Analytics study found that 62% of IT leaders expect the share of analytics workloads they run in the cloud to increase.

Although cloud platforms offer many advantages, cost- and performance-sensitive workloads “are often better run on-prem,” Roquet said.

Choosing the right environment is about achieving balance. The cloud excels for applications that are ephemeral, need to be shared with others, or use cloud-native constructs like software containers and infrastructure-as-code, he said. Conversely, applications that are performance- or latency-sensitive are more appropriate for local infrastructure, where data can be co-located and long processing times don’t incur additional costs.

The goal should be to optimize workloads to interact with each other regardless of location and to move as needed between local and cloud environments.

The case for portability

Dichmann said three core components are needed to achieve this interoperability and portability:

“Once you have one view of all your data and one way to govern and secure it then you can move workloads around without worrying about breaking any governance and security requirements,” he said. “People know where the data is, how to find it, and we’re all assured it will be used correctly per business policy or regulation.”

Portability may be at odds with customers’ desire to deploy best-of-breed cloud services, but Dichmann said “fit-for-purpose” is a better goal than best-of-breed. That means putting flexibility ahead of bells and whistles.
This gives the organization maximum flexibility for deciding where to deploy workloads.

A healthy ecosystem is just as important as robust point solutions because a common platform enables customers to take advantage of other services without extensive integration work.

The best option for achieving workload portability is to use an abstraction layer that runs across all major cloud and on-premises platforms. The Cloudera Data Platform, for example, “is a true hybrid solution that provides the same services both in the cloud and on-prem,” Dichmann said. “It uses open standards that give you the ability to have data share a common format everywhere it needs to be, and accessed by a broader ecosystem of data services that makes things even more flexible, more accessible and more portable.”

Visit Cloudera to learn more.