Key Considerations for AI and HPC in the Cloud or On-Premises

A new report by Moor Insights & Strategy offers expert insights into the pros and cons of cloud vs. on-premises deployments for artificial intelligence and high performance computing.

Dell EMC

As organizations push forward with artificial intelligence and high performance computing initiatives, a strategic question often arises: Could we get there faster and more affordably, and get more of what we need via the cloud, rather than building our own on-premises environment?

This debate is the topic of a new white paper by Moor Insights & Strategy, titled “AI and HPC: Cloud or On-Premises Hosting.” In this paper, commissioned by Dell EMC, the research firm explores some of the common considerations that help organizations determine whether to build and host their AI applications in a public cloud or build on-premises HPC infrastructure to support their AI needs. For most organizations, this isn’t a straightforward question.

“While the industry trend is clearly to move new applications to the cloud, AI and HPC workloads have performance, data requirements, and utilization characteristics that could lead one to go in the opposite direction,” Moor Insights & Strategy notes.

To help organizations explore this all-important question, the research firm walks through common factors that determine where AI and HPC workloads should be hosted. Here’s a look at some of these considerations.

  • Startups vs. enterprises — Startups and enterprises have different initial considerations. Most startups begin the AI journey using cloud-hosted services. Many then outgrow this stage and reach the point where renting infrastructure is no longer more affordable than owning it. Enterprises, in turn, tend to start the AI journey in the cloud and then move to their own hardware once they have production models and know they can keep their servers and processors busy.
  • Setup costs — Setup costs are typically lower for cloud-based AI and HPC development. However, Moor Insights & Strategy notes that Dell EMC is addressing setup issues with new fully configured Ready Solutions for AI. Other steps forward include new solutions that reduce setup costs for AI frameworks and HPC application stacks. Given advances like these, “the common perception that cloud offers an easier path to get started is perhaps overstated,” the firm says.
  • Operating and capital costs — The question of the relative costs of cloud vs. on-premises systems isn’t as simple as it might seem. Moor Insights & Strategy notes that in the early stages of a project, which are dominated by experimentation, operating costs for training and inference work in the cloud are roughly comparable to on-premises hosting. However, as time goes on, “more intensive usage in the cloud drives up costs significantly as an AI project begins to scale.”
  • Data and application locality — This consideration revolves around the concept of “data gravity.” As Dell EMC’s Sai Kumar Devulapalli explains in a recent blog post, data gravity refers to the natural attraction of data and applications. In simple terms, as datasets grow larger, they become harder to move. So, the data stays put, and the applications and processing power come to it. Moor Insights & Strategy points out that, in light of data gravity issues, it usually makes more sense to keep the application co-resident with the datasets that the application uses. “If the data is already in the cloud, then the application should likely be located there as well,” the firm says. “This is especially true for HPC and AI applications, since the training datasets typically span terabytes.”
  • Security and privacy — While Moor Insights & Strategy believes cloud service providers (CSPs) have done a reasonably good job of ensuring privacy and security for their users’ data, it notes that some organizations may still be uncomfortable with the thought of sending highly confidential data outside their firewalls and facilities. Moreover, companies in highly regulated industries like healthcare and financial services may be better off with secure on-premises IT.
  • Performance requirements — Moor Insights & Strategy notes that nearly all public cloud providers now offer accelerators in their machine learning as a service (MLaaS) infrastructure. With this processing power under the hood, the compute bandwidth in the cloud is now roughly equivalent to what is possible with on-premises infrastructure. However, there is the latency question that comes with distant cloud data centers. While cloud latency for inferencing is typically low, Moor Insights & Strategy says tests have shown that latency can degrade performance significantly when cloud services are accessed across large distances. “Latency generally increases by about one millisecond for every 60 miles, which can be a substantial hit for remote locations,” the firm says.
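
The latency rule of thumb quoted in the last item above can be turned into a rough back-of-the-envelope estimate. The Python sketch below simply applies "about one millisecond for every 60 miles" to a few example distances. The function name and the sample distances are illustrative assumptions, not figures from the paper, and the quote does not say whether the figure is one-way or round-trip, so treat the results as an order-of-magnitude guide only.

```python
# Rule of thumb quoted in the article: roughly 1 ms of added latency per 60 miles.
MILES_PER_MS = 60.0


def added_latency_ms(distance_miles: float) -> float:
    """Estimate added latency (in ms) for a given distance to the cloud data center."""
    if distance_miles < 0:
        raise ValueError("distance_miles must be non-negative")
    return distance_miles / MILES_PER_MS


# Hypothetical distances, chosen only for illustration.
for miles in (60, 600, 3000):
    print(f"{miles:>5} miles -> about {added_latency_ms(miles):.0f} ms added latency")
```

Under this rule, a site a few thousand miles from its cloud region would see tens of milliseconds of added latency, which can matter for latency-sensitive inference.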

This is just a partial list of the common considerations covered in the paper, which also highlights issues related to geographic location, the stage of the application lifecycle, bursty workloads and the need for services.

The bottom line

Moor Insights & Strategy suggests that, when it comes to AI and HPC deployments, starting with the cloud may make a lot of sense if an organization wants to experiment with AI and begin building deep neural networks. However, at some point after those initial forays, the shift to on-premises infrastructure may be the logical route forward.

“Many organizations will eventually need significant computing infrastructure for AI and HPC as their applications begin to run at scale,” the firm notes. “This, along with data transfer and throughput fees, begins to tip the cost balance in favor of building on-premises infrastructure as the organization matures in AI.”
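
One way to picture this tipping point is a simple break-even comparison between cumulative cloud spending and the cumulative cost of owned infrastructure. The Python sketch below is a minimal illustration under strong assumptions: a flat monthly cloud bill (compute plus data transfer fees), a one-time on-premises purchase, and a flat monthly on-premises operating cost. The function name and all dollar figures are hypothetical and do not come from the Moor Insights & Strategy paper.

```python
from typing import Optional


def breakeven_month(
    cloud_monthly: float,
    onprem_capex: float,
    onprem_monthly: float,
    horizon_months: int = 60,
) -> Optional[int]:
    """Return the first month in which cumulative cloud cost reaches or exceeds
    cumulative on-premises cost, or None if that never happens within the horizon."""
    for month in range(1, horizon_months + 1):
        cloud_total = cloud_monthly * month
        onprem_total = onprem_capex + onprem_monthly * month
        if cloud_total >= onprem_total:
            return month
    return None


# Hypothetical figures: $20,000/month in the cloud versus a $300,000 purchase
# plus $5,000/month to operate on-premises.
print(breakeven_month(cloud_monthly=20_000, onprem_capex=300_000, onprem_monthly=5_000))
```

A real comparison would also need to account for utilization, hardware refresh cycles, staffing, and usage that grows over time, all of which this sketch leaves out.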

To learn more

For the full story, read the Moor Insights & Strategy paper “AI and HPC: Cloud or On-Premises Hosting.” To learn more about unlocking the value of data with artificial intelligence systems, explore Dell EMC HPC Solutions and Dell EMC AI Solutions.

Copyright © 2019 IDG Communications, Inc.