Meeting the Challenges of Deep Learning Workloads

In a technical validation study, Enterprise Strategy Group found that Dell EMC’s new Deep Learning with Intel solution shortens deployment times, accelerates performance and improves TCO.

Dell EMC

As artificial intelligence comes of age, organizations are investing capital and human resources — in a big way. A report by the Enterprise Strategy Group (ESG) found that 59 percent of responding organizations expect to significantly increase their spending on AI and machine learning in 2019.1

Findings like these suggest it’s now “full speed ahead” for AI in the enterprise. And to put the AI pedal to the metal, organizations need to deploy tested and optimized infrastructure stacks that are built for the scale and challenges of machine and deep learning. This is the idea behind the growing portfolio of Dell EMC Ready Solutions for AI and the new Deep Learning with Intel solution.

Deep Learning with Intel delivers an optimized solution stack that includes all the hardware, software and services organizations need to get machine learning and deep learning solutions up and running quickly. This scale-out cluster is based on a single Dell EMC PowerEdge™ R740xd master/login node and 16 Dell EMC PowerEdge C6420 dense compute servers. Under the hood, the solution is powered by the Intel® Xeon® Gold 6148 processor, a member of the Intel Xeon Scalable family based on the “Skylake” microarchitecture.

Other performance-driven features in the Deep Learning with Intel solution include Dell EMC Isilon storage, high-speed networking and open-source Nauta software, a data science workbench designed for distributed deep learning using Kubernetes. Nauta provides a multi-user, distributed computing environment for running deep learning model-training workloads on Intel Xeon Scalable processor-based systems.

Technical validation

So, how does this new solution stand up to the challenges of deep learning workloads? The Enterprise Strategy Group (ESG) answers this question in a recently released technical validation report commissioned by Dell EMC and Intel.

In its technical validation study, ESG focused on understanding the performance, ease of use and total cost of ownership (TCO) of the Deep Learning with Intel solution. To validate full-stack performance, the firm measured the number of tokens per second processed when training the Tensor2Tensor Big Transformer model, and it evaluated how Nauta accelerates deep learning model training. ESG also examined how Nauta simplifies the deep learning training process, and how the TCO of the Deep Learning with Intel solution compares to running the same tasks on a leading public cloud AI service.

The results

ESG found that the Deep Learning with Intel solution is tuned and optimized for AI initiatives, and that it shortens deployment times, simplifies the data scientists’ workflow and workload, accelerates performance, and improves TCO.  

Here are some specific findings from the technical validation, excerpted from the ESG report.2

  • ESG validated that using Nauta improves deep learning training performance on the Deep Learning with Intel solution. Containerized training workloads ran up to 18 percent faster than the same workloads on a bare-metal system. Further, the solution scales near-linearly, reaching 80 percent of theoretical maximum performance when the number of compute nodes is scaled from one to 16.
  • The Nauta platform’s orchestration and automation systems simplified model development, significantly reducing the number of steps in the workflow. Nauta’s automation enabled unattended hyperparameter tuning, taking a tedious manual task off data scientists’ hands so they can focus on core data science work.
  • Deep Learning with Intel proved to be more cost effective than running the same workload in the public cloud. Over three years, a 16-compute-node solution was 24 percent cheaper than a leading public cloud AI service for deep learning training. That gap held even though the comparison favored the cloud on usage: the three-year TCO for Deep Learning with Intel covered 24 hours per day of compute availability and 100 TB of storage capacity, compared to only 12 hours per day of compute consumption and 10 TB of storage consumption for the public cloud service.
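
To make the scaling finding concrete, here is a minimal sketch of how a figure like “80 percent of theoretical maximum” is typically computed: the measured cluster throughput is divided by the single-node throughput multiplied by the node count. The function name and the tokens-per-second values below are hypothetical, chosen only to illustrate the arithmetic; they are not measurements from the ESG report.

```python
def scaling_efficiency(single_node_rate: float, cluster_rate: float, nodes: int) -> float:
    """Return the fraction of ideal linear scaling a cluster achieves.

    Ideal (theoretical maximum) throughput is single_node_rate * nodes.
    """
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    if single_node_rate <= 0:
        raise ValueError("single_node_rate must be positive")
    ideal_rate = single_node_rate * nodes
    return cluster_rate / ideal_rate


# Hypothetical throughput values (tokens per second), for illustration only.
one_node_tokens_per_sec = 1_000.0
sixteen_node_tokens_per_sec = 12_800.0

efficiency = scaling_efficiency(one_node_tokens_per_sec, sixteen_node_tokens_per_sec, 16)
speedup = sixteen_node_tokens_per_sec / one_node_tokens_per_sec

print(f"Speedup over one node: {speedup:.1f}x")
print(f"Scaling efficiency: {efficiency:.0%}")
```

Under these assumed numbers, a 16-node cluster that runs 12.8 times faster than a single node is operating at 80 percent of the ideal 16x speedup.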

The bottom line

To capitalize on the massive amounts of data they collect every day, organizations need cost-effective machine learning and deep learning solutions that provide performance and scalability for complex AI models while simplifying startup and deployment. The ESG technical validation confirms that the Deep Learning with Intel solution from the Dell EMC Ready Solutions for AI portfolio delivers these capabilities at a lower three-year TCO than the public cloud service it was compared against.

Learn more

To learn more about the study, download the ESG Technical Validation Report. And to learn more about the Deep Learning with Intel solution, explore Dell EMC Ready Solutions for AI.

_____________________

1 ESG Research Report, “2019 Technology Spending Intentions Survey,” February 2019. 

2 ESG Technical Validation, “Dell EMC Ready Solutions for AI: Deep Learning with Intel,” April 2019.

Copyright © 2019 IDG Communications, Inc.