For many years now, IT organizations have been virtualizing servers to gain greater value from their hardware investments. With tools from VMware, Microsoft and other vendors, IT managers could abstract the resources from a single host server and make them available to multiple users via virtual machines. This move to virtualized servers greatly increased asset utilization while making it easier to provision, deploy, manage and protect end-user systems.
Server virtualization was clearly a big leap forward in our approach to the hardware layer of the data center stack. Yet for all its benefits, hardware virtualization didn’t solve the problem of how to simplify the provisioning, distribution and management of the software environments that run on top of the virtualized hardware layer.
Today, IT organizations are increasingly addressing this other piece of the puzzle through the use of software containers. This approach to the software layer brings a distinct set of benefits, which I will explore in this post. But first, let’s begin with a quick primer on containers.
How containers work
A container is essentially a package that bundles up a software application and all of the components needed to deploy and run the software on compatible systems. The container includes the user-space files of its operating system and all of the application’s software dependencies, while sharing the kernel of the host operating system.
Containers make it possible to duplicate a software environment across many different virtual or physical machines. You don’t have to worry about replicating the software environment on the receiving end. You just run the container on top of the host operating system.
To make this story more tangible, let’s consider the case of an application that runs on the Linux operating system. The container provides the essential pieces of software needed to reproduce the user environment of the supported flavor of Linux. This includes the set of files necessary for the executable and its dependent libraries for that particular flavor, whether it is Red Hat, CentOS, Ubuntu, Clear Linux or whatever it happens to be. In effect, you specify what the container’s base operating system is going to appear to be.
On top of that abstracted operating system inside the container, you can install your own software packages and dependencies. You don’t have to install them in the host OS, since they are installed in the container. When you load the container, all of those dependent software packages are there.
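As a rough illustration of this layering, the Python sketch below renders the text of a Dockerfile: a base image that decides which Linux flavor the container appears to be, followed by the packages installed on top of it. The helper name `render_dockerfile`, the `ubuntu:22.04` base image and the package names are assumptions chosen for the example, and the sketch assumes a Debian or Ubuntu style base image where `apt-get` is available.

```python
import json
from typing import Sequence


def render_dockerfile(base_image: str,
                      os_packages: Sequence[str],
                      pip_packages: Sequence[str],
                      command: Sequence[str]) -> str:
    """Return Dockerfile text for an image built on top of `base_image`.

    Hypothetical helper for illustration only.
    """
    # FROM selects the base OS files the container will appear to run on.
    lines = [f"FROM {base_image}"]
    if os_packages:
        # OS-level packages are installed inside the image, not on the host.
        lines.append(
            "RUN apt-get update && apt-get install -y " + " ".join(os_packages)
        )
    if pip_packages:
        # Python dependencies also live in the image and travel with it.
        lines.append("RUN pip3 install " + " ".join(pip_packages))
    # Exec-form CMD is a JSON array holding the program and its arguments.
    lines.append("CMD " + json.dumps(list(command)))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_dockerfile(
        base_image="ubuntu:22.04",
        os_packages=["python3", "python3-pip"],
        pip_packages=["numpy"],
        command=["python3", "--version"],
    ))
```

Swapping the base image string is all it takes to make the same application appear to run on a different Linux flavor, although the package installation line would need to match that flavor’s package manager.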
Containers in deep learning applications
Containers are an ideal way to distribute the tools for deep learning. If you are doing deep learning training, you can bundle up the chosen deep learning framework (TensorFlow, PyTorch, Cognitive Toolkit or whatever you’re using) and all the dependent libraries and packages for the framework. You can bundle everything up, all the way down to the linear algebra routines that do the matrix multiplication, the image processing libraries, the text processing libraries and anything else you need to get your job done. All of those dependent parts go inside the container so they travel with the application.
The deep learning container can then be easily deployed on other machines, across multiple people, multiple instances and multiple sites. There is no need for tinkering on the far end, with users trying to figure out how to get the application to run. All they have to do is run the container.
At Dell EMC, we are using containers to enable the new Deep Learning with Intel solution in the portfolio of Dell EMC Ready Solutions for AI. This solution is powered by Nauta, an open-source deep learning platform initiated by Intel. Nauta provides a platform for running deep learning models with Kubernetes orchestration of Docker containers.
Nauta includes a prebuilt container with all the parts needed to do basic neural network training tasks. It contains all of TensorFlow’s dependent libraries and the Intel® Math Kernel Library (Intel® MKL), optimized for matrix-vector and matrix-matrix multiplication, plus the new Intel® MKL-DNN library, which is fine-tuned and tailored specifically for deep neural network operations. The container also includes the Horovod distributed training framework, so that data scientists can take full advantage of the solution’s multiple compute nodes. Nauta puts all of this together into one container, packaged with a solution that allows you to do highly optimized neural network training on Intel® Xeon® Scalable processors right out of the box.
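To give a feel for what Kubernetes orchestration of a training container can look like, here is a minimal Python sketch that builds a Kubernetes `batch/v1` Job manifest for a single container. This is not Nauta’s own mechanism or interface. The function name, the job name, the image reference and the training command are all hypothetical placeholders.

```python
import json
from typing import Any, Dict, List


def training_job_manifest(name: str, image: str,
                          command: List[str]) -> Dict[str, Any]:
    """Build a minimal Kubernetes batch/v1 Job that runs one container.

    Hypothetical helper for illustration; not part of Nauta.
    """
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            "template": {
                "spec": {
                    "containers": [
                        {"name": name, "image": image, "command": command}
                    ],
                    # Pods created by a Job must use Never or OnFailure.
                    "restartPolicy": "Never",
                }
            }
        },
    }


if __name__ == "__main__":
    manifest = training_job_manifest(
        name="tf-train",
        # Placeholder image reference, assumed for this example.
        image="registry.example.com/dl/tensorflow-train:1.0",
        command=["python", "train.py"],
    )
    print(json.dumps(manifest, indent=2))
```

Because `kubectl` accepts JSON as well as YAML, the printed manifest could be saved to a file and submitted with `kubectl apply -f`, assuming a cluster that can pull the named image.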
This brings us to the benefits of containerization of software applications and their dependent parts for DevOps teams. These benefits revolve around scalability, deployment speed, performance and management.
As for scalability and deployment speed, containers allow DevOps teams to bundle up all of their software in one image, or one container, and distribute that image to many users. Nobody has to spend a lot of time ramping up and making sure they understand the environment. You just give them the container and they are ready to go. You can even put an executable in the container and then run the software as a service, if that makes sense for the deployment.
This approach can greatly accelerate the onboarding process for new users. Instead of taking two or three days to ramp up, they might get going in an hour. All they need to do is pull the container from either a public or a private enterprise container registry.
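As a sketch of what that onboarding step amounts to, the Python snippet below composes an image reference for a registry and the corresponding `docker pull` and `docker run` command lines. The registry host, repository name and tag are made-up placeholders, and the helper names are hypothetical. The snippet only builds and prints the commands rather than executing them.

```python
import shlex
from typing import List, Sequence


def image_reference(registry: str, repository: str, tag: str = "latest") -> str:
    """Compose a reference of the form registry/repository:tag."""
    return f"{registry}/{repository}:{tag}"


def pull_command(reference: str) -> List[str]:
    """Argument list that downloads the image from its registry."""
    return ["docker", "pull", reference]


def run_command(reference: str, args: Sequence[str] = ()) -> List[str]:
    """Argument list that starts a container and removes it on exit."""
    return ["docker", "run", "--rm", reference, *args]


if __name__ == "__main__":
    # Placeholder registry and repository, assumed for this example.
    ref = image_reference("registry.example.com", "dl/tensorflow-train", "1.0")
    print(shlex.join(pull_command(ref)))
    print(shlex.join(run_command(ref, ["python", "train.py"])))
```

The same argument lists could be handed to `subprocess.run` on a machine where Docker is installed and the registry is reachable.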
Then there’s the performance side of the story. Virtual machines abstract the hardware and therefore incur a performance penalty. Containers, by contrast, run as processes on the host operating system’s kernel, so they are still running on the physical infrastructure. In fact, benchmark testing has shown that there is no performance penalty for the use of containers when compared to running the same software on bare-metal servers.
If we take a step back and look at the bigger picture, containerization also has important management benefits. The IT administrator no longer has to stand up a server and spend hours working with a data scientist, a software developer or a DevOps engineer to make sure the software environment is right to run the application. The IT administrator only has to set up a server capable of running a container, and the container brings all of the parts necessary to run the application. This shifts the management burden from the IT support staff to the container creator.
Containerization brings compelling benefits to the process of developing and distributing software and its dependent environments. These benefits are propelling containerization to the forefront of DevOps work. Containers will be the predominant way that software is deployed and managed in the future. There’s no better way than containers to easily move your software from the edge to the core to the cloud.
To learn more about unlocking the value of data with artificial intelligence systems, explore Dell EMC AI Solutions.
Lucas Wilson, Ph.D., is an artificial intelligence researcher and lead data scientist in the HPC and AI Innovation Lab at Dell EMC.