Over the past five years, the use of artificial intelligence (AI) has exploded, giving rise to several new conferences on the topic. However, the premier event has become Nvidia’s GPU Technology Conference (GTC), the latest of which was held in San Jose, California, last week. Attendance was a whopping 8,500, about 2,000 more than GTC17. (Note: Nvidia is a client of ZK Research.)
While the show is hosted by Nvidia, it has become more of an industry show where some of the biggest challenges in scaling AI are being discussed and solved.
As has become customary at GTC, Nvidia made a number of announcements. Here are the ones that I felt were most notable and why CIOs should care.
One of the challenges with GPU computing is pushing enough data into the GPUs to keep them busy, particularly for multi-GPU systems. Typically, at high data loads, the PCI bus is too slow, so Nvidia developed a proprietary interconnect to bridge two GPUS together, called NVLink, that bypasses the PCI bus. At GTC18, Nvidia debuted a new innovation called NVSwitch that enables 16 GPUs to be tied together via a high-speed fabric.
During his keynote, CEO Jensen Huang showed off one of the 16 GPU systems and joked that it was the world’s largest GPU because it effectively creates one massive GPU. Network professionals will be familiar with the architecture, for this kind of crossbar fabric is used as the interconnect for most high-speed switches today — hence the name NVSwtich.
Why CIOs should care: The volume of data that businesses have to analyze will only go up. In fact, the use of general adversarial networks (GANs) enables machines to generate their own synthetic data. NVSwitch helps organizations interconnect more GPUs together so data scientists can work on larger, more complex datasets.
Double the memory for Tesla V100
About a year ago, Nvidia launched the Tesla V100 GPU, which it came with 16GB of HBM2 memory. AT GTC18, the company announced it was doubling the total memory to 32 GB immediately. The upgrade means Nvidia is now at memory parity with its main competitor, AMD. The performance of the V100 has always blown the doors off the comparable AMD product; however, the total memory was one place it lagged — and now it has closed that gap.
The memory upgrade means that the 16GB versions will be discontinued, so the DGX-1 server and DGX station will ship with the new processors. Also, all the server manufacturers, including Cray, Hewlett Packard Enterprise, IBM, Lenovo, Supermicro, and Tyan will begin shipping their products with the 32GB version of Tesla V100 in the second half of 2018.
Why CIOs should care: The amount of memory is the only thing changing, so data won’t be processed faster. However, workloads that use datasets that could not fit into 16GB of memory will see a significant boost, as will other data center applications that require more local memory.
The DGX-1 Server and DGX Station are used broadly by data scientists in a number of different verticals. For example, the MGH and BWH Center for Clinical Data Science group uses DGX Station as part of its research. Both of these products are high performance, with the DGX-1 reaching 1 petaflop of performance.
But for some use cases, that might not be enough horsepower. For those, Nvidia showed off the DGX-2, the first server to use the NVSwitch innovation to bring 16 GPUs of power to a single, unified memory space. This new server is about 2x the physical size of the DGX-1, but Nvidia claims it has about 10x the processing power.
During his keynote, Huang showed the time taken to train the FAIRSeq neural machine translation model with DGX-2 and DGX-1, and it revealed the model completed in two days with DGX-2 versus about three weeks with DGX-1. The price point of the DGX-2 will be $399,000, which seems very reasonable given this single box would replace racks of servers that use traditional CPUs.
Why CIOs should care: For those at the high end of the deep learning market, DGX-2 will do things exponentially faster than DGX-1. Given how competitive the world is, completing things in days versus weeks means those expensive data scientists will spend more time analyzing and less time waiting for tasks to finish.
Kubernetes containers orchestration on Nvidia GPUs
This announcement might have the broadest industry impact on AI and deep learning, as Kubernetes support significantly simplifies the deployment of AI models in hybrid cloud environments. The idea is that cloud-based containers can be used for burst capacity or for resiliency.
The best way to understand it is by explaining the demonstration the company did. It started by showing images of flowers being analyzed by a CPU and then a GPU — with the GPU obviously being faster. Then it applied the images to eight GPUs, and the rate of analysis picked up. The demo then disconnected from a cluster because of load, which caused the Kubernetes controller to dynamically spin up four new nodes in the Amazon Web Services (AWS) cloud and automatically load balance across them — all in real time — and the images were being processed at an accelerated rate.
Kubernetes is now GPU-aware, and Docker containers are CPU-accelerated. One final note, this feature works on Kubernetes on premises or in a managed container service such as AWS EKS.
Why CIOs should care: The value of being able to leverage hybrid clouds for things like AI can’t be understated. Containers are the ideal environment for a distributed machine and deep learning-based workloads and model training, as they can be spun up on demand. IT leaders can work with data science teams and create highly agile environments without breaking the bank.
From an enterprise IT perspective, these were the announcements I thought were worth calling out. But for people with a deeper interest in AI, there were other cool announcements, such as the Isaac robot simulator being made available publicly, RTX ray tracing technology, and Project Clara, a dedicated supercomputer for the healthcare industry — but these are certainly more specialized.
Final note: Huang is one of the most passionate and inspirational people I have ever met with respect to how AI will change the way we work and the way we live. Given the recent accidents with Uber and Tesla, some skeptics wonder whether AI is good for us. The fact is, machines can connect the dots hidden in data better than people can. And the more data we have, the more we need machines to do that.
The key is to think about how machines can be used to improve our lives. For example, Huang was very clear that there’s a tremendous amount of value with self-driving cars, but initially cars should assist the driver and make driving safer. For example, the car can study the driver and know when he or she is falling asleep or if the person has had their eyes off the road too long and make a noise to alert the driver or stop operations altogether.
The world is changing fast, and CIOs need to be prepared for this and embrace and accept change by considering what’s possible.