Using GPUs with Virtual Machines on vSphere – Part 1: Overview

This is part 1 of a series of blog articles that give technical details of the different options available to you for setting up GPUs for compute workloads on vSphere.

Part 1 of this series presents an overview of the various options for using GPUs on vSphere

Part 2 describes the DirectPath I/O (Passthrough) mechanism for GPUs

Part 3 gives details on setting up the NVIDIA Virtual GPU (vGPU) technology for GPUs

Part 4 explores the setup for the Bitfusion FlexDirect method of using GPUs

Your company’s data scientists, machine learning practitioners or developers have asked you to provide them with a GPU-capable machine on which to do their work. They want to run workloads that need GPU compute power, and they describe those workloads as machine learning “training”, “inference” or “development”. We will explain what they mean by these terms later in this series. This opening article gives an overview of the options available to you for providing the required infrastructure on VMware vSphere.

The reason your end-users need GPU capability is simply faster time to results. Machine learning models involve very large matrix multiplications and GPUs are designed to compute these operations much faster than CPUs.
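To make the scale of these matrix multiplications concrete, the following minimal Python sketch counts the approximate floating-point operations in a matrix product and shows a naive implementation. The function names and the matrix sizes are illustrative assumptions, not taken from any particular machine learning framework.

```python
def matmul_flops(m, k, n):
    """Approximate floating-point operations for an (m x k) by (k x n) product.

    Each of the m*n output elements needs k multiplies and about k adds.
    """
    return 2 * m * k * n


def naive_matmul(a, b):
    """Multiply two matrices given as lists of rows (illustration only)."""
    k = len(b)
    n = len(b[0])
    # Every output element is an independent dot product, which is why
    # the work can be spread across many parallel cores.
    return [[sum(row[p] * b[p][j] for p in range(k)) for j in range(n)]
            for row in a]


if __name__ == "__main__":
    print(naive_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
    # Hypothetical square matrix sizes, chosen only to show the growth.
    for size in (256, 1024, 4096):
        print(size, matmul_flops(size, size, size))
```

The operation count grows with the cube of the matrix dimension, and each output element can be computed independently of the others. That combination of heavy, highly parallel arithmetic is the kind of work GPUs are built for.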

Your company is probably already using virtual machines on vSphere for developers, testers and data management staff, but a key question in your mind now is:

Can GPUs be used in vSphere for applications other than VDI?

The short answer is a resounding ‘yes’. We call this use case “GPU Compute” in vSphere. In its simplest form, VMware vSphere allows your end users to consume GPUs in VMs in the same way they do in any GPU-enabled public cloud instance or on bare metal. In addition, through collaboration with our technology partners, vSphere supports multiple flexible GPU consumption and utilization models that can increase the return on investment (ROI) in this infrastructure, while providing your end users with exactly what they need.

This article will help you navigate the process of fulfilling that original end-user request. You’ll understand what to ask the end users and your hardware and software vendors. It presents the alternatives for your consideration, because different implementations suit different usage scenarios.

What about performance?

Generally, a GPU within a vSphere virtual machine can deliver near bare-metal performance, though the exact performance depends on the technology used. We will touch upon the performance characteristics of each technology in the subsequent parts of this series. For an initial glance at some performance numbers, see this post from VMware’s performance engineering team.

Different Methods of GPU Usage with Virtual Machines

One of your earliest decisions as a system administrator is exactly how the GPUs will be used in your environment. As mentioned, there are different ways of consuming GPUs through virtual machines. The approach you choose will largely depend on the types of users and applications that will make use of the GPUs. The options are shown in Table 1.

Table 1: GPU configurations and their respective use cases

The types of technology that apply to these three different situations are shown in the lower part of Figure 1.

Figure 1: A decision tree for different GPU use cases on vSphere

As you can see, some use cases are enabled by VMware partners’ products such as NVIDIA Virtual GPU technology, also called “NVIDIA vGPU”. This family was formerly named “NVIDIA GRID”. NVIDIA vGPU is a family of software products that includes NVIDIA Virtual Compute Server (vCS) as well as others, such as vDWS.

Each technology comes with its own pros and cons and offers a different level of flexibility and end-user experience, while building on inherent vSphere capabilities. VMware is committed to continuing its work with OEMs and with hardware and software vendors in the hardware acceleration ecosystem. The goal is to allow customers to extract maximum value from their modern infrastructure while easing its management and consumption.

In the following parts of this series, we will detail the steps required and the technologies available to dedicate one full GPU to a vSphere VM and to share a GPU across multiple VMs. Part 2 of this series on using DirectPath I/O for GPUs is here. Part 3 of the series on installing the NVIDIA vGPU products for GPUs on vSphere is here.