How to Use an NVIDIA GPU with Docker Containers
Docker containers don't see your system's GPU automatically, so GPU-dependent workloads such as machine learning frameworks fall back to the CPU and run far more slowly. Here's how to expose your host's NVIDIA GPU to your containers.
Making GPUs Work In Docker
Docker containers share your host's kernel but bring along their own operating system and software packages. This means they lack the NVIDIA drivers used to interface with your GPU. Docker doesn't even add GPUs to containers by default, so a plain docker run won't see your hardware at all.
At a high level, getting your GPU to work is a two-step procedure: install the drivers within your image, then instruct Docker to add GPU devices to your containers at runtime.
This guide focuses on modern versions of CUDA and Docker. The latest release of NVIDIA Container Toolkit is designed for combinations of CUDA 10 and Docker Engine 19.03 and later. Older builds of CUDA, Docker, and the NVIDIA drivers may require additional steps.
Adding the NVIDIA Drivers
Make sure you've got the NVIDIA drivers working properly on your host before you continue with your Docker configuration. You should be able to successfully run nvidia-smi and see your GPU's name, driver version, and CUDA version.
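If you want a quick scriptable check, nvidia-smi's query flags can emit just those fields instead of the full table; this is an optional convenience, not a required step:
nvidia-smi --query-gpu=name,driver_version --format=csv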
To use your GPU with Docker, begin by adding the NVIDIA Container Toolkit to your host. This integrates into Docker Engine to automatically configure your containers for GPU support.
Add the toolkit's package repository to your system using the example command:
distribution=$(. /etc/os-release; echo $ID$VERSION_ID) \
   && curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
   && curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
Next, install the nvidia-docker2 package on your host:
sudo apt-get update
sudo apt-get install -y nvidia-docker2
Restart the Docker daemon to complete the installation:
sudo systemctl restart docker
The Container Toolkit should now be operational. You're ready to start a test container.
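As an optional sanity check before launching anything, you can confirm that Docker has registered the NVIDIA runtime; the exact wording of the output varies by Docker version:
docker info | grep -i runtime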
Starting a Container With GPU Access
As Docker doesn't provide your system's GPUs by default, you need to create containers with the --gpus flag for your hardware to show up. You can either specify particular devices to enable or use the all keyword.
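The flag accepts a few different forms. These examples follow Docker's documented syntax; <image> is a placeholder for your image name:
docker run --gpus all <image>              # expose every GPU
docker run --gpus 2 <image>                # expose two GPUs
docker run --gpus '"device=0,1"' <image>   # expose specific devices by index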
The nvidia/cuda images are preconfigured with the CUDA binaries and GPU tools. Start a container and run the nvidia-smi command to check that your GPU is accessible. The output should match what you saw when using nvidia-smi on your host. The CUDA version may differ depending on the toolkit versions on your host and in your selected container image.
docker run -it --gpus all nvidia/cuda:11.4.0-base-ubuntu20.04 nvidia-smi
Selecting a Base Image
Using one of the nvidia/cuda tags is the quickest and easiest way to get your GPU workload running in Docker. Many different variants are available; they provide a matrix of operating system, CUDA version, and NVIDIA software options. The images are built for multiple architectures.
Each tag has this format:
11.4.0-base-ubuntu20.04
11.4.0 - CUDA version.
base - Image flavor.
ubuntu20.04 - Operating system version.
Three different image flavors are available. The base image is a minimal option with the essential CUDA runtime binaries. runtime is a more fully-featured option that includes the CUDA math libraries and NCCL for cross-GPU communication. The third variant is devel, which gives you everything from runtime as well as headers and development tools for creating custom CUDA images.
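If you want to compare the flavors locally, you can pull each variant of the same release. The tags below simply follow the pattern described above, assuming they are still published on Docker Hub:
docker pull nvidia/cuda:11.4.0-base-ubuntu20.04
docker pull nvidia/cuda:11.4.0-runtime-ubuntu20.04
docker pull nvidia/cuda:11.4.0-devel-ubuntu20.04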
If one of the images will work for you, aim to use it as your base in your Dockerfile. You can then use regular Dockerfile instructions to install your programming languages, copy in your source code, and configure your application. This removes the complexity of manual GPU setup steps.
FROM nvidia/cuda:11.4.0-base-ubuntu20.04
RUN apt-get update
RUN apt-get install -y python3 python3-pip
RUN pip3 install tensorflow-gpu
COPY tensor-code.py .
ENTRYPOINT ["python3", "tensor-code.py"]
Building and running this image with the --gpus flag would start your TensorFlow workload with GPU acceleration.
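As a concrete sketch, the commands would look like this, where tensor-app is an arbitrary image name chosen for this example:
docker build -t tensor-app .
docker run -it --gpus all tensor-app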
Manually Configuring an Image
You can manually add CUDA support to your image if you need to choose a different base. The best way to achieve this is to reference the official NVIDIA Dockerfiles.
Copy the instructions used to add the CUDA package repository, install the library, and link it into your path. We're not reproducing all the steps in this guide as they vary by CUDA version and operating system.
Pay attention to the environment variables at the end of the Dockerfile - these define how containers using your image integrate with the NVIDIA Container Runtime:
ENV NVIDIA_VISIBLE_DEVICES all
ENV NVIDIA_DRIVER_CAPABILITIES compute,utility
Your image should detect your GPU once CUDA is installed and the environment variables have been set. This gives you more control over the contents of your image but leaves you responsible for adjusting the instructions as new CUDA versions are released.
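As an illustrative outline only, a manually configured Dockerfile might be shaped like this. The package installation steps are deliberately elided because, as noted above, they vary by CUDA version and operating system:
FROM ubuntu:20.04
# Add the CUDA package repository and install the CUDA libraries here,
# following the matching official NVIDIA Dockerfile for your version and OS.
ENV NVIDIA_VISIBLE_DEVICES all
ENV NVIDIA_DRIVER_CAPABILITIES compute,utility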
How Does It Work?
The NVIDIA Container Toolkit is a collection of packages which wrap container runtimes like Docker with an interface to the NVIDIA driver on the host. The libnvidia-container library is responsible for providing an API and CLI that automatically provides your system's GPUs to containers via the runtime wrapper.
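You can inspect this layer directly: libnvidia-container ships the nvidia-container-cli binary, and its info subcommand reports the driver version and the devices it can see (output varies by version):
nvidia-container-cli info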
The nvidia-container-toolkit component implements a container runtime prestart hook. This means it's notified when a new container is about to start. It looks at the GPUs you want to attach and invokes libnvidia-container to handle container creation.
The hook is enabled by nvidia-container-runtime. This wraps your "real" container runtime such as containerd or runc to ensure the NVIDIA prestart hook is run. Your existing runtime continues the container start process after the hook has executed. When the Container Toolkit is installed, you'll see the NVIDIA runtime selected in your Docker daemon config file.
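On most Linux systems this file is /etc/docker/daemon.json. After installing nvidia-docker2, it should contain an entry along these lines:
{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}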
Summary
Using an NVIDIA GPU inside a Docker container requires you to add the NVIDIA Container Toolkit to the host. This integrates the NVIDIA drivers with your container runtime.
Calling docker run with the --gpus flag makes your hardware visible to the container. This must be set on each container you launch, after the Container Toolkit has been installed.
NVIDIA provides preconfigured CUDA Docker images that you can use as a quick starter for your application. If you need something more specific, refer to the official Dockerfiles to assemble your own that's still compatible with the Container Toolkit.