Docker with nVidia's CUDA Support
Months ago … I set up Docker with nVidia support on a pokey old computer with a PNY Quadro K620 GPU.
(These notes are somewhat old - sorry)
Host Setup
I used Ubuntu Server 18.04.1 for amd64 as the host OS. It was a fresh installation so this should be possible for anyone to follow. Everything else was downloaded as part of these instructions.
Software Setup
Pre-Install
The system requires some setup to build support for the functionality discussed. Since this was a totally “fresh” system, I merely had to install GCC and the Linux headers:
$ sudo apt-get install gcc
$ sudo apt-get install linux-headers-$(uname -r)
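Not part of the original steps, but a quick sanity check at this point can save a failed driver build later. This is a minimal sketch, assuming the stock Ubuntu layout where the headers package lands in /usr/src/linux-headers-<release>; the helper names have_cmd and headers_dir are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helpers, not from the CUDA instructions.

# have_cmd NAME: succeed if NAME is an executable on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# headers_dir RELEASE: print where Ubuntu normally puts the headers
# for that kernel release (assumption: the stock package layout).
headers_dir() {
    printf '/usr/src/linux-headers-%s\n' "$1"
}

if have_cmd gcc; then
    echo "gcc: found"
else
    echo "gcc: missing"
fi

dir=$(headers_dir "$(uname -r)")
if [ -d "$dir" ]; then
    echo "headers: $dir"
else
    echo "headers: missing ($dir)"
fi
```

If either line says "missing", the two apt-get commands above are the place to look.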
CUDA Install
My notes state that I used “version 10” of the CUDA SDK … but … the link I saved refers to 10.1u1?
Regardless, I set up the .deb file as I was instructed:
$ sudo dpkg -i cuda-repo-ubuntu1804_10.0.130-1_amd64.deb
$ sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
$ sudo apt-get update
$ sudo apt-get install cuda
This now needs some environment variables set up.
But how do we do that?
Linux seems to change what the right way is for this three times a decade.
According to this, editing the .profile is in vogue, so I added these lines to the end of the file.
export PATH=/usr/local/cuda-10.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
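The ${PATH:+:${PATH}} bit looks odd, so here is a small sketch of what it does. The ${var:+text} expansion yields text only when var is set and non-empty, so the colon and the old value are appended only when there is an old value. That matters for LD_LIBRARY_PATH, which is often unset: a plain "dir:$LD_LIBRARY_PATH" would leave a trailing colon, and an empty entry is treated as the current directory. The function name prepend_dir is hypothetical.

```shell
#!/bin/sh
# prepend_dir DIR CURRENT: print DIR, followed by ":CURRENT" only when
# CURRENT is non-empty. Same ${var:+...} expansion as the exports above.
prepend_dir() {
    dir=$1
    current=$2
    printf '%s\n' "${dir}${current:+:${current}}"
}

prepend_dir /usr/local/cuda-10.0/bin "/usr/bin:/bin"
# prints /usr/local/cuda-10.0/bin:/usr/bin:/bin

prepend_dir /usr/local/cuda-10.0/lib64 ""
# prints /usr/local/cuda-10.0/lib64 (no trailing colon)
```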
There were some instructions relating to POWER9, but I was unable to get anything to happen with them.
After a reboot, nvidia-smi showed me some kerbunk that made me believe that the GPU was/is operating correctly.
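For something easier to read (or script against) than the full table, nvidia-smi has a query mode. A hedged sketch: it assumes the CSV output separates fields with a comma and a space, as in "Quadro K620, 410.79", and the helper name driver_of is made up here.

```shell
#!/bin/sh
# driver_of LINE: given one "name, driver_version" CSV line,
# print just the driver version (everything after the last ", ").
driver_of() {
    printf '%s\n' "${1##*, }"
}

if command -v nvidia-smi >/dev/null 2>&1; then
    # One line per GPU, e.g. "Quadro K620, 410.79"
    nvidia-smi --query-gpu=name,driver_version --format=csv,noheader |
    while IFS= read -r line; do
        echo "driver $(driver_of "$line") on ${line%%, *}"
    done
else
    echo "nvidia-smi not found on PATH"
fi
```

If this prints nothing useful, the full nvidia-smi table is still the better diagnostic.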
Docker
For $reasons you need Docker’s .deb package.
I found some setup instructions here which led to the following steps.
$ sudo apt update
$ sudo apt install apt-transport-https ca-certificates curl software-properties-common
$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
$ sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu bionic stable"
$ sudo apt update
$ apt-cache policy docker-ce
$ sudo apt install docker-ce
$ sudo systemctl status docker
$ sudo usermod -aG docker ${USER}
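One gotcha with the last command: group changes made by usermod only apply to new login sessions, so the current shell will still need sudo for docker until you log out and back in. A small sketch for checking where you stand; in_group is a hypothetical helper, and id -nG lists the groups of the current process.

```shell
#!/bin/sh
# in_group GROUP LIST: succeed if GROUP appears as a whole word
# in the space-separated LIST.
in_group() {
    case " $2 " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

if in_group docker "$(id -nG)"; then
    echo "this shell is in the docker group"
else
    echo "not in the docker group yet; log out and back in (or try: newgrp docker)"
fi
```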
nVidia-Docker
Nothing complex left; I followed the instructions:
$ curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
$ curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
$ sudo apt-get update
$ sudo apt-get install -y nvidia-docker2
$ sudo pkill -SIGHUP dockerd
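The distribution= line is the only non-obvious one: it sources /etc/os-release in a subshell and glues ID and VERSION_ID together, which picks the right repository list for your release. Here is a sketch of the same trick run against a sample file rather than the real one; distribution_of is a made-up name, and the sample contains only the two fields that matter.

```shell
#!/bin/sh
# distribution_of FILE: source an os-release style file in a subshell
# (so its variables do not leak) and print ID followed by VERSION_ID.
distribution_of() {
    ( . "$1"; echo "$ID$VERSION_ID" )
}

# Sample with the two relevant fields as Ubuntu 18.04 ships them.
sample=$(mktemp)
printf 'ID=ubuntu\nVERSION_ID="18.04"\n' > "$sample"

distribution_of "$sample"
# prints ubuntu18.04

rm -f "$sample"
```

On the 18.04 host described here, that makes the curl command fetch .../nvidia-docker/ubuntu18.04/nvidia-docker.list.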
Verification
I ran the suggested command; it looks like it runs nvidia-smi inside a throwaway container based on the nvidia/cuda:9.0-base image.
$ docker run --runtime=nvidia --rm nvidia/cuda:9.0-base nvidia-smi
It dumps the wall of text below.
psxpl3@glaze:~$ docker run --runtime=nvidia --rm nvidia/cuda:9.0-base nvidia-smi
Unable to find image 'nvidia/cuda:9.0-base' locally
9.0-base: Pulling from nvidia/cuda
7b722c1070cd: Pull complete
5fbf74db61f1: Pull complete
ed41cb72e5c9: Pull complete
7ea47a67709e: Pull complete
35400734fa04: Pull complete
195acf8a5739: Pull complete
127028f911f6: Pull complete
Digest: sha256:157d05b8a9f3a26dce71c9e824d3fab769d77326f471d0143a236c37d278450d
Status: Downloaded newer image for nvidia/cuda:9.0-base
Thu Feb 14 16:34:44 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.79       Driver Version: 410.79       CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro K620         Off  | 00000000:03:00.0 Off |                  N/A |
| 35%   44C    P0     2W /  30W |      0MiB /  2001MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
… I still think we should be using some sort of OpenCL/GLES3/Vulkan rather than CUDA … #fiteme