Docker with nVidia's CUDA Support
Months ago … I set up Docker with nVidia support on a pokey old computer with a PNY K620 Quadro GPU.
(These notes are somewhat old - sorry)
Host Setup
I used Ubuntu Server 18.04.1 for amd64 as the host OS. It was a fresh installation so this should be possible for anyone to follow. Everything else was downloaded as part of these instructions.
Software Setup
Pre-Install
The system requires some setup to build support for the functionality discussed. Since this was a totally “fresh” system, I merely had to install GCC and the Linux headers;
$ sudo apt-get install gcc
$ sudo apt-get install linux-headers-$(uname -r)
CUDA Install
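Before starting on CUDA, it can be worth confirming that the compiler and kernel headers from the pre-install step really are in place. The sketch below is my own addition, not part of the original notes: `missing_tools` and `headers_present` are hypothetical helper names, and the headers check assumes that Ubuntu's headers package provides `/lib/modules/<release>/build`.

```shell
# missing_tools NAME...: print each NAME that cannot be found on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# headers_present: succeed when the build directory for the running kernel exists.
# Assumption: the linux-headers package provides /lib/modules/<release>/build.
headers_present() {
  [ -d "/lib/modules/$(uname -r)/build" ]
}
```

Running `missing_tools gcc` should print nothing once GCC is installed, and `headers_present || echo "headers missing"` gives a quick yes/no on the headers.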
My notes state that I used “version 10” of the CUDA SDK …but … the link I saved refers to 10.1u1?
Regardless, I set up the .deb file as I was instructed;
$ sudo dpkg -i cuda-repo-ubuntu1804_10.0.130-1_amd64.deb
$ sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
$ sudo apt-get update
$ sudo apt-get install cuda
This now needs some environment variables set up.
But how do we do that?
Linux seems to change what the right way is for this three times a decade.
According to this, editing the .profile is in vogue so I added these lines to the end of the file.
export PATH=/usr/local/cuda-10.0/bin${PATH:+:${PATH}}
export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
There were some instructions relating to POWER9 but I was unable to get anything to happen with them.
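The `${PATH:+:${PATH}}` bit in those export lines looks odd, so here is a small sketch of what it does. `${VAR:+word}` expands to `word` only when `VAR` is set and non-empty, so the colon separator is added only when there is something to separate. That avoids a trailing empty entry, which the shell would treat as the current directory. The function name `prepend_path` is my own, purely for illustration.

```shell
# prepend_path DIR CURRENT: print DIR, followed by ":CURRENT" only when
# CURRENT is non-empty (the same expansion used in the export lines).
prepend_path() {
  dir=$1
  current=$2
  printf '%s\n' "${dir}${current:+:${current}}"
}

prepend_path /usr/local/cuda-10.0/lib64 ""          # no trailing colon
prepend_path /usr/local/cuda-10.0/bin "/usr/bin:/bin"
```

With an empty second argument the result is just the directory; otherwise the directory is placed in front of the existing value with a single colon between them.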
After a reboot, nvidia-smi showed me some kerbunk that made me believe that the GPU was/is operating correctly.
Docker
For $reasons you need Docker’s .deb package.
I found some setup instructions here which led to the following steps.
$ sudo apt update
$ sudo apt install apt-transport-https ca-certificates curl software-properties-common
$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
$ sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu bionic stable"
$ sudo apt update
$ apt-cache policy docker-ce
$ sudo apt install docker-ce
$ sudo systemctl status docker
$ sudo usermod -aG docker ${USER}
nVidia-Docker
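One thing worth checking before continuing: the `usermod -aG docker` change only takes effect in a new login session, so the current shell may not be in the `docker` group yet. A minimal sketch for checking, where `in_group` is a hypothetical helper of mine rather than anything from the instructions I followed:

```shell
# in_group WANTED GROUP...: succeed when WANTED appears among the GROUP names.
in_group() {
  wanted=$1
  shift
  for g in "$@"; do
    [ "$g" = "$wanted" ] && return 0
  done
  return 1
}
```

`id -nG` prints the current session's group names, so `in_group docker $(id -nG) || echo "log out and back in first"` reports whether the new membership has been picked up.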
With nothing complex left, I followed the instructions;
$ curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
$ curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
$ sudo apt-get update
$ sudo apt-get install -y nvidia-docker2
$ sudo pkill -SIGHUP dockerd
Verification
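Before pulling any images, a cheap first check is whether the Docker daemon actually knows about the nvidia runtime after the reload. This sketch assumes that `docker info` prints a `Runtimes:` line listing the registered runtimes; `has_nvidia_runtime` is a made-up helper name, not something from the nvidia-docker instructions.

```shell
# has_nvidia_runtime: read `docker info` text on stdin and succeed when the
# "Runtimes:" line lists nvidia as a whole word.
# Assumption: `docker info` output contains a line starting with "Runtimes:".
has_nvidia_runtime() {
  grep -E '^[[:space:]]*Runtimes:' | grep -qw 'nvidia'
}
```

Something like `docker info 2>/dev/null | has_nvidia_runtime && echo "nvidia runtime registered"` would then confirm the registration without needing a container.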
I ran the suggested command; it looks like it runs nvidia-smi inside a throwaway container based on the nvidia/cuda:9.0-base image.
$ docker run --runtime=nvidia --rm nvidia/cuda:9.0-base nvidia-smi
It dumps the wall of text below.
psxpl3@glaze:~$ docker run --runtime=nvidia --rm nvidia/cuda:9.0-base nvidia-smi
Unable to find image 'nvidia/cuda:9.0-base' locally
9.0-base: Pulling from nvidia/cuda
7b722c1070cd: Pull complete
5fbf74db61f1: Pull complete
ed41cb72e5c9: Pull complete
7ea47a67709e: Pull complete
35400734fa04: Pull complete
195acf8a5739: Pull complete
127028f911f6: Pull complete
Digest: sha256:157d05b8a9f3a26dce71c9e824d3fab769d77326f471d0143a236c37d278450d
Status: Downloaded newer image for nvidia/cuda:9.0-base
Thu Feb 14 16:34:44 2019
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 410.79       Driver Version: 410.79       CUDA Version: 10.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro K620         Off  | 00000000:03:00.0 Off |                  N/A |
| 35%   44C    P0     2W /  30W |      0MiB /  2001MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
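If you only care about the version numbers in that wall of text, the header line can be picked apart with sed. This is a rough sketch that relies on the `Label: value` layout seen in the output above; `smi_field` is a hypothetical helper name of mine.

```shell
# smi_field LABEL: read nvidia-smi output on stdin and print the dotted number
# that follows "LABEL: " (for example "Driver Version" or "CUDA Version").
smi_field() {
  sed -n "s/.*$1: \([0-9.][0-9.]*\).*/\1/p"
}
```

For example, `docker run --runtime=nvidia --rm nvidia/cuda:9.0-base nvidia-smi | smi_field 'Driver Version'` would print just the driver version seen inside the container.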
… I still think we should be using some sort of OpenCL/GLES3/Vulkan rather than CUDA … #fiteme