When I set up a new deep learning environment in a Docker image, especially one I plan to share and want to keep as light as possible, I follow these steps:

Pulling a clean Ubuntu image with conda

$ docker pull codeslake/ubuntu18.04-conda:base

Installing PyTorch and the CUDA Toolkit

# Run docker container
$ nvidia-docker run --privileged -it --rm --name temp \
    -p 9005:9005 \
    -v /var/run/docker.sock:/var/run/docker.sock \
    -v /home/junyonglee:/root \
    -v /Kiwi:/Kiwi -v /Jarvis:/Jarvis -v /Mango:/Mango \
    -v /data1:/data1 -v /data2:/data2 \
    -e TERM="$TERM" \
    -e LANG="en_US.UTF-8" -e LANGUAGE="en_US.UTF-8" -e LC_ALL="en_US.UTF-8" \
    -e NVIDIA_DRIVER_CAPABILITIES=compute,utility -e NVIDIA_VISIBLE_DEVICES=all \
    -h $HOST -w /root --shm-size 100G \
    codeslake/ubuntu18.04-conda:base /bin/zsh

# Create conda environment
$ conda create --name pt1.10.2 python=3.8
$ conda activate pt1.10.2

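# Install PyTorch (pinned to 1.10.2 to match the environment name; unpinned, conda picks whichever compatible build is current)
$ conda install pytorch==1.10.2 torchvision torchaudio cudatoolkit=11.3 -c pytorch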

# Install the CUDA Toolkit (adds nvcc, which conda's cudatoolkit package does not include)
$ wget https://developer.download.nvidia.com/compute/cuda/11.3.0/local_installers/cuda_11.3.0_465.19.01_linux.run
$ sh ./cuda_11.3.0_465.19.01_linux.run --no-opengl-libs --toolkit --silent --override
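Before committing, it is worth checking that nvcc and PyTorch agree on the CUDA version. The following is a minimal sketch, not part of the original steps. It assumes the runfile installer placed the toolkit under /usr/local/cuda (its default location) and did not add it to PATH, and that the conda environment is still active. The helper name parse_nvcc_release is made up for illustration.

```shell
#!/bin/sh
# Sketch: compare the CUDA release reported by nvcc with the one PyTorch targets.

# Hypothetical helper: read `nvcc --version` output on stdin and print the
# "X.Y" that follows the word "release".
parse_nvcc_release() {
  sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Assumption: the toolkit lives under /usr/local/cuda and is not yet on PATH.
CUDA_HOME="${CUDA_HOME:-/usr/local/cuda}"
PATH="$CUDA_HOME/bin:$PATH"
export CUDA_HOME PATH

if command -v nvcc >/dev/null 2>&1; then
  nvcc_cuda=$(nvcc --version | parse_nvcc_release)
  echo "nvcc: CUDA $nvcc_cuda"
else
  nvcc_cuda=""
  echo "nvcc not found (looked in $CUDA_HOME/bin and PATH)"
fi

# torch.version.cuda is the CUDA version the installed PyTorch build targets.
if torch_cuda=$(python -c 'import torch; print(torch.version.cuda)' 2>/dev/null); then
  echo "PyTorch: CUDA $torch_cuda"
  [ "$nvcc_cuda" = "$torch_cuda" ] || echo "warning: nvcc and PyTorch CUDA versions differ"
else
  echo "PyTorch is not importable in the active environment"
fi
```

If nvcc is only found after the PATH change, the two export lines probably belong in the shell startup file inside the container (for example ~/.zshrc, if that file is not on a bind mount) so that they survive the commit.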

Committing the Docker image and pushing to Docker Hub

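Detach from the container (Ctrl+P, then Ctrl+Q) so that it keeps running, then run the following on the host. Because the container was started with --rm, exiting the shell instead would delete it before it can be committed.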

$ docker commit -a "junyonglee" -m "pt1.10.2" temp codeslake/ubuntu18.04-conda:pt1.10.2_CUDA11.3_nvcc_clean
$ docker push codeslake/ubuntu18.04-conda:pt1.10.2_CUDA11.3_nvcc_clean
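Since the goal is a light image, it can help to check the committed size before or after pushing. This is a rough sketch, assuming the image is still present locally. The bytes_to_mib helper and the IMAGE variable are made up for illustration; `docker image inspect` and `docker history` are standard Docker CLI commands.

```shell
#!/bin/sh
# Sketch: report the size of the committed image and its largest recent layers.

# Hypothetical helper: convert a byte count to whole MiB (integer division).
bytes_to_mib() {
  echo $(( $1 / 1048576 ))
}

# Assumption: the tag used in the commit step above; override via the environment.
IMAGE="${IMAGE:-codeslake/ubuntu18.04-conda:pt1.10.2_CUDA11.3_nvcc_clean}"

if command -v docker >/dev/null 2>&1 &&
   size=$(docker image inspect --format '{{.Size}}' "$IMAGE" 2>/dev/null); then
  echo "$IMAGE: $(bytes_to_mib "$size") MiB"
  # The most recent layers are listed first; the top one is the commit itself.
  docker history --format '{{.Size}}\t{{.CreatedBy}}' "$IMAGE" | head -n 5
else
  echo "docker unavailable or $IMAGE not present locally"
fi
```

If the top layer looks larger than expected, running `conda clean --all` inside the container before committing usually removes cached package archives.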

