Docker for users #1214 #1218

Merged 8 commits on Aug 20, 2020
Changes from 3 commits
7 changes: 7 additions & 0 deletions docker/README.md
@@ -0,0 +1,7 @@
# Convenient docker images

In this folder we provide Dockerfiles to build docker images with PyTorch, Ignite and other convenient libraries:

- [basic](basic/Dockerfile): latest stable PyTorch, latest stable Ignite, OpenCV, Albumentations, Nvidia/Apex, etc
Collaborator (@vfdev-5, Aug 1, 2020):

The latest version of PyTorch is now 1.6.0, which includes native AMP (torch.cuda.amp), so there is no need to install Nvidia/Apex.

Let's provide several Docker images and rename basic to vision-basic (other naming suggestions are welcome):

  • [vision-basic](vision-basic/Dockerfile): latest stable PyTorch, latest stable Ignite, OpenCV, Albumentations, etc
  • [apex-vision-basic](apex-vision-basic/Dockerfile.apex): latest stable PyTorch, latest stable Ignite, OpenCV, Albumentations and Nvidia/Apex etc

Those images will have the names:

docker pull pytorchignite/vision-basic:latest
docker pull pytorchignite/apex-vision-basic:latest

If there are other suggestions on how to regroup things, let's discuss :)
What do you think?

Contributor Author:

Thanks for the review!
The term "basic" may not be necessary. Depending on whether other prefixes are planned, the image names could simply be:
pytorchignite/vision:latest and pytorchignite/apex-vision:latest

Collaborator:

I agree, we can omit "basic". For Apex, do you know how to build the Docker image in multiple stages to reduce its size? Something like this:

  • start from pytorch devel image
  • build apex
  • somehow get the artifact
  • start from pytorch runtime image
  • copy the artifact
  • install artifact

https://docs.docker.com/develop/develop-images/multistage-build/

I do not know how to perform the last two steps, which is why I ask...

Contributor Author:

I haven't built a multi-stage image yet, though I can give it a try. For the Apex artifact, would you recommend conda-pack to create a relocatable environment in the runtime image, or do you have another idea?

Collaborator (@vfdev-5, Aug 5, 2020):

Honestly, I don't know. Maybe we would have more control by building it ourselves...
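
The multi-stage steps listed earlier in this thread could be sketched roughly as follows. This is an untested sketch, not the Dockerfile that was merged: the image tags, the `apex-builder` stage name, and the `/tmp/wheels` path are assumptions chosen for illustration, and the artifact is taken to be a wheel built with `pip wheel`.

```dockerfile
# Stage 1: build an Apex wheel in the devel image, which ships nvcc and CUDA headers.
# The image tags below are assumptions; adjust them to the PyTorch release you target.
FROM pytorch/pytorch:1.6.0-cuda10.1-cudnn7-devel AS apex-builder

RUN apt-get update && apt-get install -y --no-install-recommends git && \
    rm -rf /var/lib/apt/lists/*

# No GPU is visible during `docker build`, so list the target architectures explicitly.
ENV TORCH_CUDA_ARCH_LIST="6.0;6.1;6.2;7.0;7.5"

# Build the wheel (the "artifact") into /tmp/wheels, a hypothetical path.
RUN git clone https://github.com/NVIDIA/apex /tmp/apex && \
    cd /tmp/apex && \
    pip wheel --no-cache-dir --no-deps \
        --global-option="--cpp_ext" --global-option="--cuda_ext" \
        -w /tmp/wheels .

# Stage 2: start from the smaller runtime image.
FROM pytorch/pytorch:1.6.0-cuda10.1-cudnn7-runtime

# Copy the artifact from the builder stage and install it.
COPY --from=apex-builder /tmp/wheels /tmp/wheels
RUN pip install --no-cache-dir /tmp/wheels/*.whl
```

`COPY --from=<stage>` is what carries the artifact across stages; nothing else from the devel image ends up in the final image.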

<!-- For example, this dockerfile I'll provide once Horovod is done -->
- [ignite-hvd](ignite-hvd/Dockerfile) : latest stable PyTorch, latest stable Ignite, Horovod
Collaborator:

Let's remove this in this PR

64 changes: 64 additions & 0 deletions docker/basic/Dockerfile
@@ -0,0 +1,64 @@
FROM pytorch/pytorch:1.5.1-cuda10.1-cudnn7-devel
Collaborator:

Please use the latest PyTorch release.


#install git
RUN apt-get update && apt-get install -y --no-install-recommends \
git && \
rm -rf /var/lib/apt/lists/*

#setup opencv
RUN echo '#!/bin/bash\n\
Collaborator:

There is no opencv in pytorch/pytorch:1.5.1-cuda10.1-cudnn7-devel, so we can just install the libs and set the timezone without checking python -c "import cv2".
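
A minimal sketch of that simplification, assuming the same package names and timezone as the script under review (they are copied from the PR, not verified against the base image): the libraries are installed unconditionally in one `RUN`, with no generated shell script and no `import cv2` check.

```dockerfile
# Install the system libraries OpenCV needs and configure the timezone
# non-interactively, so the build does not stop at a tzdata prompt.
RUN apt-get update && \
    ln -fs /usr/share/zoneinfo/America/New_York /etc/localtime && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
        tzdata libglib2.0 libsm6 libxext6 libxrender-dev && \
    dpkg-reconfigure --frontend noninteractive tzdata && \
    rm -rf /var/lib/apt/lists/*
```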

python -c "import cv2"\n\
res=$?\n\
if [ "$res" -eq "1" ]; then\n\
echo "Install libglib2.0 libsm6 libxext6 libxrender-dev for opencv"\n\
apt-get update\n\
ln -fs /usr/share/zoneinfo/America/New_York /etc/localtime\n\
apt-get install -y tzdata\n\
dpkg-reconfigure --frontend noninteractive tzdata\n\
apt-get -y install --no-install-recommends libglib2.0 libsm6 libxext6 libxrender-dev\n\
fi\n'\
>> setup_opencv.sh

RUN sh setup_opencv.sh

#setup apex
RUN echo '#!/bin/bash\n\
tmp_apex_path="/tmp/apex"\n\
python -c "import apex"\n\
res=$?\n\
if [ "$res" -eq "1" ]; then\n\
Collaborator:

Same here. Just install Nvidia/apex. Maybe TORCH_CUDA_ARCH_LIST can be an argument.
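
One way to expose the architecture list as a build argument is sketched below. It is untested and assumes the `ARG` is declared after `FROM` in the same stage; the default value is the list already used in this PR.

```dockerfile
# Build-time argument; the default is the architecture list from the PR.
# Must appear after FROM to be visible to the RUN instruction below.
ARG TORCH_CUDA_ARCH_LIST="6.0;6.1;6.2;7.0;7.5"

# Install Apex unconditionally, without the generated setup_apex.sh script.
RUN git clone https://github.com/NVIDIA/apex /tmp/apex && \
    cd /tmp/apex && \
    TORCH_CUDA_ARCH_LIST="${TORCH_CUDA_ARCH_LIST}" \
    pip install --upgrade --no-cache-dir \
        --global-option="--cpp_ext" --global-option="--cuda_ext" . && \
    rm -rf /tmp/apex
```

The list could then be overridden at build time with something like `docker build --build-arg TORCH_CUDA_ARCH_LIST="7.0;7.5" .`, where the value shown is only an example.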

echo "Setup NVIDIA Apex"\n\
rm -rf $tmp_apex_path\n\
git clone https://github.com/NVIDIA/apex $tmp_apex_path\n\
cd $tmp_apex_path\n\
export TORCH_CUDA_ARCH_LIST="6.0;6.1;6.2;7.0;7.5"\n\
pip install --upgrade --no-cache-dir --global-option="--cpp_ext" --global-option="--cuda_ext" .\n\
fi\n'\
>> setup_apex.sh

RUN sh setup_apex.sh

RUN echo 'albumentations\n\
Collaborator:

Maybe we can install the deps with a single RUN, without writing temporary files...
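
A sketch of that single `RUN`, using the package list from the requirements file in this diff. Note that `trains>=0.15.0` has to be quoted here, since an unquoted `>` would be treated by the shell as an output redirection.

```dockerfile
# Install Python dependencies directly, without an intermediate requirements.txt.
RUN pip install --upgrade --no-cache-dir \
    albumentations \
    image-dataset-viz \
    numpy \
    opencv-python \
    py_config_runner \
    pytorch-ignite \
    pillow \
    tensorboard \
    tqdm \
    "trains>=0.15.0"
```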

image-dataset-viz\n\
numpy\n\
opencv-python\n\
py_config_runner\n\
pytorch-ignite\n\
pillow\n\
tensorboard\n\
tqdm\n\
trains>=0.15.0\n'\
>> requirements.txt

RUN pip install --upgrade --no-cache-dir -r requirements.txt

#create a non-privileged docker user (for debian)
RUN groupadd -g 61000 ignite \
&& useradd -g 61000 -l -M \
-s /sbin/nologin -u 61000 ignite

RUN chown -R ignite:ignite /workspace

USER ignite

ENTRYPOINT ["/bin/sh"]
Collaborator:

Maybe bash instead of sh?