Ever run into a situation where you need to update just one dependency in a Docker image? I ran into this while working on a Nextflow pipeline made from the nf-core template. In this template, environment.yml is used to specify the dependencies, which is awesome for reproducibility and portability, as conda works with virtually any system. But if you’re developing, each image build takes ~1 hour to resolve all the dependencies, which really slows down the developer’s iteration time, and thus the efficiency of development. I personally really hate context switching between tasks and prefer to do deep work on one thing at a time. As Cal Newport says, our brains are not multithreaded like computers are.
So the workflow would look like:
Say you’re writing a custom Python package that a Nextflow workflow depends on. You’re constantly updating the underlying code, and you don’t want to wait for all the dependencies to be resolved again after every change.
Enter hot-installing. I just made that up. In web development, there’s “hot reloading”, which refreshes the HTML page every time you touch the JavaScript/CSS/HTML that goes into making it. This is somewhat similar: when I make a change in a Python package that the image depends on, instead of taking hours to rebuild the Docker image, I can quickly reinstall that one package from GitHub and be on my merry way.
So now my workflow looks like this:
Run the docker container with your installation command. In my case, this is pip install -U git+https://github.com/czbiohub/kh-tools.git@olgabot/more-encodings#egg=khtools
(kh-tools--more-encodings)
Wed 19 Feb - 08:08 ~/code/nf-predictorthologs origin ☊ olgabot/initial-outline ↑5 3●
docker run czbiohub/predictorthologs:dev pip install -U git+https://github.com/czbiohub/kh-tools.git@olgabot/more-encodings#egg=khtools
Next, use docker ps -l to show the most recently created container. The -l flag matters here: the container has already exited, so a plain docker ps would not list it.
Wed 19 Feb - 08:21 ~/code/nf-predictorthologs origin ☊ olgabot/initial-outline ↑7 1●
docker ps -l
CONTAINER ID   IMAGE                           COMMAND                  CREATED          STATUS                      PORTS   NAMES
a4e7c3f7eed3   czbiohub/predictorthologs:dev   "pip install -U git+…"   33 seconds ago   Exited (0) 28 seconds ago           brave_heisenberg
Copy that container ID, in this case a4e7c3f7eed3, and commit the changes in the container to the image using docker commit.
(kh-tools--more-encodings)
Wed 19 Feb - 08:21 ~/code/nf-predictorthologs origin ☊ olgabot/initial-outline ↑7 1●
docker commit a4e7c3f7eed3 czbiohub/predictorthologs:dev
sha256:af9153554838c8bea54260a3c8ab284e9b093b687c5f6b732b1a8c3c866da8b8
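To double-check that the reinstall actually landed in the committed image, a quick sanity check along these lines could help. This is only a sketch: the function name installed_version is made up for illustration, and it assumes the package is registered with pip under the name khtools (taken from the #egg=khtools part of the install URL).

```shell
# installed_version IMAGE PACKAGE
# Print the version of PACKAGE that pip reports inside IMAGE.
# (Hypothetical helper; the name is not part of Docker or pip.)
installed_version() {
    local image="$1" package="$2"
    # --rm deletes the throwaway container as soon as pip exits.
    # `pip show` prints a "Version: X.Y.Z" line; keep only the value.
    docker run --rm "$image" pip show "$package" | awk '/^Version:/ {print $2}'
}
```

For example, installed_version czbiohub/predictorthologs:dev khtools should print the version of the freshly installed package.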
It doesn’t seem that there’s a single one-line command that both runs a command and commits the result to a new image, other than writing a Dockerfile with your existing image in FROM czbiohub/predictorthologs:dev at the top. But then you end up replacing the Docker image tag with a new one, and the container name is hard-coded into the Nextflow pipeline… so I’d rather not.
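That said, the run, look up the ID, and commit steps can be wrapped in a small shell function so they feel like one command. The following is a rough sketch, not a built-in Docker feature: hot_install is a name invented here, and it assumes the image has pip on its PATH and that you want to commit back onto the same tag, as in the steps above.

```shell
# hot_install IMAGE PIP_SPEC
# Reinstall one pip package inside IMAGE and commit the result to the same tag.
# (Hypothetical helper; the name is an assumption, not a Docker command.)
hot_install() {
    local image="$1" spec="$2" cid status
    # Run detached so `docker run` prints the new container's ID,
    # which saves the `docker ps -l` lookup.
    cid=$(docker run -d "$image" pip install -U "$spec") || return 1
    # Block until pip finishes; `docker wait` prints the container's exit code.
    status=$(docker wait "$cid")
    if [ "$status" != "0" ]; then
        echo "pip install failed in container $cid (exit code $status)" >&2
        return 1
    fi
    # Snapshot the container's filesystem onto the existing tag,
    # then remove the stopped container.
    docker commit "$cid" "$image" && docker rm "$cid" >/dev/null
}
```

In my case the call would be hot_install czbiohub/predictorthologs:dev "git+https://github.com/czbiohub/kh-tools.git@olgabot/more-encodings#egg=khtools", which keeps the image name that the Nextflow pipeline expects.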
Hope that helps you!