Ever run into a situation where you need to update just one dependency in a docker image? I ran into this while working on this Nextflow pipeline made from the nf-core template. In this template, environment.yml specifies the dependencies, which is awesome for reproducibility and portability, as conda works with virtually any system. But if you’re developing, building the image takes ~1 hour while conda resolves all the dependencies, which really slows down the developer’s iteration time, and thus the efficiency of development. I personally really hate context switching between tasks and prefer to do deep work on one thing at a time. As Cal Newport says, our brains are not multithreaded like computers are.
The scenario looks like this: you’re writing a custom Python package that a Nextflow workflow depends on, you’re constantly updating the underlying code, and you don’t want to wait for all the dependencies to be resolved again after every change.
So now my workflow looks like this:
Run the docker container with your installation command. In my case, this is
pip install -U git+https://github.com/czbiohub/kh-tools.git@olgabot/more-encodings#egg=khtools
$ docker run czbiohub/predictorthologs:dev pip install -U git+https://github.com/czbiohub/kh-tools.git@olgabot/more-encodings#egg=khtools
Run docker ps -l to show the most recently created container. The container has already exited by this point (note the Exited (0) status), but docker ps -l lists it anyway.
$ docker ps -l
CONTAINER ID   IMAGE                           COMMAND                  CREATED          STATUS                      PORTS   NAMES
a4e7c3f7eed3   czbiohub/predictorthologs:dev   "pip install -U git+…"   33 seconds ago   Exited (0) 28 seconds ago           brave_heisenberg
Copy that container ID, in this case a4e7c3f7eed3, and commit the container’s changes back to the image using
$ docker commit a4e7c3f7eed3 czbiohub/predictorthologs:dev
sha256:af9153554838c8bea54260a3c8ab284e9b093b687c5f6b732b1a8c3c866da8b8
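If you’d rather not copy the ID by hand, docker ps -l -q prints only the ID of the most recently created container, so it can be fed straight into docker commit. Here’s a minimal sketch of that idea. The function name commit_latest is made up for illustration, and it assumes no other container was created on your machine between the run and the commit, otherwise it would commit the wrong one.

```shell
# Sketch: commit the most recently created container to an image tag.
# commit_latest is a hypothetical helper name, not a Docker command.
commit_latest() {
    local image cid
    image="$1"

    # -l: only the latest created container, -q: print just its ID
    cid="$(docker ps -l -q)"
    if [ -z "$cid" ]; then
        echo "no container found to commit" >&2
        return 1
    fi

    # Save the container's filesystem changes under the given image tag
    docker commit "$cid" "$image"
}
```

In my case, the call would be commit_latest czbiohub/predictorthologs:dev, right after the docker run step.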
There doesn’t seem to be a single one-line command that both runs a command and commits the result to a new image, other than writing a Dockerfile with FROM czbiohub/predictorthologs:dev at the top, so that it builds on your existing image. But then you run into overwriting the docker image tag with a new one, and the image name is hard-coded into the Nextflow pipeline… so I’d rather not.
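That said, the run and commit steps can be wrapped in a small shell function so they behave like one command. This is only a sketch under a few assumptions: update_image is a made-up name, the temporary container name is arbitrary, and it deliberately commits back onto the same tag, so the image name in the pipeline doesn’t need to change.

```shell
# Sketch: run a command in a fresh container and, if it succeeds,
# commit the result back onto the same image tag.
# update_image is a hypothetical helper name, not a Docker command.
update_image() {
    local image name status
    image="$1"
    shift

    # Temporary container name; the shell's PID keeps it reasonably unique
    name="update-image-$$"

    # Everything after the image name is the command to run inside it
    docker run --name "$name" "$image" "$@"
    status=$?

    if [ "$status" -eq 0 ]; then
        # Overwrite the existing tag with the updated filesystem
        docker commit "$name" "$image" || status=$?
    else
        echo "command exited with status $status; image left unchanged" >&2
    fi

    # Remove the stopped container either way
    docker rm "$name" >/dev/null
    return "$status"
}
```

With this defined, my two steps collapse into update_image czbiohub/predictorthologs:dev pip install -U git+https://github.com/czbiohub/kh-tools.git@olgabot/more-encodings#egg=khtools. If the install fails, nothing is committed, so the existing image stays as it was.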
Hope that helps you!