
Supplemental development process for Docker #1319

Merged: 9 commits, Aug 24, 2022
2 changes: 2 additions & 0 deletions .gitignore
@@ -27,3 +27,5 @@ sphinx_rtd_theme/static/js/html5shiv.min.js
sphinx_rtd_theme/static/js/html5shiv-printshiv.min.js
.direnv/
.envrc
# Used for dockerized builds
.container_id
58 changes: 58 additions & 0 deletions Dockerfile
@@ -0,0 +1,58 @@
# This implicitly includes Python 3.10
FROM node:14-alpine
Comment on lines +1 to +2
Member:

I'd prefer to use Ubuntu here if possible, mainly to keep consistency with the rest of our repositories. Also, I've run into various odd compilation issues when working with Alpine, and the space savings are not really significant.

Collaborator:

I think we're on the same page here and Ubuntu images would be the best way to go.


# Do not use --update since that will also fetch the
# latest node-current package
# 'make' is needed for building documentation
RUN apk add npm make py3-pip py3-wheel

# Add an extra verification that we have the right node
# because the above caused issues
RUN node -v && node -v | grep -q v14 &&\
python3 --version && python3 --version | grep -q "3.10"

RUN pip install pip --upgrade

RUN mkdir -p /project/src/ &&\
mkdir -p /project/docs/build/ &&\
mkdir -p /project-minimal-copy/sphinx_rtd_theme &&\
touch /project-minimal-copy/sphinx_rtd_theme/__init__.py

# This is the main working directory where node_modules
# gets built. During runtime, it's mixed with directories
# from an external environment through a bind mount
WORKDIR /project

# Copy files necessary to run "npm install" and save
# installed packages in the docker image (makes the runtime
# so much faster)
COPY package.json /project/
COPY bin/preinstall.js /project/bin/preinstall.js

RUN cd /project

# It matters that the node environment is installed into the same
# folder, i.e. /project where we will run the environment from
RUN npm install --package-lock-only &&\
npm audit fix &&\
npm install

# This is strictly speaking not necessary, just makes
# running the container faster...
# Install dependencies, then uninstall project itself
COPY setup.py README.rst /project-minimal-copy/
RUN cd /project-minimal-copy &&\
pip install ".[dev]" &&\
/usr/bin/yes | pip uninstall sphinx_rtd_theme
humitos marked this conversation as resolved.


# Copy in files that we need to run the project. These files
# will not be mounted into the runtime from external environment
# so we copy them during build.
COPY webpack.common.js webpack.dev.js webpack.prod.js /project/
Comment on lines +49 to +52
Member:

Is there a reason why we don't mount them via docker compose instead of copying them here?

Contributor Author:

You can mount stuff at runtime. This is build time, so we need to copy files into the image. The goal is to have node packages built together with the image. However, if I copy the ENTIRE repo, then Docker's cache is invalidated every time I build the image, so I'd have to rebuild it over and over again. That's why I copy the minimal set of files.
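The layer-caching point above can be sketched as a hypothetical Dockerfile fragment (an illustration of the general technique, not the exact file in this PR):

```dockerfile
# Hypothetical sketch: order COPY instructions so the expensive
# 'npm install' layer is only invalidated when the dependency manifest
# changes, not on every edit to the source tree.
FROM node:14-alpine
WORKDIR /project

# Copied first: changes rarely, so the install layer below stays cached
COPY package.json /project/
RUN npm install

# Copied last: changes often, but only invalidates layers from here down
COPY . /project/
```

With this ordering, editing source files reuses the cached `npm install` layer; copying the entire repository before the install would invalidate that layer on every build.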

Collaborator:

I'm still not sure I follow the need for avoiding mounts. Could you clarify what issue you are trying to solve? There is probably prior art to reuse in our application Docker configuration, whether that's permission issues or avoiding host filesystem artifacts.

I would look at what we do in the application docker processes, as we don't seem to have the issues that this docker config is trying to solve. I would put strong preference on reusing patterns that are working for us already.

Contributor Author:

Could you clarify what issue you are trying to solve?

The issue is to create a reusable, containerized environment for development. Since the development environment is currently also responsible for building the production assets, that is solved as well, though I understand we want to remove these artifacts from Git as soon as possible.

I'd like to propose that the development container specification can be reused for CI. I can open an issue for that.

I would put strong preference on reusing patterns that are working for us already.

Or perhaps developing and improving those patterns? :)

The reason I put docker-compose with a bind mount was to echo what's in readthedocs.org, although I'd always recommend keeping the mounted set of files to an absolute minimum, especially so that files from the container aren't dumped on the host.

Collaborator:

The issue is to create a reusable, containerized environment for development.

This seems like the overall goal of your change, but I'm more curious what issue you are hitting that requires this.

The main concern here is that we're making a complex solution for a problem we don't really have currently. The initial goal is to give contributors easy environment creation, which this setup does. But absolute isolation isn't currently a strong roadmap concern.

A longer term discussion point should be describing why isolation of build files is necessary for us. There are general arguments for this, but I'm not sure they are enough to bump up the priority of this work to an immediate team priority.

I'd like to propose that the development container specification can be reused for CI

What issue are you hitting that requires this change? The reasons to stick with the current pattern are that it's already providing isolation and deterministic builds, and there is more prior art and information on using CircleCI without custom Docker images.

So switching this feels a bit like sideways progress, but might also lead to us spending more time scratching our heads.

Or perhaps developing and improving those patterns?

It depends on the problems you are trying to solve, what design decision we come to, and where we place the work on our roadmap. The place to start changes like this is with discussion first, and I would suggest opening an issue on meta. From there, we'd find a place for the work on our roadmap if we all feel it's a priority.

Whatever we decide, sticking to a singular pattern is the main concern. One-off processes will add confusion and friction to maintenance.

Member (@humitos, Aug 29, 2022):

@benjaoming

You can mount stuff at runtime. This is build time, so we need to copy into the image

I don't think there is a need to copy files into the container that are not used at build time. If these files are used at run time, they should be mounted at run time. Otherwise, modifying any of these files would require rebuilding the Docker image, which isn't necessary if they are mounted.

Actually, that's why I commented on this particular line and not the previous one (https://github.com/readthedocs/sphinx_rtd_theme/pull/1319/files#diff-dd2c0eb6ea5cfc6c4bd4eac30934e2d5746747af48fef6da689e85b752f39557R26-R28) where, to me, it's a justified use of COPY and makes total sense.

Contributor Author:

The files are used at build time; they are required for package.json and npm install to work. We could perhaps mock them.

Member:

Hrm... in that case, I don't follow it. These files are copied after running npm install in the Dockerfile: https://github.com/readthedocs/sphinx_rtd_theme/pull/1319/files#diff-dd2c0eb6ea5cfc6c4bd4eac30934e2d5746747af48fef6da689e85b752f39557R36-R38

Also, the comment right above copying these files says they are required to run the project.

Following these two things, I'd think they are not required at build time.

Contributor Author:

I'm giving some pretty bad off-hand answers right now. I'll have to go over this again and give a proper answer. As I said, the idea is to avoid having anything mounted into the runtime environment to avoid making a mess in the host system's files.

I think this is easier to go through over a meeting than async.


# Copy in the entrypoint and we're done
COPY docker-entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh

ENTRYPOINT ["/entrypoint.sh"]
18 changes: 18 additions & 0 deletions Makefile
@@ -0,0 +1,18 @@
SHELL := /bin/bash
# Note: a bare "$0" would be expanded by Make itself (to an empty string);
# use the Makefile's own path instead
CWD := $(shell cd -P -- '$(shell dirname -- "$(firstword $(MAKEFILE_LIST))")' && pwd -P)

docker-images:
docker-compose build

docker-npm-build:
rm -f .container_id
docker-compose run -d sphinx_rtd_theme build > .container_id
Comment on lines +8 to +9
Member:

nit: there is no need to do rm -f since you are redirecting the output with > in the next command. If the file exists, it will be overwritten with the new output.
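The nit above is easy to verify: `>` truncates the target file before writing, so a preceding `rm -f` is redundant (a minimal sketch using a throwaway file name):

```shell
# '>' truncates the existing file, so the old contents never survive
printf 'old-container-id\n' > container_id.tmp
printf 'new-container-id\n' > container_id.tmp

# The file now holds only the second value
cat container_id.tmp   # prints "new-container-id"
rm container_id.tmp
```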

Contributor Author:

haha yes, will remove it... 'twas because docker run could write to a file directly. Now I need to detach in order to get the ID, because docker-compose run doesn't have the same option that docker run has.

docker container wait "$(shell cat .container_id)"
docker cp "$(shell cat .container_id):/project/sphinx_rtd_theme" .
docker cp "$(shell cat .container_id):/project/package-lock.json" .
Collaborator:

And discussed elsewhere already, but just noting here, this homebrew configuration seems to overlap with what docker-compose provides. I'd put some preference on using standard tooling, as it's more approachable.

@humitos would have the most input/opinions here

Contributor Author:

I don't know anything about homebrew, it's a MacOS thing right?

Contributor Author:

The idea of docker cp is to avoid a write operation on a bind mount, because file system permissions are usually troublesome, so it can be better to be explicit about what you expect to get out of the Docker runtime.

I'm already fixing this on your request @agjohnson

Contributor Author:

This is now removed; the docker-compose command mounts the current Git repo and builds inside it.

Collaborator:

I don't know anything about homebrew, it's a MacOS thing right?

I meant this in the sense that this is a homebrewed process -- it is a manual, one-off creation, as opposed to a standardized, repeatable process using docker-compose.

Member:

I'm not sure I understand what this step is for. Why are we copying these files from inside the container to the host?

Contributor Author:

I had to backtrack again from the approach where a host directory was mounted and written to by the container.

The reason was that I had issues with a mix of files owned by the user and files owned by root. I also needed to run npm cache clear and I never understood why.

But the fundamental reason was that I was mounting host system stuff into the container and writing irrelevant files there.

The best approach IMO is to keep everything containerized and copy out the artifacts directly from their known location. If you can separate out a bind mount and ensure that ONLY intended artifacts are written there, then that's also a fine approach.

Member:

Gotcha! Yeah, creating files inside the container that are then required outside the container usually generates permission problems. The only way to do that correctly is to map the UID/GID between the host and the container, and that is where things get more complicated. We've been there and I didn't enjoy it 😄

I like the pattern you used here with docker cp for this requirement, where files generated inside the container are required to be distributed/used outside the container.

We currently have this problem when creating migrations on Read the Docs, for example, where the resulting .py file is owned by root but sits in a directory owned by your user on the host. Then you have to run sudo chown to fix the permissions on that file.
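The UID/GID mapping mentioned above could look roughly like this in a Compose file (a hypothetical sketch, not part of this PR; the exported variable names are assumptions):

```yaml
# Hypothetical sketch: run the service as the host user so files written
# to the bind mount are owned by you, not root.
# Assumes HOST_UID/HOST_GID are exported before invoking docker-compose:
#   export HOST_UID="$(id -u)" HOST_GID="$(id -g)"
services:
  sphinx_rtd_theme:
    build: .
    user: "${HOST_UID}:${HOST_GID}"
    volumes:
      - ./:/project
```

This avoids the `sudo chown` step, at the cost of the container process possibly lacking permissions that the image's default user had.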

@echo "Done building"

docker-npm-dev:
docker-compose run sphinx_rtd_theme dev

docker-build-all: docker-images docker-npm-build
26 changes: 26 additions & 0 deletions docker-compose.yaml
@@ -0,0 +1,26 @@
version: "3.2"
services:

sphinx_rtd_theme:
build: .
volumes:
- type: "bind"
source: "./"
target: "/project-readonly"
read_only: true
- type: "volume"
target: "/project-readonly/sphinx_rtd_theme.egg-info"
- type: "bind"
source: "./src"
target: "/project/src"
read_only: true
- type: "bind"
source: "./docs"
target: "/project/docs"
read_only: false #todo: fix this
- type: "volume"
target: "/project/docs/_build"

network_mode: host
ports:
- "1919:1919"
17 changes: 17 additions & 0 deletions docker-entrypoint.sh
@@ -0,0 +1,17 @@
#!/bin/sh

# Update latest Python dependencies in case they have changed
cd /project-readonly
pip install --upgrade -e ".[dev]"

# This helps a potential permission issue, but might be removed
# pending some more investigation of docker host file system
# permissions in the bind mount
# npm cache clean --force
# npm install

cd /project
cp -r /project-readonly/sphinx_rtd_theme .

echo "Going to invoke: npm run $@"
# Quote "$@" so arguments with spaces are passed through intact
npm run "$@"
33 changes: 33 additions & 0 deletions docs/contributing.rst
@@ -6,6 +6,9 @@ This project follows the Read the Docs :doc:`code of conduct
<rtd:code-of-conduct>`. If you are not familiar with our code of conduct policy,
take a minute to read the policy before starting with your first contribution.

.. tip::
There is a new dockerized build environment, see :ref:`dockerized-build`.

Modifying the theme
===================

@@ -62,6 +65,36 @@ can be used to test built assets:
.. _Wyrm: http://www.github.com/snide/wyrm/
.. _Sphinx: http://www.sphinx-doc.org/en/stable/


.. _dockerized-build:

Dockerized development
======================

If you have Docker available on your platform, you can get started building CSS and JS artifacts a bit faster, without worrying about the setup spilling over into your general environment.

When building with Docker, we create an image containing the build dependencies. Some of these are quite outdated, which makes them ideal to isolate in a container. The image is tagged as ``sphinx_rtd_theme:latest``.

Inside the running container, we mount the working copy of the repository, build the artifacts, and finally verify that the artifacts have been built and left in your current Git checkout.

Use the following steps:

.. code-block:: console

# Builds an updated version of the docker image
$ docker-compose build

# Runs the development webserver
$ docker-compose run sphinx_rtd_theme dev
Comment on lines +87 to +88
Member:

Wouldn't it be better to use docker-compose up here, which is the standard, and have it call whatever is required behind the scenes?

Contributor Author:

docker-compose run does what docker run does, but reads the docker-compose configuration: I start the service that I want and pass an argument, dev, to the entrypoint. docker-compose up is a different type of command.


# If you want to copy stuff out of the Docker environment, run this make
# target or read the actual Makefile to see what is going on.
# We suggest running this command every time that you want to quickly build
# new CSS/JS assets
$ make docker-build-all
Comment on lines +90 to +94
Member:

We are using invoke to handle this type of extra command in other repositories. I'm not opposed to using a Makefile, though, just saying 😄

Collaborator:

I'd also prefer to have a singular pattern to use, as much as I like make.

Contributor Author:

Containers provide a quick way to get started. make exists on most systems and doesn't sit on top of an additional stack that's customized by the project.

With invoke, we'd first have to bootstrap a Python environment with the correct version of invoke - or, as a new contributor, understand what invoke even is. I think there's a better chance that people can copy-paste make commands without worrying about setting up an environment for make.

If the feeling is "let's not have Make", then I would rather have a shell script for running the docker-build-all target, since that's the only command that chains a few things together and should be scripted.

Collaborator:

I'd still argue that we should stick with a single pattern and use Invoke here. Both Python and Invoke are a normal part of our development workflow and already in use in other repositories, so they are reasonable requirements for contributors. The goal is not to make this repository as universally accessible as possible; it's to make a repeatable environment that is easy for the core team to maintain.


Every time you change the Node or Python requirements, you will need to rebuild images with ``docker-compose run sphinx_rtd_theme build``. If you change SASS or JS, you will need to rebuild assets.
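The two rebuild cases above can be summarized with the commands introduced earlier in this guide:

.. code-block:: console

   # Node or Python requirements changed: rebuild the image
   $ docker-compose build

   # SASS or JS sources changed: rebuild the CSS/JS assets
   $ make docker-build-all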

Testing
=======
