Dan dec update (#3)
* Improve local var name

* Remove unused _content instance var on Resource

Previous refactoring obviated the _content instance var

* Make a method a function (static method) and pass in state

Prefer function over method when it makes sense.

Prefer properties for book_number and book_title because they are
derivative values that are only retrieved and don't need to be set.
The overarching theme of this change is to lean up __init__ and state
in general.
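
A minimal sketch of the derived-property idea, not the project's actual
Resource class. The lookup tables and the _resource_code attribute are
assumptions made up for illustration:

```python
# Hypothetical lookup tables, invented for this sketch.
BOOK_NAMES = {"gen": "Genesis", "exo": "Exodus"}
BOOK_NUMBERS = {"gen": 1, "exo": 2}


class Resource:
    def __init__(self, resource_code: str) -> None:
        # The only stored state is the book code.
        self._resource_code = resource_code

    @property
    def book_title(self) -> str:
        # Derived on access, so there is nothing to keep in sync.
        return BOOK_NAMES[self._resource_code]

    @property
    def book_number(self) -> int:
        return BOOK_NUMBERS[self._resource_code]
```

Because neither property has a setter, callers can read the values but
cannot assign to them.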

* Better organize property methods

Group related property methods together

* Remove unused _verses_html instance variable

* Rename _content instance var to _html_content

Better naming

* More accurate typing

Make mypy and pyre happy with more precise types. This actually caught
a potential bug in assembly_strategies.py too in which tn_verses and
tq_verses could be accessed before being defined due to conditional
logic scope blocks. In practice this never happened, but it was
overlooked without stricter typing via pyre and mypy --strict.
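
The kind of bug described above can be sketched as follows. This is an
illustration only: helps_for_verse and its parameters are hypothetical,
and only the names tn_verses and tq_verses come from the commit message.

```python
from typing import Sequence


def helps_for_verse(has_tn: bool, has_tq: bool, verse: str) -> Sequence[str]:
    """Collect help texts for a verse (hypothetical helper)."""
    # Bind both names up front so every path below sees a defined value.
    tn_verses: dict[str, str] = {}
    tq_verses: dict[str, str] = {}
    if has_tn:
        tn_verses = {verse: "translation note"}
    if has_tq:
        tq_verses = {verse: "translation question"}
    # Without the defaults above, this line could raise UnboundLocalError
    # whenever has_tn or has_tq is False.
    return [d[verse] for d in (tn_verses, tq_verses) if verse in d]
```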

* Sort imports

isort sorted

* More accurate typing

Continue work to make mypy --strict work

* Remove unnecessary classmethod

* Replace use of __getattr__ with explicit delegation

* Reverse experiment of having *HtmlInitializer classes

* Ignore some flake8 lint rules

* Prefer short expressions over function call encapsulating them

* Use generators to simplify

Use of generators in this case achieves the following goals:
- More functionalish code: fewer local variables.
- Better use of memory by not using so many list appends.
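
As a rough illustration of the two goals above (verses_html is a
hypothetical name, not a function from this repository):

```python
from typing import Iterable, Iterator


def verses_html(verses: Iterable[str]) -> Iterator[str]:
    """Lazily wrap each verse in a paragraph tag (hypothetical helper)."""
    for verse in verses:
        # No accumulator list and no append calls: each item is
        # produced only when the consumer asks for it.
        yield f"<p>{verse}</p>"


document = "".join(verses_html(["In the beginning", "And the earth"]))
```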

* Function renaming

Leading underscore has been used to indicate protected variables and
methods in an OO context. Refactor of document_generator toward a more
functional scheme obviates the need for this, thus remove leading
underscore.

* Simplify Resource state

Remove a couple instance variables whose state can be supplied in
another way, i.e., prefer small objects with little state.

Prefer derived properties over object state.

* Update Makefile targets

Add a new smoke test that is useful during development without having
to run longer test suites.

Update some target preconditions.

Fix a .PHONY name.

* Improve source comment

* Fix docker-compose gunicorn stanza

Make docker-compose up work. gunicorn stanza in docker-compose.yml was
missing pythonpath cli switch and value.

* Add missing types

Make mypy --strict happy.

* Refactor to functional (in the mathematical sense) paradigm

Use more immutable data types like Sequence where warranted. Prefer
immutability over mutability where possible.

Use generators in a few more places where it cleans up the code.

Make the top level exception handling code more robust by moving
initialization of success message into else block of try/except to
prevent the possibility (detected during this refactor) of a success
message getting set when in fact failure had occurred.
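
A minimal sketch of the try/except/else shape described above. The
function name and message strings are assumptions, not the project's
actual code:

```python
from typing import Callable


def generate_document(build: Callable[[], None]) -> str:
    """Run build() and report the outcome (hypothetical wrapper)."""
    try:
        build()
    except Exception:
        message = "failure"
    else:
        # Only reached when build() raised nothing, so a success
        # message can never be set after a failure.
        message = "success"
    return message
```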

Add a test wherein at least one resource request is unloadable and another
unfindable while other resource requests in the document request are
fine. This exercises the case where we still want to render the PDF
document with what resources were available while reporting (on the
cover of the PDF) those that were not.

Handle more USFM resources. Specifically, handle USFM resources whose
resource asset files are organized with one directory per chapter and
one file per chapter.

Compile most modules with mypyc for at least a 2x speedup. Types in
mypyc-compiled modules are checked at runtime, so we also don't need
typeguard. Also not using icontract because mypyc doesn't like its use
of lambdas, since lambdas can't be typed by mypy.

Change source layout organization: src -> backend.

* Small change to directory layout

Rename api -> backend. Dockerfile -> backend/Dockerfile.
Reason for this change is to prepare for also having frontend in the
same repo each with their own Dockerfiles but composed with
docker-compose into one container.

* Update a few source code comments

* Add frontend submodule and associated config

Docker container for backend and frontend composed into one container
using docker-compose.

* Return lang codes and names sorted alphabetically by name

Also add the language code to the displayed value at least for now
because then it makes it easier to discover lang codes for the purpose
of creating tests that are first discovered manually.

* Use exception handler approach for top level exception scope

Follow fastapi's example for this: https://fastapi.tiangolo.com/tutorial/handling-errors/

* mypyc Makefile target doesn't rely on mypy target

* Pull latest commits from frontend submodule

* Pull latest submodule frontend commits

* Pull latest commits from frontend submodule

* Update docker-compose to make back and front ends communicate

You can now do:

make build
make frontend-server

to get a locally accessible frontend http://localhost:8000 (which
talks to the backend).

* Make optional RUN_TESTS functionality work in docker build

* Better docs and comment out docker volume for PDFs

Since we have a frontend that serves up results, we no longer need to
map PDFs from the container to the host system to be able to view the
generated PDFs.

* Update docs

* Pull latest commits on frontend submodule

* Handle case where English tn and tn-wa are requested

translations.json offers English translation notes in two locations: git repo
and downloadable zip. If both were requested through the UI then
filter out the tn format and use the repo only. We don't want them
both in the same document as they are actually the same material.
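
The filtering rule could look roughly like this. ResourceRequest and
without_duplicate_english_tn are hypothetical names, and the sketch
assumes that tn-wa is the git repo variant that should be kept:

```python
from typing import NamedTuple, Sequence


class ResourceRequest(NamedTuple):
    """Hypothetical request shape used only for this sketch."""

    lang_code: str
    resource_type: str


def without_duplicate_english_tn(
    requests: Sequence[ResourceRequest],
) -> Sequence[ResourceRequest]:
    english_types = {r.resource_type for r in requests if r.lang_code == "en"}
    if {"tn", "tn-wa"} <= english_types:
        # Both English variants were requested: drop tn, keep tn-wa.
        return tuple(
            r
            for r in requests
            if not (r.lang_code == "en" and r.resource_type == "tn")
        )
    return tuple(requests)
```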

Update a few source comments and docstrings.

Make isort happy by removing imports that are no longer used.

* Add clean-mypyc-artifacts Makefile target

mypyc creates C binaries. In a quick edit-test-run development
workflow, having to recompile with mypyc between test runs slows
down the flow of work. This Makefile target can be used to quickly
remove the generated binaries so that we are temporarily back to
running interpreted Python. Then, once our changes are made and our
tests are passing, we can go back to compiling the source with mypyc
via make mypyc.

* Clean up Jinja2 templates

Remove extraneous html elements, comment out unused css classes,
remove unused template.html, add css classes that will allow two
column layout.

* Update log message

Make log message less misleading

* Upgraded Python packages

* Remove some vars from .env file

I don't want a couple key directories /working/temp and
/working/output to be easily changed so I have removed them from being
initialized in .env. They are now just initialized in config.py.

Also removed some unused and commented out env vars from .env.

* Fix a Makefile target dependency relationship

* Update docstrings and a few source comments

Docstrings needed updating to match current functionality.

* Better organize and simplify some code

Move some methods out of Settings class in config.py into
resource_lookup and document_generator modules where they belong.

* Two column layout for scripture verses and their helps

Puts USFM resources on left 50% of layout and their associated helps
on the right 50% of the layout. The rest of the layout is not
subdivided.

* Update design.org design doc

Remove some obsolete sections, update a few others. More to do still.

* Add pygount package

Used for generating project stats in design doc.

* Update Python packages

pip-compile --upgrade automatic upgrades to packages.

* Update Docker image to Python 3.10.1

* Update some docstrings and source comments

* Avoid use of magic string

* Go with Python 3.10 union | use

* Workaround for mypyc regression upstream

I reported this here: mypyc/mypyc#912

* Add generate-class-diagrams Makefile target

Uses pyreverse which comes with pylint package to generate UML class
diagrams from source code.

* Workaround for mypyc regression upstream part 2

Fix to make pip-compile generate the requirements-prod.txt with
mypy (and thus mypyc) pinned at an older (working) version.

* Copy generated PDFs to own directory...

...for mapping into host file system.

* Update a few source comments

* Tiny bit of automatic code formatting via black

* Only start translation word definition section if words present

Fix issue where a language that did not provide translation words
would at least print a header for its section despite there not being
any content.

* Rename a few attributes

Done for consistency and simplicity

* Show the max number of chapters and verses available

Prior to this commit you could get an undesirable state of affairs in
book then language assembly wherein if the first usfm language/book
resource was missing a verse, then other usfm language/book resources
in that same document request would skip that verse because the
chapter and verse loops were arbitrarily based off the first usfm
resource. Now the usfm resource with the most chapters for a
particular book and subsequently the usfm resource with the most
verses within each respective chapter are used as the outer loop
pumps for chapters and verses respectively. This way the max available
chapters and verses are displayed in the resulting document.
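
One way to picture the new loop pumps, as a sketch under assumptions:
the Book type (chapter number -> verse number -> text) and the function
name are hypothetical, not the project's data model.

```python
from typing import Mapping, Sequence

# Hypothetical shape: chapter number -> verse number -> verse text.
Book = Mapping[int, Mapping[int, str]]


def chapter_and_verse_pump(books: Sequence[Book]) -> list[tuple[int, int]]:
    """Return every (chapter, verse) pair found in the fullest resources."""
    pump: list[tuple[int, int]] = []
    # Drive the chapter loop from the resource with the most chapters.
    fullest_book = max(books, key=len)
    for chapter in sorted(fullest_book):
        # Among resources that have this chapter, drive the verse loop
        # from the one with the most verses.
        with_chapter = [book[chapter] for book in books if chapter in book]
        fullest_chapter = max(with_chapter, key=len)
        pump.extend((chapter, verse) for verse in sorted(fullest_chapter))
    return pump
```

A resource that is missing a verse then no longer hides that verse from
the other resources in the same document request.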

* Use final type decorator to indicate design

Mypy then enforces that classes so decorated are not subclass-able (by
design).
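
For illustration, typing.final is applied like this (the Resource class
here is a stand-in, not the project's class):

```python
from typing import final


@final
class Resource:
    """Stand-in class marked as not meant for subclassing."""

    def __init__(self, lang_code: str) -> None:
        self.lang_code = lang_code


# A static type checker such as mypy reports an error on a definition like:
#     class SpecialResource(Resource): ...
# The decorator is not enforced by the interpreter at runtime.
```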

* Add a log message for a possible exception point

Easier diagnosis using logs.

* Fix footnotes regression

Footnotes were not being shown with their associated verse and thus
the footnotes section at the end of the chapter showed footnotes but
they were not clickable. This is fixed now.

* Move custom exceptions to own module

Better organization and now we can use the module in other modules
where its exceptions are needed.

* Better naming

Follow Pythonic convention and conventions of good style to name
functions after what they return rather than names which imply how
they do that. Also avoiding nebulous names like initialize...blah.

* Tighter control of acceptable resource types requested

Make sure we don't accept resource types that we do not support yet or
which are not valid.

* Stricter adherence to functional style

Don't use "globals", even if they are just a module reference, when
they can be passed as a defaulted function argument instead. Pass in
what we need as parameters to keep things more pure. I say "more"
because the code is not yet free of side effects; that is coming
though.

* Skip tests that test functionality not yet supported

Skip tests that deal with TA resource type as we do not support it
yet.

* Minor tweak to docs

* Update many source code comments

* Change magic number to constant

* Greater referential transparency

Use more dependency injection at the function parameter level via
defaulted arguments so as to increase the referential transparency of
the code and make it more easily testable.
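
A small sketch of dependency injection through a defaulted argument.
Both function names and the JSON shape are assumptions for
illustration only:

```python
import json
from typing import Callable


def fetch_translations_json(url: str) -> str:
    """Stand-in for the real network call (hypothetical)."""
    raise RuntimeError("network access is not available in this sketch")


def lang_codes(
    url: str,
    fetch: Callable[[str], str] = fetch_translations_json,
) -> list[str]:
    # The collaborator arrives as a parameter, so a test can pass a
    # fake without patching any module-level state.
    return sorted(entry["code"] for entry in json.loads(fetch(url)))
```

In a test, the default is simply overridden, for example
lang_codes("unused", fetch=lambda _: '[{"code": "en"}]').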

* Improve a log message slightly

* Remove unused imports

* Prune unused exception variable

* Make e2e-tests suite use all language codes when run outside Docker

So that more random language combinations will be discovered by
randomized test fixtures.

Also use Sequence instead of list type in order to communicate
immutable use of collection.

* Skip two tests that fail in Docker

Investigate further later.

* Tell jinja2 to autoescape HTML

To help prevent potential XSS attacks.

* Fix mypy type error and preserve immutability

Instead of extending a collection, unpack the two collections into one
new collection.
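
The unpacking approach looks roughly like this (the variable names and
values are made up for the example):

```python
from typing import Sequence

usfm_types: Sequence[str] = ("ulb", "udb")
help_types: Sequence[str] = ("tn", "tq")

# Sequence has no extend method, so mypy rejects in-place extension.
# Unpacking builds a new collection and leaves both inputs untouched.
all_types: Sequence[str] = (*usfm_types, *help_types)
```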

* Update submodule pointer

...to not point to old
submodule (https://github.com/linearcombination/InterleavedResourcesGeneratorUI).

* New submodule pointer

...pointing to https://github.com/WycliffeAssociates/DOC-UI

Co-authored-by: linearcombination <4829djaskdfj@gmail.com>
danparisd and linearcombination committed Jan 13, 2022
1 parent 4478e55 commit c14d9f8
Showing 77 changed files with 7,611 additions and 7,862 deletions.
28 changes: 11 additions & 17 deletions .env
@@ -12,10 +12,6 @@
# Used in Dockerfile to get wkhtmltox
WKHTMLTOX_LOC=https://github.com/wkhtmltopdf/packaging/releases/download/0.12.6-1/wkhtmltox_0.12.6-1.buster_amd64.deb

# Location where resource assets will be downloaded
RESOURCE_ASSETS_DIR=/working/temp
# Location where generated PDFs will be written to
DOCUMENT_OUTPUT_DIR=/working/output
# Location where the api finds translations.json
TRANSLATIONS_JSON_LOCATION=http://bibleineverylanguage.org/wp-content/themes/bb-theme-child/data/translations.json
LOGGING_CONFIG=src/document/logging_config.yaml
@@ -37,7 +33,8 @@ SUCCESS_MESSAGE="Success! Please retrieve your generated document using a GET RE
# Return the message to show to user on failure generating PDF.
FAILURE_MESSAGE="The document request could not be fulfilled either because the resources requested are not available either currently or at all or because the system does not yet support the resources requested."

# Sending emails if off by default due to automated testing
# Sending emails if off by default due to automated testing, turn it
# on for production use
SEND_EMAIL=false
FROM_EMAIL_ADDRESS=fake@example.com
SMTP_PASSWORD=realpasswordgoeshere
@@ -50,24 +47,21 @@ SMTP_PORT=587
# We are running in the container. This is used by the system to
# determine the location of the working and output directories.
IN_CONTAINER=true
# The port to pass to gunicorn via ./backend/gunicorn.conf.py
PORT=5005
# If true, run pytest test suite during docker build of backend container to verify correctness and
# to generate assets which preheat cache. Note that it takes about 30
# minutes for the test suite to conclude, hence it is set to false by
# default.
RUN_TESTS=false
# Control caching of resource assets to save on network traffic

ENABLE_ASSET_CACHING=true
# Caching window of time in which cloned or downloaded resource asset
# files on disk are considered fresh rather than reacqiring them. In hours.
ASSET_CACHING_PERIOD=72

# Just samples, you need to set these (remember: JSON formatted)
BACKEND_CORS_ORIGINS='["http://localhost", "http://localhost:8080"]'

# Currently unused
# PYTHONDONTWRITEBYTECODE=1 # Incompatible with optimization in production.
# PYTHONUNBUFFERED=1 # Not sure we want this.
# PYTHONOPTIMIZE=1 # In particular, this would, for one thing, turn
# off icontract checks for production:
# https://icontract.readthedocs.io/en/latest/usage.html#toggling-contracts.
# Not something I necessarily want to do unless things are reeeaaally
# slow.
BACKEND_CORS_ORIGINS='["http://localhost", "http://localhost:8000"]'

#local image tag for local dev with prod image
IMAGE_TAG=local

2 changes: 2 additions & 0 deletions .gitignore
@@ -1,3 +1,5 @@
/src/document.egg-info/
/**/__pycache__
/**/.mypy_cache/
/**/**/**/*-darwin.so
/**/**/**/.hypothesis/
3 changes: 3 additions & 0 deletions .gitmodules
@@ -0,0 +1,3 @@
[submodule "frontend"]
path = frontend
url = https://github.com/WycliffeAssociates/DOC-UI
54 changes: 39 additions & 15 deletions Dockerfile
@@ -1,17 +1,19 @@
FROM python:3.9.7-slim-buster
FROM python:3.10.1-slim-bullseye

RUN apt-get update && apt-get install -y \
wget \
curl \
fontconfig \
fonts-noto-cjk \
git \
unzip \
# Next packages are for wkhtmltopdf
fontconfig \
fonts-noto-cjk \
libxrender1 \
xfonts-75dpi \
xfonts-base \
libjpeg62-turbo
libjpeg62-turbo \
# For mypyc
gcc

# Get and install needed fonts.
RUN cd /tmp \
@@ -26,8 +28,9 @@ RUN fc-cache -f -v
# How to get wkhtmltopdf - don't use what Debian provides as it can have
# headless display issues that mess with wkhtmltopdf.

# Make a build arg available to this Dockerfile with default
# Make a build arg available to this Dockerfile with default. Default will be overriden by environment var if used.
ARG WKHTMLTOX_LOC=https://github.com/wkhtmltopdf/packaging/releases/download/0.12.6-1/wkhtmltox_0.12.6-1.buster_amd64.deb

RUN WKHTMLTOX_TEMP="$(mktemp)" && \
wget -O "$WKHTMLTOX_TEMP" ${WKHTMLTOX_LOC} && \
dpkg -i "$WKHTMLTOX_TEMP" && \
@@ -38,24 +41,45 @@ RUN WKHTMLTOX_TEMP="$(mktemp)" && \
RUN mkdir -p /working/temp
# Make the output directory where generated HTML and PDFs are placed.
RUN mkdir -p /working/output
# Make the directory where logs are written to.
# Make the output directory where generated PDFs are copied too.
RUN mkdir -p /pdf_output

COPY .env .
COPY icon-tn.png .
COPY gunicorn.conf.py .
COPY ./backend/gunicorn.conf.py .

# See https://pythonspeed.com/articles/activate-virtualenv-dockerfile/
# for why a Python virtual env is used inside Docker.
ENV VIRTUAL_ENV=/opt/venv
RUN python -m venv $VIRTUAL_ENV
ENV PATH="$VIRTUAL_ENV/bin:$PATH"

COPY requirements.txt .
COPY requirements-dev.txt .
RUN pip install -r requirements.txt
RUN pip install -r requirements-dev.txt
COPY ./backend/requirements.txt .
COPY ./backend/requirements-prod.txt .
RUN pip install --no-cache-dir -r requirements.txt
RUN pip install --no-cache-dir -r requirements-prod.txt

COPY ./backend/ /backend/
COPY ./tests/ /tests/
# COPY ./language_codes.json /tests/

# Inside the Python virtual env: check types, install any missing mypy stub
# types packages, and compile most modules into C using mypyc
RUN cd $VIRTUAL_ENV && . $VIRTUAL_ENV/bin/activate && mypyc --strict --install-types --non-interactive /backend/document/**/*.py

# Make sure Python can find the code to run
ENV PYTHONPATH=/backend:/tests

COPY ./src/ /src/
COPY ./tests /tests
# Run tests to verify correctness and (mainly) to generate assets for preheating cache
# To run the tests do: docker-compose build --build-arg run_tests=1
# Make RUN_TESTS in .env and referenced in docker-compose.yml
# available here.
ARG RUN_TESTS=false
RUN if [ "$RUN_TESTS" = "true" ] ; then IN_CONTAINER=true pytest /tests/ ; else echo You have chosen to skip the test suite ; fi

ENV PYTHONPATH=/src:/tests
# Make PORT in .env and referenced in docker-compose.yml
# available here.
ARG PORT=5005
# What gets run when 'docker-compose run backend' is executed.
CMD ["gunicorn", "--name", "document:entrypoints:app", "--worker-class", "uvicorn.workers.UvicornWorker", "--pythonpath", "/backend", "--conf", "/backend/gunicorn.conf.py", "document.entrypoints.app:app"]

CMD ["gunicorn", "--worker-class", "uvicorn.workers.UvicornWorker", "--config", "./gunicorn.conf.py", "document.entrypoints.app:app"]
