Build R documentation using Docker #5188

Merged: 6 commits on Dec 22, 2021

54 changes: 2 additions & 52 deletions .circleci/config.yml
@@ -2,56 +2,11 @@ version: 2.1

jobs:
  build_doc_r:
    docker:
      - image: ubuntu:20.04
    machine:
      image: ubuntu-2004:202111-01
    steps:
      - checkout
      - run:
          name: Install dev tools
          environment:
            DEBIAN_FRONTEND: noninteractive
          command: |
            apt-get update --yes
            apt-get install sudo git wget curl jq software-properties-common apt-transport-https --yes

      - run:
          name: Install Java
          command: |
            sudo apt-get install default-jdk --yes
            java -version

      - run:
          name: Install R
          command: |
            # How To Install R on Ubuntu 20.04:
            # https://www.digitalocean.com/community/tutorials/how-to-install-r-on-ubuntu-20-04

            sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E298A3A825C0D65DFD57CBB651716619E084DAB9
            sudo add-apt-repository 'deb https://cloud.r-project.org/bin/linux/ubuntu focal-cran40/'
            sudo apt update -y
            sudo apt install -y r-base libssl-dev libxml2-dev libcurl4-openssl-dev
            R --version

      - run:
          name: Install pandoc
          command: |
            # Install a recent version of pandoc
            TEMP_DEB="$(mktemp)"
            wget -O "$TEMP_DEB" 'https://github.com/jgm/pandoc/releases/download/2.7.2/pandoc-2.7.2-1-amd64.deb'
            sudo dpkg -i "$TEMP_DEB"
            rm -f "$TEMP_DEB"

      - run:
          name: Dump R dependencies
          working_directory: mlflow/R/mlflow
          command: |
            Rscript .dump-r-dependencies.R

      - restore_cache:
          keys:
            - r-cache-{{ checksum "mlflow/R/mlflow/R-version" }}

Comment on lines -10 to -54
Member Author

Removed these steps because the docker image contains all required tools.
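
One way to back up the claim in the comment above is a quick check run inside the container. The following is a minimal sketch and not part of this PR: `missing_tools` is a hypothetical helper, and the tool list is an assumption inferred from what the removed CI steps installed and what the new build scripts call.

```shell
# Hypothetical helper (not in this PR): print each named command that is
# not found on PATH, one per line. Prints nothing when all are present.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Assumed tool list, based on the removed CI steps and the new Dockerfile.
missing="$(missing_tools git wget Rscript pandoc)"
if [ -n "$missing" ]; then
  echo "missing tools:" $missing
else
  echo "all required tools found"
fi
```

If saved as a script and mounted into the container, this could be run with `docker run --rm` against the `mlflow-r-dev` image; the script name and mount path are left to the reader.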

      - run:
          name: Build documentation
          working_directory: docs
@@ -80,11 +35,6 @@ jobs:

            exit $failed

      - save_cache:
          key: r-cache-{{ checksum "mlflow/R/mlflow/R-version" }}-{{ checksum "mlflow/R/mlflow/depends.Rds" }}
          paths:
            - /usr/local/lib/R/site-library

      - store_artifacts:
          path: << pipeline.git.revision >>.patch

30 changes: 9 additions & 21 deletions docs/build-rdoc.sh
@@ -1,28 +1,16 @@
#!/usr/bin/env bash

set -ex

pushd ../mlflow/R/mlflow

# `gert` requires `libgit2`:
# https://github.com/r-lib/gert#installation
sudo add-apt-repository ppa:cran/libgit2
sudo apt-get update
sudo apt-get install --yes libssh2-1-dev libgit2-dev
image_name="mlflow-r-dev"
docker build -f Dockerfile.dev -t $image_name .
docker run \
  --rm \
  -v $(pwd):/mlflow/mlflow/R/mlflow \
  -v $(pwd)/../../../docs/source:/mlflow/docs/source \
  $image_name \
  Rscript -e 'source(".build-doc.R", echo = TRUE)'

Rscript -e 'install.packages("devtools", repos = "https://cloud.r-project.org")'
Rscript -e 'devtools::install_dev_deps(dependencies = TRUE)'
# Install Rd2md from source as a temporary fix for the rendering of code examples, until
# a release is published including the fixes in https://github.com/quantsch/Rd2md/issues/1
# Note that this commit is equivalent to commit 6b48255 of Rd2md master
# (https://github.com/quantsch/Rd2md/tree/6b4825579a2df8a22898316d93729384f92a756b)
# with a single extra commit to fix rendering of \link tags between methods in R documentation.
Rscript -e 'devtools::install_git("https://github.com/smurching/Rd2md", branch = "mlflow-patches")'
Rscript -e 'install.packages("rmarkdown", repos = "https://cloud.r-project.org")'
rm -rf man
Rscript -e "roxygen2::roxygenise()"
# remove mlflow-package doc temporarily because no rst doc should be generated for it.
rm man/mlflow-package.Rd
Rscript document.R
# roxygenize again to make sure the previously removed mlflow-package doc is available as R helpfile
Rscript -e "roxygen2::roxygenise()"
popd
119 changes: 64 additions & 55 deletions docs/source/R-api.rst
@@ -899,8 +899,8 @@ Arguments
| Argument | Description |
+===============================+======================================+
| ``flavor`` | An MLflow flavor object loaded by |
| | `mlflow_load_model <#mlflow-load-mod |
| | el>`__ |
| | `mlflo |
| | w_load_model <#mlflow-load-model>`__ |
| | , with class loaded from the flavor |
| | field in an MLmodel file. |
+-------------------------------+--------------------------------------+
@@ -1309,13 +1309,16 @@ to be used by package authors to extend the supported MLflow models.
Arguments
---------

========= ===================================================================
Argument Description
========= ===================================================================
``model`` The loaded MLflow model flavor.
``data`` A data frame to perform scoring.
``...`` Optional additional arguments passed to underlying predict methods.
========= ===================================================================
+-----------+---------------------------------------------------------+
| Argument | Description |
+===========+=========================================================+
| ``model`` | The loaded MLflow model flavor. |
+-----------+---------------------------------------------------------+
| ``data`` | A data frame to perform scoring. |
+-----------+---------------------------------------------------------+
| ``...`` | Optional additional arguments passed to underlying |
| | predict methods. |
+-----------+---------------------------------------------------------+
Comment on lines +1312 to +1321
Member Author

I updated pandoc from 2.7.2 to 2.16.2, which changed how some tables in this file are rendered.


``mlflow_register_external_observer``
=====================================
@@ -1726,15 +1729,21 @@ model types.
Arguments
---------

============== ==================================================================
Argument Description
============== ==================================================================
``model`` The model that will perform a prediction.
``path`` Destination path where this MLflow compatible model will be saved.
``model_spec`` MLflow model config this model flavor is being added to.
``...`` Optional additional arguments.
``conda_env`` Path to Conda dependencies file.
============== ==================================================================
+----------------+----------------------------------------------------+
| Argument | Description |
+================+====================================================+
| ``model`` | The model that will perform a prediction. |
+----------------+----------------------------------------------------+
| ``path`` | Destination path where this MLflow compatible |
| | model will be saved. |
+----------------+----------------------------------------------------+
| ``model_spec`` | MLflow model config this model flavor is being |
| | added to. |
+----------------+----------------------------------------------------+
| ``...`` | Optional additional arguments. |
+----------------+----------------------------------------------------+
| ``conda_env`` | Path to Conda dependencies file. |
+----------------+----------------------------------------------------+

``mlflow_search_runs``
======================
@@ -1836,42 +1845,6 @@ Arguments
| | the path of all static paths. |
+-------------------------------+--------------------------------------+

``mlflow_set_experiment``
=========================

Set Experiment

Sets an experiment as the active experiment. Either the name or ID of
the experiment can be provided. If a name is provided but the
experiment does not exist, this function creates an experiment with
provided name. Returns the ID of the active experiment.

.. code:: r

   mlflow_set_experiment(
     experiment_name = NULL,
     experiment_id = NULL,
     artifact_location = NULL
   )

.. _arguments-43:

Arguments
---------

+-------------------------------+--------------------------------------+
| Argument | Description |
+===============================+======================================+
| ``experiment_name`` | Name of experiment to be activated. |
+-------------------------------+--------------------------------------+
| ``experiment_id`` | ID of experiment to be activated. |
+-------------------------------+--------------------------------------+
| ``artifact_location`` | Location where all artifacts for |
| | this experiment are stored. If not |
| | provided, the remote server will |
| | select an appropriate default. |
+-------------------------------+--------------------------------------+

``mlflow_set_experiment_tag``
=============================

@@ -1884,7 +1857,7 @@ metadata that can be updated.

mlflow_set_experiment_tag(key, value, experiment_id = NULL, client = NULL)

.. _arguments-44:
.. _arguments-43:

Arguments
---------
@@ -1916,6 +1889,42 @@ Arguments
| | the current tracking URI. |
+-------------------------------+--------------------------------------+

``mlflow_set_experiment``
=========================

Set Experiment

Sets an experiment as the active experiment. Either the name or ID of
the experiment can be provided. If a name is provided but the
experiment does not exist, this function creates an experiment with
provided name. Returns the ID of the active experiment.

.. code:: r

   mlflow_set_experiment(
     experiment_name = NULL,
     experiment_id = NULL,
     artifact_location = NULL
   )

.. _arguments-44:

Arguments
---------

+-------------------------------+--------------------------------------+
| Argument | Description |
+===============================+======================================+
| ``experiment_name`` | Name of experiment to be activated. |
+-------------------------------+--------------------------------------+
| ``experiment_id`` | ID of experiment to be activated. |
+-------------------------------+--------------------------------------+
| ``artifact_location`` | Location where all artifacts for |
| | this experiment are stored. If not |
| | provided, the remote server will |
| | select an appropriate default. |
+-------------------------------+--------------------------------------+

``mlflow_set_tag``
==================

4 changes: 3 additions & 1 deletion mlflow/R/mlflow/.Rbuildignore
@@ -20,6 +20,8 @@ Reference_Manual_mlflow.md
^depends\.Rds
^R-version
^\.utils\.R$
^\.install-deps\.R$
^\.build-package\.R$
^\.build-doc\.R$
^build-package\.sh$
^Dockerfile\.build$
^Dockerfile\.dev$
14 changes: 14 additions & 0 deletions mlflow/R/mlflow/.build-doc.R
@@ -0,0 +1,14 @@
# Install Rd2md from source as a temporary fix for the rendering of code examples, until
# a release is published including the fixes in https://github.com/quantsch/Rd2md/issues/1
# Note that this commit is equivalent to commit 6b48255 of Rd2md master
# (https://github.com/quantsch/Rd2md/tree/6b4825579a2df8a22898316d93729384f92a756b)
# with a single extra commit to fix rendering of \link tags between methods in R documentation.
devtools::install_git("https://github.com/smurching/Rd2md", ref = "mlflow-patches")
Collaborator

Shall we use a commit hash instead of a branch, which may change?

Member Author

Good point, but we usually don't update this branch.

install.packages("rmarkdown", repos = "https://cloud.r-project.org")
unlink("man", recursive = TRUE)
roxygen2::roxygenise()
# remove mlflow-package doc temporarily because no rst doc should be generated for it.
file.remove("man/mlflow-package.Rd")
source("document.R", echo = TRUE)
# roxygenize again to make sure the previously removed mlflow-package doc is available as R helpfile
roxygen2::roxygenise()
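
On the branch-versus-commit question raised in the review thread above, a build script could refuse a moving ref before calling `devtools::install_git`. The sketch below is an assumption and not part of this PR: `is_commit_sha` is a hypothetical helper. The 40-character hash is the upstream Rd2md commit quoted in the code comment, used here only as an example of what a full hash looks like; it is not claimed to be the head of the `mlflow-patches` branch.

```shell
# Hypothetical helper (not in this PR): succeed only if the argument is a
# full 40-character lowercase hex Git commit hash.
is_commit_sha() {
  printf '%s\n' "$1" | grep -Eq '^[0-9a-f]{40}$'
}

# Compare the branch name used in .build-doc.R with a full commit hash.
for ref in "mlflow-patches" "6b4825579a2df8a22898316d93729384f92a756b"; do
  if is_commit_sha "$ref"; then
    echo "$ref: pinned commit"
  else
    echo "$ref: branch or tag (may move)"
  fi
done
```

To find the commit a branch currently points at, `git ls-remote https://github.com/smurching/Rd2md refs/heads/mlflow-patches` prints the hash (network access required). That hash could then be passed as `ref` in place of the branch name.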
5 changes: 0 additions & 5 deletions mlflow/R/mlflow/.build-package.R
@@ -1,10 +1,5 @@
source(".utils.R")

# Increase the timeout length for `utils::download.file` because the default value (60 seconds)
# could be too short to download large packages such as h2o.
options(timeout=300)
# Install dependencies required for the submission check.
devtools::install_deps(".", dependencies = TRUE)
# Bundle up the package into a .tar.gz file. This file will be submitted to CRAN.
package_path <- devtools::build(".", path = ".")
# Run the submission check against the built package.
5 changes: 5 additions & 0 deletions mlflow/R/mlflow/.install-deps.R
@@ -0,0 +1,5 @@
# Increase the timeout length for `utils::download.file` because the default value (60 seconds)
# could be too short to download large packages such as h2o.
options(timeout=300)
install.packages("devtools", dependencies = TRUE)
devtools::install_dev_deps(dependencies = TRUE)
5 changes: 0 additions & 5 deletions mlflow/R/mlflow/Dockerfile.build

This file was deleted.

13 changes: 13 additions & 0 deletions mlflow/R/mlflow/Dockerfile.dev
@@ -0,0 +1,13 @@
FROM rocker/r-ver:4.1.2

WORKDIR /mlflow/mlflow/R/mlflow
RUN apt-get update -y
RUN apt-get install git wget libxml2-dev libgit2-dev -y
# pandoc installed by `apt-get` is too old and contains a bug.
RUN TEMP_DEB=$(mktemp) && \
    wget --directory-prefix $TEMP_DEB https://github.com/jgm/pandoc/releases/download/2.16.2/pandoc-2.16.2-1-amd64.deb && \
    dpkg --install $(find $TEMP_DEB -name '*.deb') && \
Collaborator

Can we install pandoc with `RUN apt-get install pandoc -y`?

Member Author

The version installed by apt-get is 2.5, which is too old and contains a bug.

Member Author

I added a comment to the Dockerfile explaining this.

    rm -rf $TEMP_DEB
COPY DESCRIPTION .
COPY .install-deps.R .
RUN Rscript -e 'source(".install-deps.R", echo = TRUE)'
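
The review thread above turns on which pandoc version ends up in the image: `apt-get` provides 2.5, while this Dockerfile installs 2.16.2. A build script could fail fast on an old pandoc with a check like the one below. This is a sketch under stated assumptions, not part of this PR: `version_ge` is a hypothetical helper, and the thread does not say which release first fixes the bug, so the 2.16.2 threshold is simply the version the Dockerfile installs.

```shell
# Hypothetical helper (not in this PR): succeed when dotted version $1 is
# greater than or equal to $2. Relies on `sort -V` (GNU coreutils).
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Assumed threshold: the version this Dockerfile installs.
required="2.16.2"

if command -v pandoc >/dev/null 2>&1; then
  # The first line of `pandoc --version` has the form "pandoc <version>".
  installed="$(pandoc --version | head -n 1 | awk '{print $2}')"
  if version_ge "$installed" "$required"; then
    echo "pandoc $installed is new enough"
  else
    echo "pandoc $installed is older than $required"
  fi
else
  echo "pandoc not found"
fi
```

A plain string comparison would rank 2.5 above 2.16.2, which is why the sketch uses a version-aware sort.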
5 changes: 3 additions & 2 deletions mlflow/R/mlflow/build-package.sh
@@ -1,5 +1,6 @@
#!/usr/bin/env bash
set -ex

docker build -f Dockerfile.build -t r-build-package .
docker run --rm --workdir /app -v $(pwd):/app r-build-package Rscript -e 'source(".build-package.R", echo = TRUE)'
image_name="mlflow-r-dev"
docker build -f Dockerfile.dev -t $image_name .
docker run --rm -v $(pwd):/mlflow/mlflow/R/mlflow $image_name Rscript -e 'source(".build-package.R", echo = TRUE)'
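
After this change, `build-package.sh` and `docs/build-rdoc.sh` both build `Dockerfile.dev` under the same image name and then source an R script in the container. If the duplication ever becomes a maintenance concern, the shared part could be factored out roughly as follows. This is a hypothetical sketch, not part of this PR: `run_in_dev_image` and the `DOCKER` override are assumed names introduced for illustration.

```shell
# Hypothetical helper (not in this PR). DOCKER can be overridden, for
# example with `echo`, to print the commands instead of running them.
DOCKER="${DOCKER:-docker}"
image_name="mlflow-r-dev"

run_in_dev_image() {
  # $1: R script to source inside the container.
  # Remaining arguments: extra `docker run` options, such as more -v mounts.
  script="$1"
  shift
  "$DOCKER" build -f Dockerfile.dev -t "$image_name" . &&
    "$DOCKER" run --rm \
      -v "$(pwd):/mlflow/mlflow/R/mlflow" \
      "$@" \
      "$image_name" \
      Rscript -e "source(\"$script\", echo = TRUE)"
}

# Dry run: print the commands that build-package.sh would execute.
DOCKER=echo run_in_dev_image .build-package.R
```

The documentation build would pass its extra mount as a trailing option, for example `run_in_dev_image .build-doc.R -v "$(pwd)/../../../docs/source:/mlflow/docs/source"`. Quoting the `$(pwd)` expansions also keeps the mounts intact when the checkout path contains spaces.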