Contributing to Mosaic

Overview

We welcome contributions to Mosaic. We use GitHub Issues to track community-reported issues and GitHub Pull Requests to accept changes.

Repository structure

The repository is structured as follows:

  • pom.xml: Mosaic project definition and dependencies
  • src/: Scala source code and tests for Mosaic
  • python/: Source code for Python bindings
  • docs/: Source code for documentation
  • .github/workflows: CI definitions for GitHub Actions

Test & build Mosaic

Scala JAR

We use the Maven build tool to manage and build the Mosaic Scala project.

The Mosaic JAR, including all dependencies, can be generated by running: mvn clean package. By default, this also runs the tests in src/test/.

The packaged JAR should be available in target/.

Python bindings

The Python bindings can be tested using unittest.

  • Build the Scala project and copy the packaged JAR to the python/mosaic/lib/ directory.
  • Move to the python/ directory and install the project and its dependencies: pip install . && pip install pyspark==<project_spark_version> (where 'project_spark_version' corresponds to the version of Spark used for the target Databricks Runtime, e.g. 3.2.1).
  • Run the tests using unittest: python -m unittest
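
The JAR copy in the first step above can be scripted. The following is a minimal sketch and not part of the Mosaic repository: the helper name copy_packaged_jar and the *.jar glob pattern are assumptions (the exact artifact name depends on the project version and build configuration), and only the target/ and python/mosaic/lib/ paths come from this guide.

```python
import shutil
from pathlib import Path


def copy_packaged_jar(repo_root):
    """Copy JARs from <repo_root>/target/ into <repo_root>/python/mosaic/lib/.

    Hypothetical helper: it copies every *.jar found directly under target/,
    because the exact artifact name is not stated in this guide.
    Returns the list of destination paths.
    """
    root = Path(repo_root)
    target_dir = root / "target"
    lib_dir = root / "python" / "mosaic" / "lib"

    jars = sorted(target_dir.glob("*.jar"))
    if not jars:
        raise FileNotFoundError(
            f"No JAR found in {target_dir}; run 'mvn clean package' first."
        )

    # Create the destination directory if it does not exist yet.
    lib_dir.mkdir(parents=True, exist_ok=True)

    copied = []
    for jar in jars:
        destination = lib_dir / jar.name
        shutil.copy2(jar, destination)
        copied.append(destination)
    return copied
```

Calling copy_packaged_jar(".") from the repository root after mvn clean package would place the JAR where the Python tests expect it. If the build produces more than one JAR in target/, narrow the glob pattern to the artifact you need.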

The project wheel file can be built with the build package.

  • Install the build requirements: pip install build wheel.
  • Build the wheel using python -m build.
  • Collect the .whl file from python/dist/.
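
The last two steps above can also be driven from Python. This is a sketch under stated assumptions rather than a project script: build_wheel and its run parameter are hypothetical names, and it assumes the build package is already installed in the current interpreter's environment.

```python
import subprocess
import sys
from pathlib import Path


def build_wheel(repo_root, run=subprocess.run):
    """Run `python -m build` in <repo_root>/python/ and list the wheels in dist/.

    Hypothetical helper. `run` defaults to subprocess.run and is injectable
    so the command can be inspected without actually executing a build.
    """
    python_dir = Path(repo_root) / "python"

    # Same command as in the steps above, using the current interpreter.
    run([sys.executable, "-m", "build"], cwd=python_dir, check=True)

    # Collect whatever wheel files the build left in python/dist/.
    return sorted((python_dir / "dist").glob("*.whl"))
```

With check=True, a failed build raises subprocess.CalledProcessError instead of silently returning an empty list.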

Documentation

The documentation is built using Sphinx.

To build the docs:

  • Install the pandoc library (follow the pandoc installation instructions for your platform).
  • Install the Python requirements from docs/docs-requirements.txt.
  • Build the HTML documentation by running make html from docs/.
  • You can host the docs locally by running the reload.py script in the docs/source/ directory.

Style

Tools we use for code formatting and checking:

  • scalafmt and scalastyle in the main Scala project.
  • black and isort for the Python bindings.