MuckRock/documentcloud

DocumentCloud

DocumentCloud · Squarelet · MuckRock · DocumentCloud-Frontend

Analyze, Annotate, Publish. Turn documents into data.

Prerequisites

You must first have these set up and ready to go:

  • Squarelet. DocumentCloud depends on Squarelet for user authentication. Because the two services communicate directly, the DocumentCloud development environment depends on Squarelet's: the DocumentCloud Docker containers join Squarelet's Docker network. Install Squarelet and set up its development environment first.
  • DocumentCloud frontend

Note: the frontend will not be functional until you complete the installation described below.

Install

Software required

  1. docker
  2. python
  3. invoke
  4. git
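
Before continuing, it can help to confirm that the tools listed above are actually on your PATH. The following is a minimal sketch, not part of the DocumentCloud repository: the helper name missing_tools is hypothetical, and the executable names are assumptions (for example, some systems expose Python only as python3, in which case the list needs adjusting).

```python
import shutil

# Executable names taken from the "Software required" list. "invoke" is the
# command installed by the invoke package, which also installs the "inv" alias.
REQUIRED_TOOLS = ["docker", "python", "invoke", "git"]


def missing_tools(tools, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools(REQUIRED_TOOLS)
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All required tools were found on PATH.")
```

The which parameter exists only so the lookup can be swapped out when testing; in normal use the default shutil.which is enough.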

Installation of DocumentCloud and its Authentication System

  1. Install the software above, along with Git Large File Storage (Git LFS) support, using these instructions.
    • Ensure Docker has at least 11 gigabytes of additional hard disk space allocated for these purposes.
    • Ensure your Docker host application has at least 7 GB of memory allocated; 10 GB is preferred.
    • These instructions create 3 distinct docker compose sessions, with the Squarelet session hosting the shared central network.
  2. Check out the git repository - git clone git@github.com:MuckRock/documentcloud.git
  3. Enter the directory - cd documentcloud
  4. Run the dotenv initialization script, python initialize_dotenvs.py, to create the files with the environment variables needed to run the development environment.
  5. Set api.dev.documentcloud.org and minio.documentcloud.org to point to localhost - echo "127.0.0.1 api.dev.documentcloud.org minio.documentcloud.org" | sudo tee -a /etc/hosts
  6. Run export COMPOSE_FILE=local.yml in each command-line session where you will run docker compose, so that docker compose finds the configuration.
  7. Run docker compose up.
  8. Enter api.dev.documentcloud.org/ into your browser - you should see the Django API root page. Note that api is before dev in this service URL.
  9. In .envs/.local/.django, set the following environment variables:
    • SQUARELET_KEY: the value of Client ID from the Squarelet Client
    • SQUARELET_SECRET: the value of Client SECRET from the Squarelet Client
    • JWT_VERIFYING_KEY: open the Squarelet Django shell with inv shell and copy the value of settings.SIMPLE_JWT['VERIFYING_KEY'] (remove the leading b' and the trailing ', and leave the \n sequences as-is)
  10. Restart the Docker Compose session (docker compose down followed by docker compose up) each time you change a .django file; the change does not take effect until you do.
  11. Log in as the Squarelet superuser on the locally running DocumentCloud-Frontend that you installed earlier, at https://dev.documentcloud.org
    • Setting the SQUARELET_WHITELIST_VERIFIED_JOURNALISTS=True environment variable restricts DocumentCloud login to verified journalists.
    • Use the Squarelet admin Organization page to mark your organization as a verified journalist, which allows uploads to DocumentCloud.
    • Make your Squarelet superuser a superuser on the DocumentCloud Django site as well: run inv shell in the DocumentCloud folder and enter these commands (no indent):
      # Look up the active user model in case the shell has not imported it
      from django.contrib.auth import get_user_model
      User = get_user_model()
      # Take the first user in the database and grant admin rights
      tempUser = User.objects.all()[0]
      tempUser.is_superuser = True
      tempUser.is_staff = True
      tempUser.save()
  12. Go to the Django admin for DocumentCloud and add the required static flat page with the URL /tipofday/. Its content can be blank. Do not prefix the URL with /pages/. Setting the Site to example.com is fine.
  13. Create an initial Minio bucket to simulate AWS S3 locally:
    • Look up MINIO_URL, MINIO_ACCESS_KEY, and MINIO_SECRET_KEY in your DocumentCloud .django file.
    • Visit the MINIO_URL in a browser, likely at this address, and log in with the MINIO_ACCESS_KEY and MINIO_SECRET_KEY.
    • In the bottom right corner, click the round plus button, then click the first circle that appears above it to "create bucket".
    • Create a bucket called documents.
  14. Upload a document:
    • Check that Docker's memory allocation is at least 7 GB. Containers failing at random or heavy system swapping, especially while uploading documents, are signs that too little memory is allocated.
    • The "upload" button should not be grayed out. If it is, check your organization's Verified Journalist status as described above.
    • If you get an error on your console about signatures, revisit the Minio bucket setup above.
    • If you get an error on your console about tipofday not being found, add the static page as described above.
  15. Develop DocumentCloud and its frontend!
  16. You can run the tests with inv test.
    • To run a subset of the tests, pass the directory containing them with the --path switch, like so: inv test --path documentcloud/documents.
    • You can also pass a single file to --path to run only the tests in that file.
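
The JWT_VERIFYING_KEY step above asks you to copy the key without the leading b' and trailing ', keeping the \n sequences. A minimal sketch of that conversion follows, assuming the setting is a bytes object holding ASCII PEM text (as the b' prefix suggests). The helper name env_value_from_key is hypothetical and is not part of Squarelet or DocumentCloud, and the sample bytes are a placeholder rather than a real key.

```python
def env_value_from_key(key: bytes) -> str:
    """Turn PEM text held as bytes into a single-line string.

    Real newlines become the two characters backslash + n, which for plain
    ASCII PEM text equals the printed value without the leading b' and the
    trailing '.
    """
    return key.decode("ascii").replace("\n", "\\n")


if __name__ == "__main__":
    # Placeholder bytes for illustration only, not a real key.
    sample = b"-----BEGIN PUBLIC KEY-----\nABC123\n-----END PUBLIC KEY-----\n"
    print(env_value_from_key(sample))
```

In the Squarelet shell, this would presumably be called as env_value_from_key(settings.SIMPLE_JWT['VERIFYING_KEY']), with the printed line pasted after JWT_VERIFYING_KEY= in the .django file.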
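
If the API root page does not load in the browser, one thing to rule out is a missing hosts entry. The sketch below checks whether the two hostnames from the /etc/hosts step are mapped to 127.0.0.1. The function name unmapped_hosts is hypothetical, and the script assumes a Unix-like system where the hosts file lives at /etc/hosts.

```python
from pathlib import Path

# Hostnames from the /etc/hosts step above.
REQUIRED_HOSTS = ["api.dev.documentcloud.org", "minio.documentcloud.org"]


def unmapped_hosts(hosts_text, required=REQUIRED_HOSTS, address="127.0.0.1"):
    """Return the required hostnames that no line maps to the given address."""
    mapped = set()
    for line in hosts_text.splitlines():
        # Drop any trailing comment, then split into address and hostnames.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == address:
            mapped.update(fields[1:])
    return [name for name in required if name not in mapped]


if __name__ == "__main__":
    hosts_file = Path("/etc/hosts")  # assumed location on Unix-like systems
    if hosts_file.exists():
        missing = unmapped_hosts(hosts_file.read_text())
        if missing:
            print("Not mapped to 127.0.0.1:", ", ".join(missing))
        else:
            print("Both hostnames point to 127.0.0.1.")
```

The check only reads the file; adding the entry still requires the sudo tee command from the installation steps.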

About

DocumentCloud's back end source code - Please report bugs, issues and feature requests to info@documentcloud.org
