This repository has been archived by the owner on Nov 5, 2019. It is now read-only.

Project for visualizing the status of digital data archiving efforts across various data repositories

datatogether/coverage

Coverage

Coverage is a project for visualizing the status of digital data archiving efforts across various data repositories run by different initiatives. Its current scope covers data within the epa.gov domain.

This code repo provides the JSON back-end: https://api.archivers.co/coverage

The datatogether/webapp repo provides the visual front-end: https://archivers.co/coverage

License & Copyright

Copyright (C) 2017 Data Together

This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, version 3.0.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

See the LICENSE file for details.

Current Data Repositories

Actual source datasets can be found in each /repositories/* directory. It currently includes the following:

Requests for new data repositories are tracked under the data-repository issue label.

How It Works

Coverage takes a list of URLs and associated archiving information, and turns that into a tree of URL paths with associated coverage information.

The output is cached in cache.json. Because this is a large file, the web server provides incremental pieces of the cached tree. To calculate coverage completion dynamically, you can work with the cache.json file directly.
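
The URL-list-to-tree step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual implementation: the `Node` type, its fields, and the `Add` and `Coverage` methods are hypothetical names, and the sample URLs are invented. Each URL is split into its host and path segments, and every node along that path counts how many URLs sit at or below it and how many of those are archived.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// Node is a hypothetical tree node: one host or path segment plus the
// coverage counts for every URL at or below it.
type Node struct {
	Name     string
	Total    int // URLs at or below this node
	Archived int // how many of those URLs are archived
	Children map[string]*Node
}

func newNode(name string) *Node {
	return &Node{Name: name, Children: map[string]*Node{}}
}

// count records one URL on this node.
func (n *Node) count(archived bool) {
	n.Total++
	if archived {
		n.Archived++
	}
}

// Add inserts one URL into the tree, updating counts on every node
// along its path (root, host, then each path segment).
func (n *Node) Add(rawURL string, archived bool) error {
	u, err := url.Parse(rawURL)
	if err != nil {
		return err
	}
	if u.Host == "" {
		return fmt.Errorf("url %q has no host", rawURL)
	}
	isSlash := func(r rune) bool { return r == '/' }
	segments := append([]string{u.Host}, strings.FieldsFunc(u.Path, isSlash)...)

	cur := n
	cur.count(archived)
	for _, seg := range segments {
		child, ok := cur.Children[seg]
		if !ok {
			child = newNode(seg)
			cur.Children[seg] = child
		}
		child.count(archived)
		cur = child
	}
	return nil
}

// Coverage returns the archived fraction for this subtree (0 when empty).
func (n *Node) Coverage() float64 {
	if n.Total == 0 {
		return 0
	}
	return float64(n.Archived) / float64(n.Total)
}

func main() {
	// Invented sample input: URL plus whether it has been archived.
	samples := []struct {
		url      string
		archived bool
	}{
		{"https://www.epa.gov/air/data.csv", true},
		{"https://www.epa.gov/air/report.pdf", false},
		{"https://www.epa.gov/water/index.html", true},
	}

	root := newNode("root")
	for _, s := range samples {
		if err := root.Add(s.url, s.archived); err != nil {
			fmt.Println("skipping:", err)
		}
	}

	air := root.Children["www.epa.gov"].Children["air"]
	fmt.Printf("%s: %d of %d archived\n", air.Name, air.Archived, air.Total)
}
```

Keeping the counts on every node means a subtree's coverage can be read off without walking its descendants, which fits the idea of serving the tree in incremental pieces.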

Routes

  • /healthcheck - server status
  • /repositories - list all data repositories
  • /repositories/:repository_uuid - get details for a single data repository
  • /fulltree - get full coverage tree of url-based resources
  • /tree - get scope-able coverage tree
  • /coverage - get coverage summary (not currently used)

Getting Involved

We would love involvement from more people! If you notice any errors or would like to submit changes, please see our Contributing Guidelines.

We use GitHub issues for tracking bugs and feature requests, and pull requests (PRs) for submitting changes.

Installation

Running this project can be done either directly on your workstation system, or in a "container" via Docker.

For people comfortable with Docker, or who are excited to learn about it, it can be the best way to get going.

Docker Install

Running this project via Docker requires:

Running the project in a Docker container should be as simple as:

make setup
make run

If you get an error about a port "address already in use", you can change the PORT environment variable in your local .env file.

Barring any changes, you may now visit a JSON endpoint at: http://localhost:8080/repositories

Local System Install

Running this project directly on your system requires:

  • Go 1.7+
  • PostgreSQL

(Setting up these services is outside the scope of this README.)

cd path/to/coverage
createdb datatogether_coverage
# Fetch dependencies and install the binary into $GOPATH/bin
go get ./
# Confirm the project compiles now that dependencies are present
go build
# Set a free port on which to serve JSON
export PORT=8080
# Your PostgreSQL instance may be running on a different port
export POSTGRES_DB_URL=postgres://localhost:5432/datatogether_coverage
$GOPATH/bin/coverage

Barring any changes, you may now visit a JSON endpoint at: http://localhost:8080/repositories

Development

Please follow the install instructions above! Inclusion of tests is appreciated!

For a list of all available helper commands, just type make.
