BeFAIR (Be Findable, Accessible, Interoperable, Reusable) Open Science Framework

CoronaWhy/befair

BeFAIR

BeFAIR is a common distributed research infrastructure in which users can add and run tools and components themselves, using Debian's way of managing services. Each selected service should be available on its own subdomain and can be integrated with Dataverse, the BeFAIR data repository.

BeFAIR is designed as an out-of-the-box distributed networked infrastructure that any research community can install with a single command, much like a regular operating system. The roadmap includes releases containing open data for different sciences; the COVID-19 Data Hub is the current priority.

Acknowledgements

The BeFAIR infrastructure stands on the shoulders of giants. The table below acknowledges resources and contributions from finished and ongoing projects.

| Region | Project | Funding information | Component |
|---|---|---|---|
| European Union | CESSDA SaW | H2020-INFRADEV-1-2015-1, grant agreement #674939 | Dataverse as a service |
| European Union | SSHOC | H2020-INFRAEOSC-04-2018, grant agreement #823782 | Cloud Dataverse |
| European Union | EOSC Synergy | INFRAEOSC-05-2018-2019, grant agreement No 857647 | SQAaaS service |
| European Union | EOSC-hub | H2020-EINFRA-12-2017, grant agreement #777536 | DataTags as a service |
| United States | INDRA | Defense Advanced Research Projects Agency under award W911NF-14-1-0397 | INDRA service |
| European Union | FAIRsFAIR | H2020-INFRAEOSC-2018-2020 Grant agreement 831558 | F-UJI and FAIR Data Points |
| Netherlands | CLARIAH | NWO grant number: 184.033.101 | CLARIAH as a service |
| Finland | SKOSMOS | National Library of Finland | SKOSMOS as a service |

Available and planned services

Available basic infrastructure components:

  • traefik
  • postgresql
  • SOLR

The list of services integrated in BeFAIR:

To do (we're accepting pull requests, so please join the project if you want to contribute):

  • Elasticsearch
  • SPARQL endpoint (Virtuoso as a service)
  • grlc (converts SPARQL queries into RESTful APIs)
  • Doccano
  • Jupyter
  • OCR Tesseract (OCR as a service)
  • Kibana

BeFAIR uses Traefik as its load balancer and proxy service. Define traefikhost in the configuration of your distro (see the distros/ folder) to start the enabled services.

To enable a service, for example INDRA, run the following from ./distros/your_distro_name, where your_distro_name should correspond to your project name or domain (the default is localhost):

ln -s ../../services-available/indra.yaml indra.yaml
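The symlink step above can be wrapped in two small helper functions. This is only a sketch: the function names enable_service and disable_service are hypothetical and not part of BeFAIR, and disabling a service by removing its symlink is an assumption based on the Debian-style layout, not something the project documents. Both functions are meant to be run from inside ./distros/your_distro_name.

```shell
# Hypothetical helpers, not part of BeFAIR. Run them from ./distros/your_distro_name.

# Enable a service by linking its definition from services-available/.
enable_service() {
    name="$1"
    target="../../services-available/${name}.yaml"
    # Refuse to create a dangling link when the service definition is missing.
    if [ ! -f "$target" ]; then
        echo "no such service: $name" >&2
        return 1
    fi
    ln -sf "$target" "${name}.yaml"
}

# Disable a service by removing its symlink (assumed convention).
disable_service() {
    name="$1"
    # Only remove symlinks, never a regular file that happens to share the name.
    if [ -L "${name}.yaml" ]; then
        rm "${name}.yaml"
    fi
}
```

With these loaded in the current shell, enable_service indra has the same effect as the ln -s command above, and disable_service indra removes the link again.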

For example, if you set the following domain (labs.coronawhy.org) in the .env file

traefikhost=labs.coronawhy.org

the services will be available on airflow.labs.coronawhy.org, superset.labs.coronawhy.org and so on.
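To preview which hostnames to expect, the naming pattern in the example above can be sketched as a small script. The helper names read_traefikhost and service_hosts are hypothetical, and the sketch assumes that every service is published as service-name.traefikhost, which is what the airflow and superset examples suggest; individual services may define different routing rules.

```shell
# Hypothetical helpers, not part of BeFAIR.

# Print the value of traefikhost from a .env file (assumed format: traefikhost=<domain>).
read_traefikhost() {
    sed -n 's/^traefikhost=//p' "$1" | head -n 1
}

# Print "<service>.<traefikhost>" for each service name given after the .env path.
# Assumption: each service is exposed under a subdomain equal to its name.
service_hosts() {
    env_file="$1"
    shift
    host="$(read_traefikhost "$env_file")"
    for service in "$@"; do
        printf '%s.%s\n' "$service" "$host"
    done
}
```

For instance, service_hosts .env airflow superset prints one hostname per line for the two services named in the example.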

Installation and deployment

You need Docker and Docker Compose before you'll be able to run BeFAIR:

sudo apt install make unzip docker-compose

Add the current user to the 'docker' group:

sudo adduser $USER docker

Start a new shell with the 'docker' group applied:

newgrp docker
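Before going further, it can help to confirm that the required tools are actually on the PATH. The following is a minimal sketch; missing_tools is a hypothetical helper, and the list of tools to check (make, unzip, docker, docker-compose) is inferred from the install command above.

```shell
# Hypothetical helper, not part of BeFAIR.
# Print the name of every given command that cannot be found on the PATH.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}
```

Calling missing_tools make unzip docker docker-compose prints nothing when everything is installed, and otherwise lists the commands that still need to be installed.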

If you see the message: "ERROR: Network traefik declared as external, but could not be found", please create the network manually using docker network create traefik and try again.
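The manual fix for the missing network can be made safe to repeat by creating the network only when it does not exist yet. This is a sketch, not part of the BeFAIR tooling; ensure_network is a hypothetical function name, and it relies only on the standard docker network inspect and docker network create subcommands.

```shell
# Hypothetical helper, not part of BeFAIR.
# Create the named Docker network only if "docker network inspect" cannot find it.
ensure_network() {
    docker network inspect "$1" >/dev/null 2>&1 || docker network create "$1"
}
```

Running ensure_network traefik before make up avoids the error on a fresh machine and does nothing when the network is already present.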

After Docker is installed you can check the consistency of all BeFAIR distros:

git clone https://github.com/CoronaWhy/befair
cd befair
make check-all

You can find all available installations in the distros/ folder. Different distros are suitable for different research communities.

Choose a distro, for example "fair", and start or stop all of its services with the following commands:

cd distros/fair
make up
make down

Menuconfig

BeFAIR has a tool for managing all services, located in bin/befair. It lets you enable and disable both services and distros in a convenient way.

Citation for the academic use

Please cite this work as follows:

Tykhonov V., Polishko A., Kiulian A., Komar M. (2020). CoronaWhy: Building a Distributed, Credible and Scalable Research and Data Infrastructure for Open Science. Zenodo. http://doi.org/10.5281/zenodo.3922257

License

The content of this project itself is licensed under the Creative Commons Attribution 3.0 Unported license, and the underlying source code used to format and display that content is licensed under the MIT license.