Skip to content

Repository for the joint CDC+USDS PRIME project's Data Ingestion prototype project. The goal is to work with raw vaccine, case, and lab report data to arrive at an analysis of breakthrough cases in the state of Virginia that uses a data lake as underlying storage.

License

Notifications You must be signed in to change notification settings

CDCgov/prime-public-health-data-infrastructure

Repository files navigation

PRIME Public Health Data Infrastructure

General disclaimer This repository was created for use by CDC programs to collaborate on public health related projects in support of the CDC mission. GitHub is not hosted by the CDC, but is a third party website used by CDC and its partners to share information and collaborate on software. CDC use of GitHub does not imply an endorsement of any one particular service, product, or enterprise.

Overview

The PRIME Public Health Data Infrastructure projects are part of the Pandemic-Ready Interoperability Modernization Effort, a multi-year collaboration between CDC and the U.S. Digital Service (USDS) to strengthen data quality and information technology systems in state and local health departments.

This repository represents the source code for three related workstreams in this area:

  • Data Storage, Tooling, and Preparation (DSTP) - Assist STLTs in implementing modern infrastructure that is flexible enough to allow jurisdictions to prepare for as-of-yet unknown use cases, while providing a long-term storage mechanism and query solution for effective daily use and timely response.
  • Common Data Model - Define USCDI+ for Public Health and implementation guidance that specifies common data elements, value sets, and APIs to enable interoperability, data cleaning, and data linkage across the ecosystem of public health data senders and receivers
  • Workbench - Develop a platform for high quality tools, services, analytical products, and reporting that addresses identified pain points for data receivers/public health departments, and incentivizes adoption of the Common Data Model

All three projects will begin with a time-limited prototype, focused on working with raw ELR, eCR, and VXU data from the commonwealth of Virginia Department of Health, for the purpose of experimenting with a unified data lake paradigm. Over time they will evolve to tackle new STLTs and new challenges around data quality and standardization.

The PRIME Public Health Data Infrastructure prototype a sibling project to PRIME ReportStream, focusing on delivering COVID-19 test data to public health departments, and PRIME SimpleReport, working on a better way to report COVID-19 rapid tests.

Problem Scope

Long-term Vision for all three workstreams: Public health systems to digest, analyze, and respond to data are siloed. Lacking access to actionable data, our national, as well as state, local, and territorial infrastructure, isn’t pandemic-ready. Our objective was to learn how the CDC can best support STLTs in moving towards a modern public health data infrastructure.

Vision for short-term prototype: The Commonwealth of Virginia Department of Public Health would like to move towards a more ‘holistic’ solution for electronic data processing and integration, so that currently siloed datasets can be processed and ultimately used in a more efficient and effective manner. Specifically, in order to support the vision of ‘holistic processing’, this prototype is focusing the on seeing what can be done with raw data, focused on COVID-19 breakthrough cases.

Target Users

Eventual target users of this system include:

  • Public Health Departments
    • Epidemiologists who rely on health data to take regular actions
    • Senior stakeholders who make executive decisions using aggregate health data
    • IT teams who have to support epidemiologists and external stakeholders integrating with the PHD
    • PHDs may include state, county, city, and tribal organizations

Technical

Getting Started

To start development against this project, please consult the Getting Started doc for more details on setting up local development environments, deployments, and more.

Decision Records

See A record of all decisions that have been made on the project and propose your own by going to the decisions directory and following instructions there.

Standard Notices

Public Domain Standard Notice

This repository constitutes a work of the United States Government and is not subject to domestic copyright protection under 17 USC § 105. This repository is in the public domain within the United States, and copyright and related rights in the work worldwide are waived through the CC0 1.0 Universal public domain dedication. All contributions to this repository will be released under the CC0 dedication. By submitting a pull request you are agreeing to comply with this waiver of copyright interest.

License Standard Notice

This project is in the public domain within the United States, and copyright and related rights in the work worldwide are waived through the CC0 1.0 Universal public domain dedication. All contributions to this project will be released under the CC0 dedication. By submitting a pull request or issue, you are agreeing to comply with this waiver of copyright interest and acknowledge that you have no expectation of payment, unless pursuant to an existing contract or agreement.

Privacy Standard Notice

This repository contains only non-sensitive, publicly available data and information. All material and community participation is covered by the Disclaimer and Code of Conduct. For more information about CDC's privacy policy, please visit http://www.cdc.gov/other/privacy.html.

Contributing Standard Notice

Anyone is encouraged to contribute to the repository by forking and submitting a pull request. (If you are new to GitHub, you might start with a basic tutorial.) By contributing to this project, you grant a world-wide, royalty-free, perpetual, irrevocable, non-exclusive, transferable license to all users under the terms of the Apache Software License v2 or later.

All comments, messages, pull requests, and other submissions received through CDC including this GitHub page may be subject to applicable federal law, including but not limited to the Federal Records Act, and may be archived. Learn more at http://www.cdc.gov/other/privacy.html.

See CONTRIBUTING.md for more information.

Records Management Standard Notice

This repository is not a source of government records, but is a copy to increase collaboration and collaborative potential. All government records will be published through the CDC web site.

Related documents

Additional Standard Notices

Please refer to CDC's Template Repository for more information about contributing to this repository, public domain notices and disclaimers, and code of conduct.

About

Repository for the joint CDC+USDS PRIME project's Data Ingestion prototype project. The goal is to work with raw vaccine, case, and lab report data to arrive at an analysis of breakthrough cases in the state of Virginia that uses a data lake as underlying storage.

Topics

Resources

License

Code of conduct

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published