From 8615756e4fb2ce879a083f262ff04968ad5a0241 Mon Sep 17 00:00:00 2001
From: ejm714
Date: Wed, 23 Mar 2022 00:00:45 +0000
Subject: [PATCH] Add updated files from build

---
 README.md             | 11 +++--------
 docs/docs/examples.md |  2 +-
 docs/docs/index.md    | 11 +++--------
 examples/ethics.html  |  2 +-
 examples/ethics.ipynb |  2 +-
 examples/ethics.md    |  2 +-
 examples/ethics.rst   |  2 +-
 examples/ethics.txt   |  2 +-
 8 files changed, 12 insertions(+), 22 deletions(-)

diff --git a/README.md b/README.md
index 391eeaa..5df2476 100644
--- a/README.md
+++ b/README.md
@@ -140,28 +140,23 @@ Usage: deon [OPTIONS]
 Easily create an ethics checklist for your data science project.
- The checklist will be printed to standard output by default. Use the
- --output option to write to a file instead.
+ The checklist will be printed to standard output by default. Use the --output
+ option to write to a file instead.
 Options:
 -l, --checklist PATH Override default checklist file with a path to a custom checklist.yml file.
-
 -f, --format TEXT Output format. Default is "markdown". Can be one of [ascii, html, jupyter, markdown, rmarkdown, rst]. Ignored and file extension used if --output is passed.
-
 -o, --output PATH Output file path. Extension can be one of [.txt, .html, .ipynb, .md, .rmd, .rst]. The checklist is appended if the file exists.
-
 -w, --overwrite Overwrite output file if it exists. Default is False, which will append to existing file.
-
 -m, --multicell For use with Jupyter format only. Write checklist with multiple cells, one item per cell. Default is False, which will write the checklist in a single cell.
-
 --help Show this message and exit.
 ```
@@ -175,7 +170,7 @@ Options:
 [![Deon badge](https://img.shields.io/badge/ethics%20checklist-deon-brightgreen.svg?style=popout-square)](http://deon.drivendata.org/)

 ## A. Data Collection
- - [ ] **A.1 Informed consent**: If there are human subjects, have they given informed consent, where subjects affirmatively opt-in and have a clear understanding of the data uses to which they consent?
+ - [ ] **A.1 TEST Informed consent**: If there are human subjects, have they given informed consent, where subjects affirmatively opt-in and have a clear understanding of the data uses to which they consent?
  - [ ] **A.2 Collection bias**: Have we considered sources of bias that could be introduced during data collection and survey design and taken steps to mitigate those?
  - [ ] **A.3 Limit PII exposure**: Have we considered ways to minimize exposure of personally identifiable information (PII) for example through anonymization or not collecting information that isn't relevant for analysis?
  - [ ] **A.4 Downstream bias mitigation**: Have we considered ways to enable testing downstream results for biased outcomes (e.g., collecting data on protected group status like race or gender)?
diff --git a/docs/docs/examples.md b/docs/docs/examples.md
index bf19c52..7ec3e85 100644
--- a/docs/docs/examples.md
+++ b/docs/docs/examples.md
@@ -7,7 +7,7 @@ To make the ideas contained in the checklist more concrete, we've compiled examp
 Checklist Question | Examples of Ethical Issues
 --- | ---
 | **Data Collection**
-**A.1 Informed consent**: If there are human subjects, have they given informed consent, where subjects affirmatively opt-in and have a clear understanding of the data uses to which they consent? |
+**A.1 TEST Informed consent**: If there are human subjects, have they given informed consent, where subjects affirmatively opt-in and have a clear understanding of the data uses to which they consent? |
 **A.2 Collection bias**: Have we considered sources of bias that could be introduced during data collection and survey design and taken steps to mitigate those? |
 **A.3 Limit PII exposure**: Have we considered ways to minimize exposure of personally identifiable information (PII) for example through anonymization or not collecting information that isn't relevant for analysis? |
 **A.4 Downstream bias mitigation**: Have we considered ways to enable testing downstream results for biased outcomes (e.g., collecting data on protected group status like race or gender)? |
diff --git a/docs/docs/index.md b/docs/docs/index.md
index b0a0baf..503defe 100644
--- a/docs/docs/index.md
+++ b/docs/docs/index.md
@@ -133,28 +133,23 @@ Usage: deon [OPTIONS]
 Easily create an ethics checklist for your data science project.
- The checklist will be printed to standard output by default. Use the
- --output option to write to a file instead.
+ The checklist will be printed to standard output by default. Use the --output
+ option to write to a file instead.
 Options:
 -l, --checklist PATH Override default checklist file with a path to a custom checklist.yml file.
-
 -f, --format TEXT Output format. Default is "markdown". Can be one of [ascii, html, jupyter, markdown, rmarkdown, rst]. Ignored and file extension used if --output is passed.
-
 -o, --output PATH Output file path. Extension can be one of [.txt, .html, .ipynb, .md, .rmd, .rst]. The checklist is appended if the file exists.
-
 -w, --overwrite Overwrite output file if it exists. Default is False, which will append to existing file.
-
 -m, --multicell For use with Jupyter format only. Write checklist with multiple cells, one item per cell. Default is False, which will write the checklist in a single cell.
-
 --help Show this message and exit.
 ```
@@ -168,7 +163,7 @@ Options:
 [![Deon badge](https://img.shields.io/badge/ethics%20checklist-deon-brightgreen.svg?style=popout-square)](http://deon.drivendata.org/)

 ## A. Data Collection
- - [ ] **A.1 Informed consent**: If there are human subjects, have they given informed consent, where subjects affirmatively opt-in and have a clear understanding of the data uses to which they consent?
+ - [ ] **A.1 TEST Informed consent**: If there are human subjects, have they given informed consent, where subjects affirmatively opt-in and have a clear understanding of the data uses to which they consent?
  - [ ] **A.2 Collection bias**: Have we considered sources of bias that could be introduced during data collection and survey design and taken steps to mitigate those?
  - [ ] **A.3 Limit PII exposure**: Have we considered ways to minimize exposure of personally identifiable information (PII) for example through anonymization or not collecting information that isn't relevant for analysis?
  - [ ] **A.4 Downstream bias mitigation**: Have we considered ways to enable testing downstream results for biased outcomes (e.g., collecting data on protected group status like race or gender)?
diff --git a/examples/ethics.html b/examples/ethics.html
index 2ed83bd..111198b 100644
--- a/examples/ethics.html
+++ b/examples/ethics.html
@@ -18,7 +18,7 @@

  •
- A.1 Informed consent:
+ A.1 TEST Informed consent:
 If there are human subjects, have they given informed consent, where subjects affirmatively opt-in and have a clear understanding of the data uses to which they consent?
  •
diff --git a/examples/ethics.ipynb b/examples/ethics.ipynb
index 2d5dc17..d4f9a2b 100644
--- a/examples/ethics.ipynb
+++ b/examples/ethics.ipynb
@@ -1 +1 @@
-{"nbformat": 4, "nbformat_minor": 2, "metadata": {}, "cells": [{"cell_type": "markdown", "metadata": {}, "source": ["# Data Science Ethics Checklist\n", "\n", "[![Deon badge](https://img.shields.io/badge/ethics%20checklist-deon-brightgreen.svg?style=popout-square)](http://deon.drivendata.org/)\n", "\n", "## A. Data Collection\n", " - [ ] **A.1 Informed consent**: If there are human subjects, have they given informed consent, where subjects affirmatively opt-in and have a clear understanding of the data uses to which they consent?\n", " - [ ] **A.2 Collection bias**: Have we considered sources of bias that could be introduced during data collection and survey design and taken steps to mitigate those?\n", " - [ ] **A.3 Limit PII exposure**: Have we considered ways to minimize exposure of personally identifiable information (PII) for example through anonymization or not collecting information that isn't relevant for analysis?\n", " - [ ] **A.4 Downstream bias mitigation**: Have we considered ways to enable testing downstream results for biased outcomes (e.g., collecting data on protected group status like race or gender)?\n", "\n", "## B. Data Storage\n", " - [ ] **B.1 Data security**: Do we have a plan to protect and secure data (e.g., encryption at rest and in transit, access controls on internal users and third parties, access logs, and up-to-date software)?\n", " - [ ] **B.2 Right to be forgotten**: Do we have a mechanism through which an individual can request their personal information be removed?\n", " - [ ] **B.3 Data retention plan**: Is there a schedule or plan to delete the data after it is no longer needed?\n", "\n", "## C. Analysis\n", " - [ ] **C.1 Missing perspectives**: Have we sought to address blindspots in the analysis through engagement with relevant stakeholders (e.g., checking assumptions and discussing implications with affected communities and subject matter experts)?\n", " - [ ] **C.2 Dataset bias**: Have we examined the data for possible sources of bias and taken steps to mitigate or address these biases (e.g., stereotype perpetuation, confirmation bias, imbalanced classes, or omitted confounding variables)?\n", " - [ ] **C.3 Honest representation**: Are our visualizations, summary statistics, and reports designed to honestly represent the underlying data?\n", " - [ ] **C.4 Privacy in analysis**: Have we ensured that data with PII are not used or displayed unless necessary for the analysis?\n", " - [ ] **C.5 Auditability**: Is the process of generating the analysis well documented and reproducible if we discover issues in the future?\n", "\n", "## D. Modeling\n", " - [ ] **D.1 Proxy discrimination**: Have we ensured that the model does not rely on variables or proxies for variables that are unfairly discriminatory?\n", " - [ ] **D.2 Fairness across groups**: Have we tested model results for fairness with respect to different affected groups (e.g., tested for disparate error rates)?\n", " - [ ] **D.3 Metric selection**: Have we considered the effects of optimizing for our defined metrics and considered additional metrics?\n", " - [ ] **D.4 Explainability**: Can we explain in understandable terms a decision the model made in cases where a justification is needed?\n", " - [ ] **D.5 Communicate bias**: Have we communicated the shortcomings, limitations, and biases of the model to relevant stakeholders in ways that can be generally understood?\n", "\n", "## E. Deployment\n", " - [ ] **E.1 Redress**: Have we discussed with our organization a plan for response if users are harmed by the results (e.g., how does the data science team evaluate these cases and update analysis and models to prevent future harm)?\n", " - [ ] **E.2 Roll back**: Is there a way to turn off or roll back the model in production if necessary?\n", " - [ ] **E.3 Concept drift**: Do we test and monitor for concept drift to ensure the model remains fair over time?\n", " - [ ] **E.4 Unintended use**: Have we taken steps to identify and prevent unintended uses and abuse of the model and do we have a plan to monitor these once the model is deployed?\n", "\n", "*Data Science Ethics Checklist generated with [deon](http://deon.drivendata.org).*\n"]}]}
\ No newline at end of file
+{"nbformat": 4, "nbformat_minor": 2, "metadata": {}, "cells": [{"cell_type": "markdown", "metadata": {}, "source": ["# Data Science Ethics Checklist\n", "\n", "[![Deon badge](https://img.shields.io/badge/ethics%20checklist-deon-brightgreen.svg?style=popout-square)](http://deon.drivendata.org/)\n", "\n", "## A. Data Collection\n", " - [ ] **A.1 TEST Informed consent**: If there are human subjects, have they given informed consent, where subjects affirmatively opt-in and have a clear understanding of the data uses to which they consent?\n", " - [ ] **A.2 Collection bias**: Have we considered sources of bias that could be introduced during data collection and survey design and taken steps to mitigate those?\n", " - [ ] **A.3 Limit PII exposure**: Have we considered ways to minimize exposure of personally identifiable information (PII) for example through anonymization or not collecting information that isn't relevant for analysis?\n", " - [ ] **A.4 Downstream bias mitigation**: Have we considered ways to enable testing downstream results for biased outcomes (e.g., collecting data on protected group status like race or gender)?\n", "\n", "## B. Data Storage\n", " - [ ] **B.1 Data security**: Do we have a plan to protect and secure data (e.g., encryption at rest and in transit, access controls on internal users and third parties, access logs, and up-to-date software)?\n", " - [ ] **B.2 Right to be forgotten**: Do we have a mechanism through which an individual can request their personal information be removed?\n", " - [ ] **B.3 Data retention plan**: Is there a schedule or plan to delete the data after it is no longer needed?\n", "\n", "## C. Analysis\n", " - [ ] **C.1 Missing perspectives**: Have we sought to address blindspots in the analysis through engagement with relevant stakeholders (e.g., checking assumptions and discussing implications with affected communities and subject matter experts)?\n", " - [ ] **C.2 Dataset bias**: Have we examined the data for possible sources of bias and taken steps to mitigate or address these biases (e.g., stereotype perpetuation, confirmation bias, imbalanced classes, or omitted confounding variables)?\n", " - [ ] **C.3 Honest representation**: Are our visualizations, summary statistics, and reports designed to honestly represent the underlying data?\n", " - [ ] **C.4 Privacy in analysis**: Have we ensured that data with PII are not used or displayed unless necessary for the analysis?\n", " - [ ] **C.5 Auditability**: Is the process of generating the analysis well documented and reproducible if we discover issues in the future?\n", "\n", "## D. Modeling\n", " - [ ] **D.1 Proxy discrimination**: Have we ensured that the model does not rely on variables or proxies for variables that are unfairly discriminatory?\n", " - [ ] **D.2 Fairness across groups**: Have we tested model results for fairness with respect to different affected groups (e.g., tested for disparate error rates)?\n", " - [ ] **D.3 Metric selection**: Have we considered the effects of optimizing for our defined metrics and considered additional metrics?\n", " - [ ] **D.4 Explainability**: Can we explain in understandable terms a decision the model made in cases where a justification is needed?\n", " - [ ] **D.5 Communicate bias**: Have we communicated the shortcomings, limitations, and biases of the model to relevant stakeholders in ways that can be generally understood?\n", "\n", "## E. Deployment\n", " - [ ] **E.1 Redress**: Have we discussed with our organization a plan for response if users are harmed by the results (e.g., how does the data science team evaluate these cases and update analysis and models to prevent future harm)?\n", " - [ ] **E.2 Roll back**: Is there a way to turn off or roll back the model in production if necessary?\n", " - [ ] **E.3 Concept drift**: Do we test and monitor for concept drift to ensure the model remains fair over time?\n", " - [ ] **E.4 Unintended use**: Have we taken steps to identify and prevent unintended uses and abuse of the model and do we have a plan to monitor these once the model is deployed?\n", "\n", "*Data Science Ethics Checklist generated with [deon](http://deon.drivendata.org).*\n"]}]}
\ No newline at end of file
diff --git a/examples/ethics.md b/examples/ethics.md
index 4227504..e6fa3b8 100644
--- a/examples/ethics.md
+++ b/examples/ethics.md
@@ -3,7 +3,7 @@
 [![Deon badge](https://img.shields.io/badge/ethics%20checklist-deon-brightgreen.svg?style=popout-square)](http://deon.drivendata.org/)

 ## A. Data Collection
- - [ ] **A.1 Informed consent**: If there are human subjects, have they given informed consent, where subjects affirmatively opt-in and have a clear understanding of the data uses to which they consent?
+ - [ ] **A.1 TEST Informed consent**: If there are human subjects, have they given informed consent, where subjects affirmatively opt-in and have a clear understanding of the data uses to which they consent?
  - [ ] **A.2 Collection bias**: Have we considered sources of bias that could be introduced during data collection and survey design and taken steps to mitigate those?
  - [ ] **A.3 Limit PII exposure**: Have we considered ways to minimize exposure of personally identifiable information (PII) for example through anonymization or not collecting information that isn't relevant for analysis?
  - [ ] **A.4 Downstream bias mitigation**: Have we considered ways to enable testing downstream results for biased outcomes (e.g., collecting data on protected group status like race or gender)?
diff --git a/examples/ethics.rst b/examples/ethics.rst
index df92eec..d8c7c52 100644
--- a/examples/ethics.rst
+++ b/examples/ethics.rst
@@ -7,7 +7,7 @@ Data Science Ethics Checklist
 A. Data Collection
 ---------

-* [ ] **A.1 Informed consent**: If there are human subjects, have they given informed consent, where subjects affirmatively opt-in and have a clear understanding of the data uses to which they consent?
+* [ ] **A.1 TEST Informed consent**: If there are human subjects, have they given informed consent, where subjects affirmatively opt-in and have a clear understanding of the data uses to which they consent?
 * [ ] **A.2 Collection bias**: Have we considered sources of bias that could be introduced during data collection and survey design and taken steps to mitigate those?
 * [ ] **A.3 Limit PII exposure**: Have we considered ways to minimize exposure of personally identifiable information (PII) for example through anonymization or not collecting information that isn't relevant for analysis?
 * [ ] **A.4 Downstream bias mitigation**: Have we considered ways to enable testing downstream results for biased outcomes (e.g., collecting data on protected group status like race or gender)?
diff --git a/examples/ethics.txt b/examples/ethics.txt
index 7d4344e..e1991a5 100644
--- a/examples/ethics.txt
+++ b/examples/ethics.txt
@@ -1,7 +1,7 @@
 Data Science Ethics Checklist

 A. Data Collection
-* A.1 Informed consent: If there are human subjects, have they given informed consent, where subjects affirmatively opt-in and have a clear understanding of the data uses to which they consent?
+* A.1 TEST Informed consent: If there are human subjects, have they given informed consent, where subjects affirmatively opt-in and have a clear understanding of the data uses to which they consent?
 * A.2 Collection bias: Have we considered sources of bias that could be introduced during data collection and survey design and taken steps to mitigate those?
 * A.3 Limit PII exposure: Have we considered ways to minimize exposure of personally identifiable information (PII) for example through anonymization or not collecting information that isn't relevant for analysis?
 * A.4 Downstream bias mitigation: Have we considered ways to enable testing downstream results for biased outcomes (e.g., collecting data on protected group status like race or gender)?