Sources

Still incomplete. See #15.

7zip.png, ffox.png, gimp.png

Application icons from open-source software projects.

annual-precip.json

A raster grid of global annual precipitation for the year 2016 at a resolution of 1 degree of longitude/latitude per cell, from CFSv2.

airports.csv

anscombe.json

Graphs in Statistical Analysis, F. J. Anscombe, The American Statistician.

barley.json

The result of a 1930s agricultural experiment in Minnesota, this dataset contains yields for 10 different varieties of barley at six different sites. It was first published by agronomists F.R. Immer, H.K. Hayes, and L. Powers in the 1934 paper "Statistical Determination of Barley Varietal Adaptation." R.A. Fisher popularized its use in the field of statistics when he included it in his book "The Design of Experiments." Since then it has been used to demonstrate new statistical techniques, including the trellis charts developed by Richard Becker, William Cleveland and others in the 1990s.

birdstrikes.csv

http://wildlife.faa.gov

budget.json

budgets.json

burtin.json

cars.json

http://lib.stat.cmu.edu/datasets/

climate.json

co2-concentration.csv

Data from https://scrippsco2.ucsd.edu/data/atmospheric_co2/primary_mlo_co2_record, modified to include only the date, CO2, and seasonally adjusted CO2 columns, and only the rows with valid data.
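A reduction of this kind (keeping three columns and dropping rows without valid measurements) could be sketched as follows. This is not the script used to produce the file: the column names, the sample rows, and the `-99.99` missing-value sentinel are all assumptions made for illustration.

```python
import csv
import io

# Made-up sample rows with hypothetical column names; not real measurements.
RAW = """Date,CO2,adjusted CO2,extra
2000-01-01,300.00,301.00,x
2000-02-01,-99.99,-99.99,x
2000-03-01,302.00,301.50,x
"""

KEEP = ["Date", "CO2", "adjusted CO2"]  # columns to retain (assumed names)
MISSING = -99.99  # assumed sentinel marking a missing measurement


def clean(raw_csv: str) -> list[dict[str, str]]:
    """Keep only the KEEP columns and drop rows with a missing CO2 value."""
    rows = []
    for row in csv.DictReader(io.StringIO(raw_csv)):
        values = [float(row[name]) for name in KEEP[1:]]
        if any(value == MISSING for value in values):
            continue  # skip rows without valid data
        rows.append({name: row[name] for name in KEEP})
    return rows


cleaned = clean(RAW)
```

Values are kept as strings so the original formatting of each measurement is preserved; only the validity check converts them to numbers.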

countries.json

crimea.json

disasters.csv

https://ourworldindata.org/natural-catastrophes

driving.json

https://archive.nytimes.com/www.nytimes.com/imagepages/2010/05/02/business/02metrics.html

earthquakes.json

https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/all_week.geojson (Feb 6, 2018)

flare.json

flights-?k.json, flights-200k.arrow, flights-airport.csv

Flight delay statistics from U.S. Bureau of Transportation Statistics, https://www.transtats.bts.gov/OT_Delay/OT_DelayCause1.asp.

Transformed using /scripts/flights.js. Arrow file generated with json2arrow.

football.json

Football match outcomes across multiple divisions from 2013 to 2017. This dataset is a subset of a larger dataset from https://github.com/openfootball/football.json. The subset was made such that there are records for all five chosen divisions over the time period.

gapminder-health-income.csv, gapminder.json

github.csv

Generated using /scripts/github.py.

global-temp.csv

Combined Land-Surface Air and Sea-Surface Water Temperature Anomalies (Land-Ocean Temperature Index, L-OTI), 1880-2023. Source: NASA's Goddard Institute for Space Studies https://data.giss.nasa.gov/gistemp/

population_engineers_hurricanes.csv

Data about engineers from https://www.bls.gov/oes/tables.htm. Hurricane data from http://www.nhc.noaa.gov/paststate.shtml. Income data from https://factfinder.census.gov/faces/tableservices/jsf/pages/productview.xhtml?pid=ACS_07_3YR_S1901&prodType=table.

iowa-electricity.csv

The state of Iowa has dramatically increased its production of renewable wind power in recent years. This file contains the annual net generation of electricity in the state by source in thousand megawatthours. The dataset was compiled by the U.S. Energy Information Administration and downloaded on May 6, 2018. It is useful for illustrating stacked area charts.

jobs.json

la-riots.csv

More than 60 people lost their lives amid the looting and fires that ravaged Los Angeles for five days starting on April 29, 1992. This file contains metadata about each person, including the geographic coordinates of their death. It was compiled and published by the Los Angeles Times Data Desk.

londonBoroughs.json

Boundaries of London boroughs reprojected and simplified from the London_Borough_Excluding_MHW shapefile held at https://data.london.gov.uk/dataset/statistical-gis-boundary-files-london. Original data "contains National Statistics data © Crown copyright and database right (2015)" and "Contains Ordnance Survey data © Crown copyright and database right [2015]."

londonCentroids.json

Calculated from londonBoroughs.json using d3.geoCentroid.

londonTubeLines.json

Selected rail lines simplified from tfl_lines.json at https://github.com/oobrien/vis/tree/master/tube/data

miserables.json

monarchs.json

movies.json

This dataset contains well-known, intentionally included errors. It is used for instructional purposes, including demonstrating the need to reckon with dirty data.

ohlc.json

This dataset contains the performance of the Chicago Board Options Exchange Volatility Index (VIX) in the summer of 2009.

penguins.json

Palmer Archipelago (Antarctica) penguin data collected and made available by Dr. Kristen Gorman and the Palmer Station, Antarctica LTER, a member of the Long Term Ecological Research Network. For more information visit allisonhorst/penguins on GitHub.

platformer-terrain.json

Assets from the video game Celeste.

political-contributions.json

population.json

seattle-weather.csv

Data from NOAA. Daily weather records with metric units. Transformed using /scripts/weather.py. We synthesized the categorical "weather" field from multiple fields in the original dataset. This data is intended for instructional purposes.

seattle-weather-hourly-normals.csv

Data from NOAA. Hourly weather normals with metric units. The 1981-2010 Climate Normals are NCDC's three-decade averages of climatological variables, including temperature and precipitation. Learn more in the documentation. We only included temperature, wind, and pressure and updated the format to be easier to parse.

sp500.csv

sp500-2000.csv

S&P 500 index values from 2000 to 2020, retrieved from Yahoo Finance.

stocks.csv

unemployment-across-industries.json

unemployment.tsv

us-10m.json

us-employment.csv

In the late 2000s, the global economy was hit by a crippling recession. One result: massive job losses across the United States. The downturn in employment, and the slow recovery in hiring that followed, was tracked each month by the Current Employment Statistics program at the U.S. Bureau of Labor Statistics.

This file contains the monthly employment total in a variety of job categories from January 2006 through December 2015. The numbers are seasonally adjusted and reported in thousands. The data were downloaded on Nov. 11, 2018, and reformatted for use in this library.

Totals are included for the 22 "supersectors" tracked by the BLS. The "nonfarm" total is the category typically used by economists and journalists as a stand-in for the country's employment total.

A calculated "nonfarm_change" column has been appended with the month-to-month change in that supersector's employment. It is useful for illustrating how to make bar charts that report both negative and positive values.
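A derived column like this can be reproduced as the difference between consecutive monthly totals. The sketch below is illustrative only: the function name `add_month_to_month_change` and the sample values are hypothetical, not taken from the file or the project's scripts, and it assumes the first month has no earlier value to compare against.

```python
from typing import Optional


def add_month_to_month_change(totals: list[float]) -> list[Optional[float]]:
    """Return the change from the previous month for each monthly total.

    The first entry is None because there is no earlier month to compare with
    (an assumption; the real file may treat its first row differently).
    """
    if not totals:
        return []
    changes: list[Optional[float]] = [None]
    for previous, current in zip(totals, totals[1:]):
        changes.append(current - previous)
    return changes


# Hypothetical nonfarm totals in thousands, for illustration only.
sample_nonfarm = [1000.0, 1010.0, 1005.0]
sample_change = add_month_to_month_change(sample_nonfarm)
```

Because the result mixes gains and losses, a bar chart of such a column has bars extending both above and below zero.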

volcano.json

Maunga Whau (Mt Eden) is one of about 50 volcanos in the Auckland volcanic field. This data set gives topographic information for Maunga Whau on a 10m by 10m grid. Digitized from a topographic map by Ross Ihaka, adapted from R datasets. These data should not be regarded as accurate.

weather.json

Instructional dataset showing actual and predicted temperature data.

weather.csv

Data from NOAA. Transformed using /scripts/weather.py. We synthesized the categorical "weather" field from multiple fields in the original dataset. This data is intended for instructional purposes.

wheat.json

In an 1822 letter to Parliament, William Playfair, a Scottish engineer who is often credited as the founder of statistical graphics, published an elegant chart on the price of wheat. It plots 250 years of prices alongside weekly wages and the reigning monarch. He intended to demonstrate that “never at any former period was wheat so cheap, in proportion to mechanical labour, as it is at the present time.”

windvectors.csv

Simulated wind patterns over northwestern Europe.

world-110m.json

zipcodes.csv

GeoNames.org