nyc-driving-income-impact

This was a Big Data project to find links between income and people who walk or bike to work and are victims of traffic crashes in New York City. See paper.pdf in this repo for a detailed description and results.

We used three public data sources: https://data.cityofnewyork.us/Public-Safety/NYPD-Motor-Vehicle-Collisions/h9gi-nx95 https://nycopendata.socrata.com/Social-Services/311-Service-Requests-from-2010-to-Present/erm2-nwe9 https://www2.census.gov/programs-surveys/acs/data/pums/2015/5-Year/csv_pny.zip

Java code for cleaning this data using MapReduce is found in src/mapreduce/

Other code in src is Python code for cleaning data or Hive scripts for doing analytics.

ST_Geometry_UDF.q puma_to_hive.py and split_latlong.q: This was an attempt to use a UDF provided by ESRI to put lat/long keys in the crash and complaint data inside the PUMA shapes. We weren't able to do this successfully, contributions would be helpful! Our workaround was instead using a zip code to PUMA crosswalk.

analytics.py: This file uses PySpark to perform statistical analysis and facilitate exploration of the data. This script first reads a hive table and then performs transformations on it to calculate some statistics of the data. Along with displaying it in a format so that it makes analysis easier.

analytics.q: This Hive script does some simple analysis on the analytic table like sorting, finding the mean of the three data, and finding the community district where has high commuters/crashes/complaints.

create_data_tables.q: This file has a set of queries which create the database and tables for analytics.

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
src		src
README.md		README.md
paper.pdf		paper.pdf

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

src

src

README.md

README.md

paper.pdf

paper.pdf

Repository files navigation

nyc-driving-income-impact

About

Releases

Packages

Languages

scottwn/nyc-driving-income-impact

Folders and files

Latest commit

History

Repository files navigation

nyc-driving-income-impact

About

Topics

Resources

Stars

Watchers

Forks

Languages