private-building-web-scraper

About

Scraper for the building search page at https://bmis1.buildingmgt.gov.hk/bd_hadbiex/content/searchbuilding/building_search.jsf?renderedValue=true

Dataset Available At

http://data.g0vhk.io/dataset/private-buildings-in-hong-kong-dataset

Commands

Setup

python3 -m venv ./env
source ./env/bin/activate
pip install -r requirements.txt

Running

Scraping Building Search Data

scrapy crawl building_search -t json -o output.json
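
Scrapy's JSON feed export writes all scraped items as a single JSON array, so the output can be sanity-checked before running the geocoding step. A minimal sketch, assuming the file was produced by the command above; the helper name `load_items` is illustrative and not part of this repository:

```python
import json
from pathlib import Path


def load_items(path):
    """Load a Scrapy JSON feed, which is one JSON array of item objects."""
    with Path(path).open(encoding="utf-8") as f:
        items = json.load(f)
    # An interrupted or empty crawl can leave something other than a list.
    if not isinstance(items, list):
        raise ValueError(f"expected a JSON array in {path}")
    return items
```

Calling `load_items("output.json")` returns a list of dictionaries, one per scraped building; the field names depend on the spider's item definition.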

Scraping Address Geocoding Data

scrapy crawl address_geo -t json -a input_file=output.json -o address.json

Combining Geocoding and Building Data

python combine.py -i output.json -a address.json -o result.csv
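
The combine step joins the two scraped files and writes a CSV. The sketch below only illustrates the general idea and is not the code in `combine.py`. It assumes both inputs are JSON arrays of objects and that records share a join field, here a hypothetical `address` key; the real field names depend on the spiders' item definitions.

```python
import csv
import json


def combine(buildings_path, geo_path, out_path, key="address"):
    """Left-join building records with geocoding records on `key`, write a CSV.

    `key` is a hypothetical join field, not taken from the repository.
    Returns the number of rows written.
    """
    with open(buildings_path, encoding="utf-8") as f:
        buildings = json.load(f)
    with open(geo_path, encoding="utf-8") as f:
        geo = json.load(f)

    # Index geocoding records by the join key for constant-time lookup.
    geo_by_key = {g[key]: g for g in geo if key in g}

    rows = []
    for b in buildings:
        # Add geocoding fields; building fields win on name clashes.
        rows.append({**geo_by_key.get(b.get(key), {}), **b})

    # Union of all column names, in first-seen order.
    fieldnames = list(dict.fromkeys(k for row in rows for k in row))

    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Buildings without a matching geocoding record are kept, with the geocoding columns left empty, so the row count of the CSV matches the building search output.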