To scrape apartment data from websites
To warehouse apartment data for analysis
Each directory in this repository is an independent data scraping module. Every module follows the structure shown in this example:
- vita
  - vita.py <- scraper logic
  - requirements.txt <- contains python libraries necessary for execution
  - Dockerfile <- required for deployment to cluster for automated ingestion
  - Makefile <- [optional] for Mac users only, but pretty helpful
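As a quick sanity check of this layout, a minimal sketch along the following lines could report which expected files a module directory lacks. The function name `missing_files` is hypothetical and not part of the repository. The sketch assumes the Makefile is optional and that any `*.py` file in the directory counts as the scraper.

```python
from pathlib import Path

# Files every module is expected to contain, per the layout above.
REQUIRED_FILES = ("requirements.txt", "Dockerfile")


def missing_files(module_dir):
    """Return the names of expected files that are absent from module_dir."""
    module = Path(module_dir)
    missing = [name for name in REQUIRED_FILES if not (module / name).is_file()]
    # Assumption: the scraper is any *.py file in the module directory.
    if not any(module.glob("*.py")):
        missing.append("*.py (scraper logic)")
    return missing


if __name__ == "__main__":
    import sys

    # Check every directory passed on the command line.
    for directory in sys.argv[1:]:
        gaps = missing_files(directory)
        print(f"{directory}: {'ok' if not gaps else 'missing ' + ', '.join(gaps)}")
```

If saved under a name of your choice, it could be run against one or more module directories, for example with `vita` as the argument.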
- Copy the existing vita directory to a new directory with a one-word name for the apartment complex you wish to add
cd data-ingestion
mkdir monkeys
cp -r vita/. monkeys
- Rename the Python file
cd data-ingestion
mv monkeys/di-vita.py monkeys/di-monkeys.py
- On line 5 of the Dockerfile, change vita to monkeys.
- On line 11 of the Dockerfile, change vita to monkeys.
- On line 2 of the Makefile (APPNAME), change di-vita to di-monkeys
- On line 1 of the Makefile (DOCKERUSERNAME), set the value to your own Docker repo username
- Update monkeys.py to scrape the required data
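The copy-and-rename steps above could be automated with a small helper. The following is only a sketch under stated assumptions, not a script that ships with this repository: `scaffold_module` is a hypothetical name, the scraper file is assumed to follow the `di-<name>.py` pattern used in the rename step, and replacing every occurrence of the template name in the Dockerfile and Makefile is assumed to be equivalent to editing the specific lines listed above.

```python
import shutil
from pathlib import Path


def scaffold_module(repo_dir, new_name, template="vita", prefix="di-"):
    """Copy the template module to new_name and rename its references."""
    repo = Path(repo_dir)
    source, target = repo / template, repo / new_name
    shutil.copytree(source, target)  # raises FileExistsError if target exists

    # Rename the scraper, e.g. di-vita.py -> di-monkeys.py.
    old_script = target / f"{prefix}{template}.py"
    if old_script.exists():
        old_script.rename(target / f"{prefix}{new_name}.py")

    # Assumption: replacing every occurrence of the template name matches
    # the manual edits to Dockerfile lines 5 and 11 and Makefile line 2.
    for filename in ("Dockerfile", "Makefile"):
        path = target / filename
        if path.exists():  # the Makefile is optional
            path.write_text(path.read_text().replace(template, new_name))
    return target
```

Calling `scaffold_module("data-ingestion", "monkeys")` would then stand in for the manual copy and rename steps. The DOCKERUSERNAME line of the Makefile and the scraper logic itself still need to be edited by hand.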