Single domain web crawler

Crawls a single domain and outputs the website's sitemap: a list of pages, each with its static assets and its links to other pages.
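
To make the output shape concrete, here is a minimal sketch of how such a crawl could work. It is not the code in crawl.py: the names `PageParser`, `parse_page`, `crawl` and the `fetch` callback are hypothetical, and the choice of which tags count as static assets (`img`, `script`, `link`) is an assumption.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class PageParser(HTMLParser):
    """Collects same-domain links and static assets from one HTML page."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.domain = urlparse(page_url).netloc
        self.links = set()
        self.assets = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            # Resolve relative URLs and drop any #fragment.
            url = urljoin(self.page_url, attrs["href"]).split("#")[0]
            parsed = urlparse(url)
            # Keep only http(s) links that stay on the crawled domain.
            if parsed.scheme in ("http", "https") and parsed.netloc == self.domain:
                self.links.add(url)
        elif tag in ("img", "script") and attrs.get("src"):
            self.assets.add(urljoin(self.page_url, attrs["src"]))
        elif tag == "link" and attrs.get("href"):
            self.assets.add(urljoin(self.page_url, attrs["href"]))


def parse_page(page_url, html):
    """Return one sitemap entry: the page URL, its assets, and its links."""
    parser = PageParser(page_url)
    parser.feed(html)
    return {
        "url": page_url,
        "assets": sorted(parser.assets),
        "links": sorted(parser.links),
    }


def crawl(start_url, fetch):
    """Breadth-first crawl of one domain; `fetch` maps a URL to its HTML text."""
    seen = {start_url}
    queue = deque([start_url])
    sitemap = []
    while queue:
        url = queue.popleft()
        page = parse_page(url, fetch(url))
        sitemap.append(page)
        for link in page["links"]:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return sitemap
```

Passing `fetch` in as a parameter keeps the sketch independent of any particular HTTP library; in practice it could be a small wrapper around `urllib.request.urlopen` that returns the decoded response body.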

Running it

  1. Install the dependencies from requirements.txt:
$ pip install -r requirements.txt
  2. Then run crawl.py:
$ python crawl.py