
dragonly/scrapy_tianya

scrapy_tianya

A crawler for bbs.tianya.cn, built on the Scrapy framework.

Prerequisites:

  • scrapy
  • mongodb
  • pymongo

PS: MongoDB is not mandatory. To use a different storage backend, rewrite tianya/pipelines.py :)
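As a rough illustration of such a swap, here is a minimal sketch of a pipeline that appends each scraped item to a JSON Lines file instead of writing to MongoDB. The class name `JsonLinesPipeline` and the default output path `items.jl` are assumptions made for this example and are not part of the repository. The `open_spider`, `close_spider`, and `process_item` methods are the standard hooks Scrapy calls on an item pipeline.

```python
import json


class JsonLinesPipeline:
    """Hypothetical replacement for the MongoDB pipeline.

    Writes every scraped item as one JSON object per line.
    """

    def __init__(self, path="items.jl"):
        # Output path is an assumption for this sketch.
        self.path = path
        self.file = None

    def open_spider(self, spider):
        # Called once when the spider starts; append so reruns keep old data.
        self.file = open(self.path, "a", encoding="utf-8")

    def close_spider(self, spider):
        # Called once when the spider finishes.
        self.file.close()

    def process_item(self, item, spider):
        # ensure_ascii=False keeps Chinese text readable in the output file.
        self.file.write(json.dumps(dict(item), ensure_ascii=False) + "\n")
        # Return the item so any later pipeline still receives it.
        return item
```

If the class were placed in tianya/pipelines.py, it would presumably be enabled through the `ITEM_PIPELINES` setting, for example `ITEM_PIPELINES = {"tianya.pipelines.JsonLinesPipeline": 300}`.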

PPS: XPath expressions for page elements are easy to get from Chrome Developer Tools

Instructions

cd path/to/repo
mkdir job
scrapy crawl tianyaSpider -s JOBDIR=/path/to/job/job-1_or_whatever

The JOBDIR setting makes Scrapy persist the crawl state in that directory, so a crawl that was stopped can be resumed later by running the same command again.
