hh_scraping

My crawler for the top Russian recruiting website, https://hh.ru/

The crawler is based on the Scrapy framework: https://scrapy.org/

A bash script is included for convenience. From the project directory, you just need to run:

$ sh start.sh

In Scrapy, parsers are called 'spiders', so the spider lives in HH_crawler/spiders/HH_spider.py

This spider crawls the pages and scrapes data from every vacancy into a file with the command:

$ scrapy crawl hh_spider -o filename.csv

The selectors in the spider are tuned to scrape the main skill tags and the salary from every vacancy.

The scraped data is then aggregated and analysed in HH_crawler/statistics.py:

$ python3 statistics.py

statistics.py uses pandas and some built-in Python libraries.
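To give a rough idea of the kind of aggregation that can be done on the exported CSV, here is a minimal sketch. It is not the code from statistics.py: it uses only the standard library instead of pandas, and the column names `skills` and `salary`, the comma-separated skill format, the plain-number salary and the `summarize` helper are all assumptions made for illustration. The sample rows are made up and stand in for filename.csv.

```python
import csv
import io
from collections import Counter


def summarize(rows):
    """Count skill tags and average the numeric salaries in scraped rows."""
    skill_counts = Counter()
    salaries = []
    for row in rows:
        # Assumption: skills are stored as one comma-separated string per vacancy.
        for tag in (row.get("skills") or "").split(","):
            tag = tag.strip()
            if tag:
                skill_counts[tag] += 1
        # Assumption: salary is a plain number; rows without one are skipped.
        raw = (row.get("salary") or "").strip()
        if raw.isdigit():
            salaries.append(int(raw))
    mean_salary = sum(salaries) / len(salaries) if salaries else None
    return skill_counts, mean_salary


# Made-up sample standing in for filename.csv; it is not real scraped data.
sample = io.StringIO(
    "skills,salary\n"
    '"Python,SQL",100000\n'
    '"Python,Git",\n'
    '"SQL",150000\n'
)
skill_counts, mean_salary = summarize(csv.DictReader(sample))
print(skill_counts.most_common(2))  # the two most frequent skill tags
print(mean_salary)                  # mean over vacancies that list a salary
```

To try it on real output, replace `sample` with `open("filename.csv", newline="", encoding="utf-8")` and adjust the column names to whatever the spider actually exports.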
