Twitter Scraper

Twitter Scraper is a simple scraping tool based on Playwright. It can scrape multiple tweets and generate a beautiful screenshot and a PDF for each one.

Installation

First, you need Node.js on your computer. Download it from https://nodejs.org/

Clone the project, then run:

npm i

Configuration

Edit the twitter-links.js file and add all the tweet URLs you want to scrape.

const urls = [
  "https://twitter.com/DalaiLama/status/1559111653450727425",
  "https://twitter.com/DalaiLama/status/2",
  "https://twitter.com/DalaiLama/status/3",
  "https://twitter.com/DalaiLama/status/4"
];
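Before running the scraper, it can help to check that every entry looks like a tweet URL. The following is a minimal sketch, not part of the project: `parseTweetUrl` is a hypothetical helper, and it assumes URLs shaped like `https://twitter.com/<user>/status/<id>` on the exact host `twitter.com`.

```javascript
// Hypothetical helper (not part of the project): extracts the user name and
// tweet id from a URL shaped like https://twitter.com/<user>/status/<id>.
function parseTweetUrl(url) {
  let parsed;
  try {
    parsed = new URL(url);
  } catch {
    return null; // not a valid URL at all
  }
  if (parsed.hostname !== "twitter.com") return null;
  const match = parsed.pathname.match(/^\/([^/]+)\/status\/(\d+)\/?$/);
  return match ? { user: match[1], id: match[2] } : null;
}

const candidates = [
  "https://twitter.com/DalaiLama/status/1559111653450727425",
  "https://example.com/not-a-tweet",
];

for (const url of candidates) {
  const tweet = parseTweetUrl(url);
  console.log(tweet ? `ok: ${tweet.user}/${tweet.id}` : `skipped: ${url}`);
}
```

Entries reported as skipped can then be fixed in twitter-links.js before a run.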

Execution

npm run scrap

PDFs and screenshots are exported to the /tweets directory.
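Roughly, exporting one tweet with Playwright could look like the sketch below. It is not the project's index.js: `buildExportPaths` and `exportTweet` are hypothetical names, and naming files after the tweet id is an assumption. The `page` argument is expected to be a Playwright page (`goto`, `screenshot` and `pdf` are real Playwright page methods; `pdf` only works in headless Chromium).

```javascript
const path = require("node:path");

// Hypothetical helper: builds output file names from the tweet id (the last
// path segment of the URL). The real naming scheme in index.js may differ.
function buildExportPaths(url, exportDir) {
  const id = new URL(url).pathname.split("/").filter(Boolean).pop();
  return {
    screenshot: path.join(exportDir, `${id}.png`),
    pdf: path.join(exportDir, `${id}.pdf`),
  };
}

// Opens one tweet in a Playwright page and exports a screenshot and a PDF.
async function exportTweet(page, url, { exportDir = "tweets", timeout = 30 } = {}) {
  const paths = buildExportPaths(url, exportDir);
  // Playwright expects milliseconds; the configured timeout is in seconds.
  await page.goto(url, { timeout: timeout * 1000 });
  await page.screenshot({ path: paths.screenshot, fullPage: true });
  await page.pdf({ path: paths.pdf, printBackground: true });
  return paths;
}

// Usage (assumes the playwright package is installed):
// const { chromium } = require("playwright");
// (async () => {
//   const browser = await chromium.launch();
//   const page = await browser.newPage();
//   await exportTweet(page, "https://twitter.com/DalaiLama/status/1559111653450727425");
//   await browser.close();
// })();
```

Passing the page in as an argument keeps the export logic independent of how the browser is launched.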

Configuration

You can configure the scraper in the index.js file:

Change the export directory:

const exportDir = "tweets";

Enable or disable screenshot or PDF export:

const exportScreenshot = true;
const exportPdf = true;

Increase or decrease the scraper timeout:

const timeout = 30; // in seconds

Increase or decrease the number of pages in the screenshot/PDF:

const pageCount = 40; // number of pages (each page is 1024px high)
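As a quick check of what pageCount means in pixels, here is a small sketch. It assumes the captured height is simply pageCount times the 1024px page height mentioned in the comment; `PAGE_HEIGHT_PX` and `totalHeightPx` are illustrative names, not taken from index.js.

```javascript
const PAGE_HEIGHT_PX = 1024; // page height stated in the pageCount comment

// Total height captured for a given number of pages (assumed: simple product).
function totalHeightPx(pageCount) {
  return pageCount * PAGE_HEIGHT_PX;
}

console.log(totalHeightPx(40)); // 40960
console.log(totalHeightPx(10)); // 10240
```

Under that assumption, the default of 40 pages covers 40960px of the tweet page.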

Known problems

Timeout

With some large tweets or on slow computers, you may encounter a timeout.

In that case, increase the timeout in index.js by replacing

const timeout = 30; // in seconds

with a higher value, for example


const timeout = 60; // in seconds
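Instead of editing the value by hand, a run could also retry with a longer timeout. This is only a sketch of that idea, not something the project does: `retryWithLongerTimeout` is a hypothetical helper, it doubles the timeout after each failure, and for simplicity it retries on any error rather than on timeouts only.

```javascript
// Hypothetical helper: runs `task(timeoutSeconds)` and, if it throws,
// retries with a doubled timeout, up to `maxAttempts` attempts in total.
async function retryWithLongerTimeout(task, { timeout = 30, maxAttempts = 3 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await task(timeout);
    } catch (error) {
      lastError = error;
      timeout *= 2; // 30 -> 60 -> 120 seconds
    }
  }
  throw lastError;
}

// Usage with a stand-in task that only succeeds once it gets 60 seconds or more.
retryWithLongerTimeout(async (seconds) => {
  if (seconds < 60) throw new Error(`timed out after ${seconds}s`);
  return `done with a ${seconds}s timeout`;
}).then(console.log); // done with a 60s timeout
```

In a real run, the task would be the function that opens and exports one tweet with the given timeout.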

Todo

Pass configuration to index.js through command-line arguments
Better timeout handling
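The first item could be approached with Node's built-in `parseArgs` from `node:util` (available in recent Node.js versions, 18.3 and later). The sketch below is one possible shape, not an implemented feature: `readConfig` and all flag names are hypothetical, and the defaults mirror the constants shown in the Configuration section.

```javascript
const { parseArgs } = require("node:util");

// Defaults mirror the constants shown in the Configuration section.
const defaults = {
  exportDir: "tweets",
  exportScreenshot: true,
  exportPdf: true,
  timeout: 30,
  pageCount: 40,
};

// Hypothetical flags (not implemented by the project):
//   --export-dir <dir> --timeout <seconds> --page-count <n> --no-screenshot --no-pdf
function readConfig(argv) {
  const { values } = parseArgs({
    args: argv,
    options: {
      "export-dir": { type: "string" },
      timeout: { type: "string" },
      "page-count": { type: "string" },
      "no-screenshot": { type: "boolean" },
      "no-pdf": { type: "boolean" },
    },
  });
  return {
    exportDir: values["export-dir"] ?? defaults.exportDir,
    exportScreenshot: values["no-screenshot"] ? false : defaults.exportScreenshot,
    exportPdf: values["no-pdf"] ? false : defaults.exportPdf,
    timeout: Number(values.timeout ?? defaults.timeout),
    pageCount: Number(values["page-count"] ?? defaults.pageCount),
  };
}

console.log(readConfig(["--timeout", "60", "--no-pdf"]));
```

In index.js this would be called as `readConfig(process.argv.slice(2))`, so that `node index.js --timeout 60 --no-pdf` overrides only those two settings.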
