MyCollect

MyCollect monitors social networks to find the information the user is looking for.

Getting started

Requirements:

python >= 3.8
pipenv

Running pipenv install creates the virtual environment and installs all required modules.

Configuration

Rename sample_config.yaml to config.yaml

Update config.yaml with your Twitter API credentials.

The email sender requires an AWS account and uses the SES service, which allows 62k emails per month (this should be enough for a daily report).

Start the collect

pipenv run python -m mycollect.starter

Configuring the collect

You can configure the Twitter collect with the following options:

languages: list of languages of the tweets to collect
low_priority_url: list of hostnames to treat as low priority; URLs in tweets whose hostname is not in this list are prioritized
track: list of terms you want to follow
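
To illustrate the intent of low_priority_url, here is a minimal sketch of hostname-based ordering. It is not MyCollect's actual code: the function name order_urls and the sample URLs are hypothetical, and the assumption is that hostnames in the list are pushed to the back rather than dropped.

```python
from urllib.parse import urlparse


def order_urls(urls, low_priority_hosts):
    """Return urls with those hosted on a low-priority hostname moved last."""
    low = {host.lower() for host in low_priority_hosts}

    def is_low_priority(url):
        # urlparse(...).hostname is None for strings without a network location
        hostname = urlparse(url).hostname or ""
        return hostname.lower() in low

    # sorted() is stable and False sorts before True, so preferred URLs come
    # first and each group keeps its original relative order.
    return sorted(urls, key=is_low_priority)


# Hypothetical URLs extracted from a single tweet
tweet_urls = [
    "https://twitter.com/i/status/1",
    "https://example.org/article",
    "https://t.co/abc",
]
print(order_urls(tweet_urls, ["twitter.com", "t.co"]))
```

With this ordering, a link to an external article would be picked ahead of links that point back to the social network itself.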

Configuring the storage

Currently only file storage is implemented. You can specify the folder where the files will be stored:

storage:
  type: mycollect.storage.file_storage.FileStorage
  args:
    folder: STORAGE_FOLDER
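
The type value is a dotted path to a class, and args holds its constructor arguments. A common way to turn such an entry into an object is shown below as a sketch only: how MyCollect itself resolves the path is not described here, and build_from_config is a hypothetical helper. A standard-library class stands in for FileStorage so the example runs without MyCollect installed.

```python
import importlib


def build_from_config(entry):
    """Instantiate the class named by entry['type'], passing entry['args'] as keyword arguments."""
    # Split "package.module.ClassName" into module path and class name
    module_path, _, class_name = entry["type"].rpartition(".")
    module = importlib.import_module(module_path)
    cls = getattr(module, class_name)
    return cls(**entry.get("args", {}))


# Stand-in entry with the same shape as the storage section above
example = {"type": "fractions.Fraction", "args": {"numerator": 3, "denominator": 4}}
print(build_from_config(example))
```

Under the same assumption, the storage entry would resolve to FileStorage(folder=STORAGE_FOLDER).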

Configuring aggregators

Currently there is only one aggregator, DummyAggregator, which groups elements per URL and category and elects the top x per category:

aggregators:
  - name: dummy aggregator
    type: mycollect.aggregators.dummy_aggregator.DummyAggregator
    schedule: 30 18 * * *
    notify: daily_report
    args:
      top_articles: 3

The schedule parameter is a cron-style expression that triggers the aggregator; in the example above, 30 18 * * * runs it every day at 18:30. The notify property selects which output is triggered with the result.
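
The grouping done by the aggregator can be pictured with the following sketch. It is an illustration under assumptions, not the DummyAggregator source: the item keys category and url are hypothetical, and ranking by number of occurrences is assumed since the ranking criterion is not stated here.

```python
from collections import Counter, defaultdict


def top_urls_per_category(items, top_articles=3):
    """Count how often each URL appears per category and keep the most frequent ones."""
    counts = defaultdict(Counter)
    for item in items:
        counts[item["category"]][item["url"]] += 1
    return {
        category: [url for url, _ in counter.most_common(top_articles)]
        for category, counter in counts.items()
    }


# Hypothetical collected elements
collected = [
    {"category": "python", "url": "https://example.org/a"},
    {"category": "python", "url": "https://example.org/b"},
    {"category": "python", "url": "https://example.org/a"},
    {"category": "rust", "url": "https://example.org/c"},
]
print(top_urls_per_category(collected, top_articles=1))
```

The top_articles argument plays the role of the top_articles value in the configuration.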

Configuring output

Email

recipients: the list of emails you want to send an email to
sender: the email address that is used as the sender
templates: jinja2 templates of the email body

Each template entry should have a name and a template. The name should match the notify property of one of the aggregators.
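
Because a notify value with no matching template name is easy to miss, a quick consistency check can help. The sketch below works on plain dictionaries that mirror the configuration keys; missing_templates is a hypothetical helper and not part of MyCollect, and the template body is a placeholder.

```python
def missing_templates(aggregators, templates):
    """Return the notify values that have no template with the same name."""
    template_names = {template["name"] for template in templates}
    return sorted({agg["notify"] for agg in aggregators} - template_names)


# Dictionaries shaped like the aggregators and templates sections of the config
aggregators = [{"name": "dummy aggregator", "notify": "daily_report"}]
templates = [{"name": "daily_report", "template": "<h1>Daily report</h1>"}]
print(missing_templates(aggregators, templates))
```

An empty list means every aggregator has an email template to render.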

Annexes

Twitter api credentials

If you don't have a Twitter developer account: link

Create a Twitter application: link

In the Keys and tokens tab, you will find the credentials:

Consumer API keys
- api key
- api secret key
Access token & access token secret
- access token
- access token secret

AWS account

I suggest reading the AWS documentation, which covers:

creation of the account
configuring the service

Once done, get:

aws access key
aws secret key
aws region of the SES service (it might depend on your location)


mathrb/mycollect
