use ets instead of registry for store #29

Closed
happysalada wants to merge 2 commits into fredwu:master from happysalada:master


happysalada commented Feb 12, 2020

Here is my reasoning.

  • Using a registry for the store means the process mailbox can fill up and become a bottleneck.
  • The advantage of having a registry here is that it is distributed. Deploying the crawler in a distributed fashion raises additional questions, so I have opted out of that path. (Instead of ETS it would be possible to use Mnesia, which is slower but distributed.)
  • I added a store.reset method for periodic crawling tasks (e.g. I want to check the content of the same pages every day).
  • Running some tests locally, the next bottleneck comes from hackney's
    checkout_timeout (edgurgel/httpoison#359).
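
As a rough illustration of the ETS approach described in the points above, here is a minimal sketch of a store backed by a named ETS table, including a reset function for periodic crawls. It is not the code from this PR: the module name `StoreSketch`, the table name, and the shape of the stored page map are all assumptions made for the example. Only the `:ets` calls themselves are standard Erlang/OTP functions.

```elixir
defmodule StoreSketch do
  @moduledoc false
  # Hypothetical module: names and the stored value shape are illustrative only.
  @table :store_sketch_pages

  # Creates the named, public table. The table is owned by the calling
  # process and disappears when that process exits, so in a real app this
  # would be called from a long-lived (supervised) process.
  def init do
    :ets.new(@table, [
      :named_table,
      :public,
      :set,
      read_concurrency: true,
      write_concurrency: true
    ])

    :ok
  end

  # Returns true if the URL was newly added, false if it was already stored.
  def add(url), do: :ets.insert_new(@table, {url, %{processed: false}})

  # Returns the stored page map, or nil if the URL is unknown.
  def find(url) do
    case :ets.lookup(@table, url) do
      [{^url, page}] -> page
      [] -> nil
    end
  end

  # Marks a URL as processed (inserting it if it was not present).
  def processed(url) do
    :ets.insert(@table, {url, %{processed: true}})
    :ok
  end

  # Clears every entry so the same pages can be crawled again later.
  def reset do
    :ets.delete_all_objects(@table)
    :ok
  end
end
```

Because the table is `:public`, any process can read and write it directly, so lookups do not queue up in a single process mailbox. The trade-off, as noted above, is that the table lives on one node only.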

This is actually what I was running into in my original issue.

In order to increase performance, the connection pool has to be managed in a different way. I'm currently researching this.

Let me know what you think.


coveralls commented Feb 12, 2020

Coverage decreased (-0.5%) to 98.883% when pulling d3ed3f2 on happysalada:master into f2e0e93 on fredwu:master.

happysalada commented

I'm going to go ahead and close this PR, as it doesn't fix the underlying issue.
Let me know if you still want to merge it.
