[WIP] Retrain SBCS Models and some refactoring #99

Open
wants to merge 15 commits into base: main

Commits on Dec 12, 2020

  1. 5d91d93
  2. c530b58
  3. b72e0d3
  4. 36cdf8c
  5. Don't crawl more than 20k wiki articles per language in create_language_model

    This is in case we encounter some really crazy article with millions of links,
    but it's also nice for debugging.
    dan-blanchard committed Dec 12, 2020
    51c13a2
  6. d0f72fc
  7. ef79d4c
  8. 382ad9a
  9. 41732cd
  10. b0de2d6
  11. 67bc4bb
  12. Make create_language_model a bit more reliable, and make it dump wikipedia text so we do not have to download it a bunch of times

    dan-blanchard committed Dec 12, 2020
    7a3636c
  13. d45f781
  14. 4503b74
  15. Black formatting

    dan-blanchard committed Dec 12, 2020
    eac7414
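
Commits 5 and 12 describe two behaviors: capping the crawl at 20k articles per language, and dumping the downloaded text so it is not fetched repeatedly. The sketch below illustrates both ideas in isolation. It is not the code from this PR: `crawl_titles`, `load_or_fetch_text`, `get_links`, `fetch`, and `MAX_ARTICLES` are hypothetical names, and the link and text sources are passed in as plain callables rather than tied to any real Wikipedia client.

```python
from collections import deque
from pathlib import Path
from typing import Callable, Iterable, List

# Cap taken from the commit message; the constant name is hypothetical.
MAX_ARTICLES = 20_000


def crawl_titles(
    start: str,
    get_links: Callable[[str], Iterable[str]],
    max_articles: int = MAX_ARTICLES,
) -> List[str]:
    """Breadth-first walk over article links, visiting at most max_articles."""
    seen = {start}
    queue = deque([start])
    visited: List[str] = []
    while queue and len(visited) < max_articles:
        title = queue.popleft()
        visited.append(title)
        for link in get_links(title):
            # Stop enqueuing once the cap is reached, so a single article
            # with a huge number of links cannot blow up the queue.
            if len(seen) >= max_articles:
                break
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return visited


def load_or_fetch_text(
    language: str,
    fetch: Callable[[], str],
    cache_dir: Path,
) -> str:
    """Return cached text for a language, calling fetch() only on a cache miss."""
    cache_file = cache_dir / f"{language}.txt"
    if cache_file.exists():
        return cache_file.read_text(encoding="utf-8")
    text = fetch()
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(text, encoding="utf-8")
    return text
```

Passing a small `max_articles` value also gives the quick debugging runs that the commit message mentions, since the crawl stops after a handful of pages.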