Improve compatibility testing of supported NLP models #622

Open
joshdevins opened this issue Oct 12, 2023 · 1 comment

@joshdevins
Member

Today we rely mostly on unit tests to cover the PyTorch/NLP model import. We perform large-scale testing as part of other components like Elasticsearch, but we often find bugs only later and cannot tie them to a specific change in eland (e.g. a specific PR). We'd like to improve integration testing in eland by running a test matrix of models against multiple Elasticsearch versions. For each model, we'd test multiple inputs of various lengths, up to and beyond the model's input limit, and validate the inference results from Elasticsearch directly against results from transformers as ground truth. The tests should run as part of the normal CI cycle and must pass before a PR can be merged.

More details will follow in this issue, such as the list of models to test.
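
To make the proposal above a bit more concrete, here is a minimal sketch of three building blocks such a test could use: picking input lengths around a model's limit, pairing models with Elasticsearch versions, and comparing numeric results within a tolerance. All helper names, the chosen length multipliers, and the tolerances are assumptions for illustration only; none of this is existing eland code, and the actual calls to Elasticsearch and transformers are deliberately left out.

```python
import math
from itertools import product


def input_lengths(max_tokens: int) -> list[int]:
    """Token counts to probe: short, half-way, at the limit, and beyond it.

    The specific multipliers are an illustrative assumption.
    """
    candidates = [
        1,
        max_tokens // 2,
        max_tokens - 1,
        max_tokens,
        max_tokens + 1,
        max_tokens * 2,
    ]
    # Drop non-positive values and duplicates (relevant for tiny limits).
    return sorted({n for n in candidates if n > 0})


def build_test_matrix(models: list[str], es_versions: list[str]) -> list[tuple[str, str]]:
    """Every (model, Elasticsearch version) combination to run."""
    return list(product(models, es_versions))


def results_match(
    expected: list[float],
    actual: list[float],
    rel_tol: float = 1e-4,
    abs_tol: float = 1e-5,
) -> bool:
    """Compare ground-truth values with values returned by inference.

    Tolerances are hypothetical; real thresholds would need tuning per task.
    """
    if len(expected) != len(actual):
        return False
    return all(
        math.isclose(e, a, rel_tol=rel_tol, abs_tol=abs_tol)
        for e, a in zip(expected, actual)
    )
```

In a real test, `expected` would come from running the model through transformers and `actual` from the Elasticsearch inference response for the same input. An exact equality check is avoided here on the assumption that the two runtimes can differ slightly in floating-point output.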

@joshdevins joshdevins added the topic:NLP Issue or PR about NLP model support and eland_import_hub_model label Oct 12, 2023
@joshdevins
Member Author

joshdevins commented Oct 12, 2023

The following is a list of models that we wish to verify compatibility with, per task type. The list is based on the base models and tokenizers that we support, and the tasks we support.

  • fill_mask

    • bert-base-uncased
    • distilbert-base-uncased
    • distilroberta-base
    • roberta-base
    • xlm-roberta-base
    • cl-tohoku/bert-base-japanese
    • google/electra-base-discriminator
    • google/mobilebert-uncased
    • facebook/bart-base
    • microsoft/mpnet-base
    • squeezebert/squeezebert-uncased
  • ner

    • dbmdz/bert-large-cased-finetuned-conll03-english
    • dslim/bert-base-NER
    • elastic/distilbert-base-cased-finetuned-conll03-english
    • elastic/distilbert-base-uncased-finetuned-conll03-english
  • text_expansion

  • text_classification

    • distilbert-base-uncased-finetuned-sst-2-english
    • SamLowe/roberta-base-go_emotions
    • cardiffnlp/twitter-roberta-base-irony
    • ProsusAI/finbert
    • j-hartmann/emotion-english-distilroberta-base
    • cardiffnlp/twitter-roberta-base-sentiment-latest
    • roberta-large-mnli
    • huggingface/distilbert-base-uncased-finetuned-mnli
  • text_embedding

    • intfloat/e5-small-v2
    • intfloat/e5-base-v2
    • intfloat/e5-large-v2
    • intfloat/multilingual-e5-small
    • intfloat/multilingual-e5-base
    • intfloat/multilingual-e5-large
    • BAAI/bge-small-en-v1.5
    • BAAI/bge-base-en-v1.5
    • BAAI/bge-large-en-v1.5
    • thenlper/gte-small
    • thenlper/gte-base
    • thenlper/gte-large
    • sentence-transformers/all-mpnet-base-v2
    • sentence-transformers/all-MiniLM-L6-v2
    • sentence-transformers/all-MiniLM-L12-v2
    • sentence-transformers/all-distilroberta-v1
    • sentence-transformers/multi-qa-MiniLM-L6-cos-v1
    • sentence-transformers/multi-qa-mpnet-base-dot-v1
    • sentence-transformers/paraphrase-multilingual-mpnet-base-v2
    • sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
    • sentence-transformers/paraphrase-MiniLM-L3-v2
    • sentence-transformers/paraphrase-MiniLM-L6-v2
    • sentence-transformers/paraphrase-mpnet-base-v2
    • sentence-transformers/distiluse-base-multilingual-cased-v2
    • sentence-transformers/msmarco-distilbert-base-tas-b
    • sentence-transformers/msmarco-MiniLM-L12-cos-v5
    • sentence-transformers/LaBSE
    • sentence-transformers/facebook-dpr-ctx_encoder-single-nq-base
    • sentence-transformers/facebook-dpr-question_encoder-single-nq-base
    • sentence-transformers/facebook-dpr-ctx_encoder-multiset-base
    • sentence-transformers/facebook-dpr-question_encoder-multiset-base
    • facebook/dpr-ctx_encoder-single-nq-base
    • facebook/dpr-question_encoder-single-nq-base
    • facebook/dpr-ctx_encoder-multiset-base
    • facebook/dpr-question_encoder-multiset-base
  • zero_shot_classification

    • facebook/bart-large-mnli
    • vicgalle/xlm-roberta-large-xnli-anli
    • valhalla/distilbart-mnli-12-1
    • typeform/distilbert-base-uncased-mnli
  • question_answering

    • deepset/roberta-base-squad2
    • distilbert-base-uncased-distilled-squad
    • deepset/bert-large-uncased-whole-word-masking-squad2
  • text_similarity

    • cross-encoder/ms-marco-MiniLM-L-6-v2
    • cross-encoder/ms-marco-TinyBERT-L-2-v2
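
One possible way to feed a list like the one above into a test suite is to keep it as a mapping from task type to model IDs and expand it into individual test cases. The sketch below uses only a small subset of the models listed here; the names `MODELS_BY_TASK`, `iter_cases`, and `case_id` are hypothetical and not part of eland.

```python
from collections.abc import Iterator

# Subset of the list above, keyed by task type (illustrative, not complete).
MODELS_BY_TASK: dict[str, list[str]] = {
    "fill_mask": ["bert-base-uncased", "distilroberta-base"],
    "ner": ["dslim/bert-base-NER"],
    "text_expansion": [],  # no models are listed for this task in the issue
    "text_embedding": [
        "intfloat/e5-small-v2",
        "sentence-transformers/all-MiniLM-L6-v2",
    ],
    "text_similarity": ["cross-encoder/ms-marco-MiniLM-L-6-v2"],
}


def iter_cases(models_by_task: dict[str, list[str]]) -> Iterator[tuple[str, str]]:
    """Yield one (task_type, model_id) pair per model to test."""
    for task, models in models_by_task.items():
        for model in models:
            yield task, model


def case_id(task: str, model: str) -> str:
    """Readable test ID; '/' is replaced so the ID is safe in file names."""
    return f"{task}:{model.replace('/', '__')}"


if __name__ == "__main__":
    for task, model in iter_cases(MODELS_BY_TASK):
        print(case_id(task, model))
```

The resulting pairs and IDs could then be passed to something like `pytest.mark.parametrize` so that each model shows up as a separate test result in CI.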

@joshdevins joshdevins changed the title from "Improve integration testing of uploaded NLP models" to "Improve compatibility testing of supported NLP models" Oct 12, 2023