Improve compatibility testing of supported NLP models #622

joshdevins · 2023-10-12T10:12:48Z

Today we rely mostly on unit tests to cover PyTorch/NLP model import. We also run large-scale tests as part of other components such as Elasticsearch, but those tend to surface bugs only later, and we can't tie them to a specific change in eland (e.g. a specific PR). We'd like to improve integration testing in eland by running a test matrix of models and multiple Elasticsearch versions. For each model, we'd test multiple inputs of various lengths, up to and beyond the model's input limit, and validate the inference results from Elasticsearch directly against results from transformers as ground truth. The tests should run as part of the normal CI cycle and must pass before a PR can be merged.
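
As a rough illustration of the shape such a matrix could take, here is a minimal sketch. Everything in it is an assumption made for illustration: the model subset, the placeholder Elasticsearch versions, the 512 limit, the tolerances, and the helper names (`build_matrix`, `vectors_match`) are hypothetical and not part of eland. It only enumerates test cases and compares two score vectors; it does not talk to Elasticsearch or transformers.

```python
import itertools
import math

# Hypothetical subset of models per task type, for illustration only.
MODELS = {
    "fill_mask": ["bert-base-uncased"],
    "text_embedding": ["intfloat/e5-small-v2"],
}

# Placeholder versions; the real list would come from CI configuration.
ES_VERSIONS = ["8.10.2", "8.11.0"]


def input_lengths(max_tokens):
    """Input sizes below, at, and beyond an assumed input limit."""
    return [1, max_tokens // 2, max_tokens, max_tokens * 2]


def make_text(n_words):
    """Build a synthetic input. Words are only a rough proxy for tokens."""
    return " ".join(["word"] * n_words)


def build_matrix(models, es_versions, max_tokens=512):
    """Enumerate (es_version, task, model_id, n_words) test cases."""
    cases = []
    for task, model_ids in models.items():
        for model_id, version, n_words in itertools.product(
            model_ids, es_versions, input_lengths(max_tokens)
        ):
            cases.append((version, task, model_id, n_words))
    return cases


def vectors_match(actual, expected, rel_tol=1e-4, abs_tol=1e-5):
    """Element-wise comparison of two float vectors within a tolerance."""
    if len(actual) != len(expected):
        return False
    return all(
        math.isclose(a, e, rel_tol=rel_tol, abs_tol=abs_tol)
        for a, e in zip(actual, expected)
    )
```

In a real test, `actual` would be the result returned by Elasticsearch inference and `expected` the result computed with transformers for the same input. A tolerance-based comparison is used here on the assumption that the two runtimes will not agree bit for bit; suitable thresholds would need to be determined per task.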

More details to follow in this issue such as the list of models to test.


joshdevins · 2023-10-12T13:46:29Z

The following is the list of models we want to verify compatibility with, grouped by task type. It is based on the base models and tokenizers we support and on the tasks we support.

fill_mask
- bert-base-uncased
- distilbert-base-uncased
- distilroberta-base
- roberta-base
- xlm-roberta-base
- cl-tohoku/bert-base-japanese
- google/electra-base-discriminator
- google/mobilebert-uncased
- facebook/bart-base
- microsoft/mpnet-base
- squeezebert/squeezebert-uncased
ner
- dbmdz/bert-large-cased-finetuned-conll03-english
- dslim/bert-base-NER
- elastic/distilbert-base-cased-finetuned-conll03-english
- elastic/distilbert-base-uncased-finetuned-conll03-english
text_expansion
text_classification
- distilbert-base-uncased-finetuned-sst-2-english
- SamLowe/roberta-base-go_emotions
- cardiffnlp/twitter-roberta-base-irony
- ProsusAI/finbert
- j-hartmann/emotion-english-distilroberta-base
- cardiffnlp/twitter-roberta-base-sentiment-latest
- roberta-large-mnli
- huggingface/distilbert-base-uncased-finetuned-mnli
text_embedding
- intfloat/e5-small-v2
- intfloat/e5-base-v2
- intfloat/e5-large-v2
- intfloat/multilingual-e5-small
- intfloat/multilingual-e5-base
- intfloat/multilingual-e5-large
- BAAI/bge-small-en-v1.5
- BAAI/bge-base-en-v1.5
- BAAI/bge-large-en-v1.5
- thenlper/gte-small
- thenlper/gte-base
- thenlper/gte-large
- sentence-transformers/all-mpnet-base-v2
- sentence-transformers/all-MiniLM-L6-v2
- sentence-transformers/all-MiniLM-L12-v2
- sentence-transformers/all-distilroberta-v1
- sentence-transformers/multi-qa-MiniLM-L6-cos-v1
- sentence-transformers/multi-qa-mpnet-base-dot-v1
- sentence-transformers/paraphrase-multilingual-mpnet-base-v2
- sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
- sentence-transformers/paraphrase-MiniLM-L3-v2
- sentence-transformers/paraphrase-MiniLM-L6-v2
- sentence-transformers/paraphrase-mpnet-base-v2
- sentence-transformers/distiluse-base-multilingual-cased-v2
- sentence-transformers/msmarco-distilbert-base-tas-b
- sentence-transformers/msmarco-MiniLM-L12-cos-v5
- sentence-transformers/LaBSE
- sentence-transformers/facebook-dpr-ctx_encoder-single-nq-base
- sentence-transformers/facebook-dpr-question_encoder-single-nq-base
- sentence-transformers/facebook-dpr-ctx_encoder-multiset-base
- sentence-transformers/facebook-dpr-question_encoder-multiset-base
- facebook/dpr-ctx_encoder-single-nq-base
- facebook/dpr-question_encoder-single-nq-base
- facebook/dpr-ctx_encoder-multiset-base
- facebook/dpr-question_encoder-multiset-base
zero_shot_classification
- facebook/bart-large-mnli
- vicgalle/xlm-roberta-large-xnli-anli
- valhalla/distilbart-mnli-12-1
- typeform/distilbert-base-uncased-mnli
question_answering
- deepset/roberta-base-squad2
- distilbert-base-uncased-distilled-squad
- deepset/bert-large-uncased-whole-word-masking-squad2
text_similarity
- cross-encoder/ms-marco-MiniLM-L-6-v2
- cross-encoder/ms-marco-TinyBERT-L-2-v2
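
One possible way to drive tests from a list like this is to keep it as a mapping from task type to model IDs and expand it into one import invocation per model. The sketch below is an illustration under assumptions: `TASK_MODELS` holds only a few entries from the list above, `import_commands` is a hypothetical helper, and the Elasticsearch URL is a placeholder. It only builds argument lists for `eland_import_hub_model` and does not run them.

```python
# Small excerpt of the list above, keyed by task type (illustration only).
TASK_MODELS = {
    "ner": ["dslim/bert-base-NER"],
    "text_expansion": [],  # no models listed for this task yet
    "text_similarity": [
        "cross-encoder/ms-marco-MiniLM-L-6-v2",
        "cross-encoder/ms-marco-TinyBERT-L-2-v2",
    ],
}


def import_commands(task_models, es_url):
    """Build one eland_import_hub_model argument list per (task, model)."""
    commands = []
    for task, model_ids in task_models.items():
        for model_id in model_ids:
            commands.append([
                "eland_import_hub_model",
                "--url", es_url,
                "--hub-model-id", model_id,
                "--task-type", task,
            ])
    return commands


if __name__ == "__main__":
    # Placeholder URL; a CI job would point this at its own cluster.
    for cmd in import_commands(TASK_MODELS, "http://localhost:9200"):
        print(" ".join(cmd))
```

Task types with no models, such as `text_expansion` here, simply produce no commands, so the mapping can mirror the list as written.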

joshdevins added the topic:NLP label Oct 12, 2023

joshdevins assigned pquentin Oct 12, 2023

joshdevins changed the title ~~Improve integration testing of uploaded NLP models~~ Improve compatibility testing of supported NLP models Oct 12, 2023
