
MS Marco Cross Encoder Evaluation? #2638

Open
yildize opened this issue May 12, 2024 · 0 comments
yildize commented May 12, 2024

Hello there, I have some questions regarding MS Marco cross-encoder evaluation.

In the docs: https://www.sbert.net/docs/pretrained-models/msmarco-v3.html

  • MRR@10 (MS Marco Dev) scores are noted for different models.

In the training/ms_marco section: https://github.com/UKPLab/sentence-transformers/tree/master/examples/training/ms_marco

  • train_cross-encoder_scratch.py shows how to train a cross-encoder from scratch, but the evaluation there uses https://sbert.net/datasets/msmarco-qidpidtriples.rnd-shuf.train-eval.tsv.gz, which I believe is just a small train/dev dataset of 500 queries, each with 1-3 positive and 500 negative passages. From it, a train_dev set of 200 queries, each with 1-3 positives and 200 negatives, is constructed (see the sketch after this list).
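
For reference, here is a minimal sketch of how I understand that dev split is built and scored with the `CERerankingEvaluator` that train_cross-encoder_scratch.py uses. The local file names, the sampling caps, and the model checkpoint below are my assumptions, not necessarily the script's exact values:

```python
import gzip

from sentence_transformers import CrossEncoder
from sentence_transformers.cross_encoder.evaluation import CERerankingEvaluator

def load_tsv(path):
    """Load an id<TAB>text file (e.g. queries.train.tsv / collection.tsv) into a dict."""
    open_fn = gzip.open if path.endswith(".gz") else open
    mapping = {}
    with open_fn(path, "rt", encoding="utf8") as f:
        for line in f:
            _id, text = line.rstrip("\n").split("\t", 1)
            mapping[_id] = text
    return mapping

queries = load_tsv("queries.train.tsv")  # assumed local MS MARCO files
corpus = load_tsv("collection.tsv")

# The eval file holds (qid, positive_pid, negative_pid) triples.
num_dev_queries, num_max_negatives = 200, 200  # the sizes described above
dev_samples = {}
with gzip.open("msmarco-qidpidtriples.rnd-shuf.train-eval.tsv.gz", "rt", encoding="utf8") as f:
    for line in f:
        qid, pos_pid, neg_pid = line.strip().split("\t")
        if qid not in dev_samples and len(dev_samples) < num_dev_queries:
            dev_samples[qid] = {"query": queries[qid], "positive": set(), "negative": set()}
        if qid in dev_samples:
            dev_samples[qid]["positive"].add(corpus[pos_pid])
            if len(dev_samples[qid]["negative"]) < num_max_negatives:
                dev_samples[qid]["negative"].add(corpus[neg_pid])

samples = [
    {"query": s["query"], "positive": list(s["positive"]), "negative": list(s["negative"])}
    for s in dev_samples.values()
]

evaluator = CERerankingEvaluator(samples, mrr_at_k=10, name="train-eval")
model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
print(evaluator(model))  # MRR@10 over this small held-out split
```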

Question 1
Are the MRR@10 (MS Marco Dev) scores listed on https://www.sbert.net/docs/pretrained-models/msmarco-v3.html computed on the same train_dev set used in the training code above, or on some other dev dataset? Where can I find the actual evaluation dataset and evaluation code to reproduce your results?

More specifically, I want to see how your models compare to other alternatives on MS Marco, but I am not sure how to evaluate them formally. E.g., which exact dev dataset should I use, and should I rerank 200 negatives per query as in your training code, or something else?
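
To make the question concrete, this is roughly the evaluation I have in mind: rerank a candidate list per dev query with the cross-encoder and compute MRR@10 against the official qrels. The candidate source (e.g. BM25 top-1000 per query) and the data structures here are my assumptions:

```python
from sentence_transformers import CrossEncoder

model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

def mrr_at_10(dev_queries, candidates, qrels):
    """Compute MRR@10 over the dev queries.

    dev_queries: {qid: query text}
    candidates:  {qid: [(pid, passage text), ...]}  e.g. BM25 top-1000 per query
    qrels:       {qid: set of relevant pids}        e.g. from qrels.dev.tsv
    """
    reciprocal_ranks = []
    for qid, query in dev_queries.items():
        # Score every (query, passage) candidate pair with the cross-encoder.
        pairs = [(query, passage) for _, passage in candidates[qid]]
        scores = model.predict(pairs)
        # Rank candidates by score, highest first.
        ranked = sorted(zip(candidates[qid], scores), key=lambda x: x[1], reverse=True)
        # Reciprocal rank of the first relevant passage within the top 10, else 0.
        rr = 0.0
        for rank, ((pid, _), _) in enumerate(ranked[:10], start=1):
            if pid in qrels.get(qid, set()):
                rr = 1.0 / rank
                break
        reciprocal_ranks.append(rr)
    return sum(reciprocal_ranks) / len(reciprocal_ranks)
```

Is this the right setup, or does the reported number come from a different candidate set?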

Question 2
There are some additional evaluation scripts in the repo, namely eval_msmarco.py and eval_cross-encoder-trec-dl.py, but I guess the first one (eval_msmarco.py) is only for bi-encoders, am I right? And the second one is for evaluating a cross-encoder on TREC-DL?

Thanks in advance.
