I have some questions regarding MS MARCO cross-encoder training:
Question 1
As far as I can see from the docs, many of your bi-encoder training methods rely heavily on the performance of cross-encoders (e.g., finding hard negatives, pseudo-labeling, and more), yet the docs offer very limited resources on cross-encoder training itself.
Is https://github.com/UKPLab/sentence-transformers/tree/master/examples/training/ms_marco the only place that covers training cross-encoders, or am I missing more?
Question 2
As far as I can tell from the training script "train_cross-encoder_scratch.py", we train a binary classifier with examples of the form (query, negative_passage, 0) or (query, positive_passage, 1), using a negative-to-positive ratio of 4. A minimal sketch of my understanding of this setup follows below.
But I am confused about one point: the ms_marco dataset can contain false negatives, right? So an arbitrary (query, negative_passage, 0) pair may actually be a false negative. Isn't that problematic, and doesn't it cause performance degradation? Am I missing something here?
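For reference, here is a minimal sketch of the binary-classifier setup as I understand it. The model name, hyperparameters, and example texts are my own assumptions for illustration, not values taken from the script:

```python
# Sketch of the binary-relevance cross-encoder training setup
# (model name, batch size, and example texts are assumptions).
from torch.utils.data import DataLoader
from sentence_transformers import InputExample
from sentence_transformers.cross_encoder import CrossEncoder

pos_neg_ratio = 4  # 4 negatives sampled per positive, as I read the script

# One positive plus pos_neg_ratio negatives per query (texts are invented):
train_samples = [
    InputExample(texts=["what is python", "Python is a programming language."], label=1),
    InputExample(texts=["what is python", "Pythons are large constricting snakes."], label=0),
    # ... 3 more negatives with label=0 for this query ...
]

# num_labels=1 -> a single relevance logit trained with a binary loss
model = CrossEncoder("distilroberta-base", num_labels=1, max_length=512)
train_dataloader = DataLoader(train_samples, shuffle=True, batch_size=32)
model.fit(train_dataloader=train_dataloader, epochs=1, warmup_steps=1000)
```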
Question 3
Where does this 4:1 negative-to-positive ratio come from? Are there alternative training methods? Is there a paper describing this cross-encoder training?
Question 4
Do you have a guide for training a cross-encoder (not a bi-encoder) in another language?
Question 5
Do you have a guide to fine-tune a cross-encoder (not a bi-encoder) for domain adaptation?
Thanks in advance.