
Integration with huggingface trainer, or even direct pytorch training #2627

ydennisy opened this issue May 1, 2024 · 3 comments

ydennisy commented May 1, 2024

Hello @tomaarsen

Firstly, sorry to ping you directly, but also a big thank you for your work and that of the other contributors on this project!

This is not the first time this has been asked, but I wanted to bring it back to your attention.

I feel sentence-transformers is an excellent library for inference and quick prototyping when you need embeddings, but as soon as any fine-tuning or model changes are needed, the API feels clunky, mainly because it is non-standard compared to more established tooling. In short, I would ideally like to be able to use the HF Trainer, and also a direct PyTorch training loop, to fine-tune and analyse models.

Is there any reason this is not something you feel would be very valuable?

Happy to elaborate on the reasons, but this is mainly to do with tracking metrics such as loss in tooling like W&B.

Thanks in advance!
D

@tomaarsen (Collaborator) commented:

Hello!

I have a great surprise for you: a v3 pre-release with essentially your proposed plan is already ready, and is just waiting on additional documentation before it's released. This will be the general training loop:

from datasets import load_dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import MultipleNegativesRankingLoss

# 1. Load a model to finetune
model = SentenceTransformer("microsoft/mpnet-base")

# 2. Load a dataset to finetune on
dataset = load_dataset("sentence-transformers/all-nli", "pair")
train_dataset = dataset["train"]
eval_dataset = dataset["dev"]

# 3. Define a loss function (MultipleNegativesRankingLoss uses in-batch negatives with (anchor, positive) pairs)
loss = MultipleNegativesRankingLoss(model)

# 4. Create a trainer & train
trainer = SentenceTransformerTrainer(
    model=model,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    loss=loss,
)
trainer.train()

# 5. Save the trained model
model.save("models/mpnet-base-all-nli")

The new SentenceTransformerTrainer subclasses the HF Trainer, so training should feel very familiar if you know how that Trainer works. See #2449 for more info on the new training loop. So, yes, this new Trainer has direct integrations with W&B and TensorBoard, and it also introduces training & evaluation loss logging, which had been missing.
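
To make the logging part concrete: since the trainer subclasses the HF Trainer, metric reporting should be configurable through the usual TrainingArguments-style fields. A minimal sketch, assuming the pre-release exposes a SentenceTransformerTrainingArguments counterpart (report_to, logging_steps, and eval_steps are the standard HF Trainer parameters, not something specific to this library):

from sentence_transformers import (
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)

# Inherited HF TrainingArguments fields control where metrics go
args = SentenceTransformerTrainingArguments(
    output_dir="models/mpnet-base-all-nli",
    eval_strategy="steps",   # "evaluation_strategy" in older transformers versions
    eval_steps=100,          # log the evaluation loss every 100 steps
    logging_steps=100,       # log the training loss every 100 steps
    report_to=["wandb"],     # or ["tensorboard"]
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    loss=loss,
)
trainer.train()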

Additionally, this message has 3 advanced training scripts and this message has 2 advanced training scripts. Also, #2622 has a bunch more training scripts.

Here are some example models produced by these training scripts:


As for "a direct PyTorch training loop to fine-tune and analyse models":

I think I will leave this to the "advanced users", as some people tend to prefer to train "their way". That will continue to be possible, albeit perhaps with some hacks. There are some challenges with the current API of a SentenceTransformer object that I can't change without fairly major repercussions for third-party applications that rely on Sentence Transformers.
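
For completeness, here is a rough sketch of what such a manual PyTorch loop can look like today. This is an illustration rather than an official API: it drives the loss module directly with the output of model.tokenize, and that feature-dict format is an internal detail that may change between versions:

import torch
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import MultipleNegativesRankingLoss

model = SentenceTransformer("microsoft/mpnet-base")
loss_fn = MultipleNegativesRankingLoss(model)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# One toy batch of (anchor, positive) pairs
anchors = ["A person is eating.", "A man is riding a horse."]
positives = ["Someone is having a meal.", "A rider sits on a horse."]

# The loss expects one tokenized feature dict per text column,
# with tensors on the same device as the model
features = [
    {k: v.to(model.device) for k, v in model.tokenize(texts).items()}
    for texts in (anchors, positives)
]

loss = loss_fn(features, labels=None)  # MultipleNegativesRankingLoss ignores labels
loss.backward()
optimizer.step()
optimizer.zero_grad()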

  • Tom Aarsen

@harry7171 commented:

Hi @tomaarsen,
This is great; I was going to raise an issue about the same thing.

Can you let me know when the version with SentenceTransformerTrainer will be released? I am planning to use it in a current ongoing workstream.

Thanks in advance

@tomaarsen (Collaborator) commented:

Hello!

The current goal is to release in around 1.5-2 weeks. All that remains for now is some bugfixing & (re)writing documentation.
The v3.0-pre-release branch already closely resembles what will eventually be released.
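
If you want to experiment before the release, installing directly from that branch with pip should work (standard pip Git-install syntax; only the branch name is taken from the message above):

pip install git+https://github.com/UKPLab/sentence-transformers.git@v3.0-pre-release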

  • Tom Aarsen
