Add BloomForSequenceClassification and BloomForTokenClassification classes #17639
The documentation is not available anymore as the PR was closed or merged. |
This PR should be ready for review. Some tests are failing, but I am not sure why. I think I've looked at all the failing ones, and their failures seem to be unrelated to anything I changed (i.e., image segmentation tests and tokenization tests fail, and text generation pipeline |
LGTM once all tests pass!
We had an outage on the Hub last Friday and this weekend, relaunched all tests which should remove all flaky failures :-)
Hi @haileyschoelkopf, I have managed to fix some of the tests that fail on this PR. Can I push directly to this PR in case the tests still fail on your side? |
Ah thank you @younesbelkada , you got to fixing the tests before I had time to do it! I think that you can push to this PR already (you're marked as a maintainer right?) so feel free to do so :) |
- more tests should pass - one test left
Perfect thanks! Just pushed my commit |
After a minor change to return type of This should be ready to merge now :) |
Left a very small comment! Otherwise looks very good to me! Thank you very much for helping us implement these classes
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Comments should be resolved now! |
Waiting for the last lights to be green 🟢 and we'll merge! |
Hi @haileyschoelkopf , thanks for adding this! Do you have any recommendations for a good hyper-param configuration (batch size, learning rate) when doing NER? I tried CoNLL-2003 with the |
I don't know if this is related, but bloom-350m's pre-training has not finished yet! You may want to try bloom-1b3 instead, a model whose pre-training has been completed: https://huggingface.co/bigscience/bloom-1b3 |
Hi @stefan-it, I agree with what @younesbelkada said, bloom-1b3 is worth trying since it's finished pretraining! We haven't actually gotten to NER experiments but I'll let you know if we do end up finding good hyperparams. |
Hi @younesbelkada and @haileyschoelkopf thanks for that hint! I used the |
What does this PR do?
This PR adds 2 new classes for the BLOOM model, one with a sequence classification head and one with a token classification head. I mentioned this briefly in PR #17474.
We are planning to use the smaller BLOOM models for these tasks downstream in the Bigscience Multilingual Modeling WG, so we need these classes implemented to do so.
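For readers unfamiliar with the two head types, here is a minimal pure-Python sketch of how they differ. It is an illustration under stated assumptions, not the code added in this PR: the helper names (`linear`, `token_classification_logits`, `sequence_classification_logits`) are hypothetical, the hidden states are made-up numbers standing in for the decoder output, and the choice to pool the last non-padding token is assumed here because that is the usual convention for decoder-only classifiers. Details such as dropout or bias in the real classes may differ.

```python
from typing import List

Vector = List[float]


def linear(x: Vector, weight: List[Vector], bias: Vector) -> Vector:
    """Dense layer: one output per row of `weight` (hypothetical helper)."""
    return [
        sum(w_i * x_i for w_i, x_i in zip(row, x)) + b
        for row, b in zip(weight, bias)
    ]


def token_classification_logits(
    hidden_states: List[Vector], weight: List[Vector], bias: Vector
) -> List[Vector]:
    """Token classification: apply the head to every position (e.g. for NER)."""
    return [linear(h, weight, bias) for h in hidden_states]


def sequence_classification_logits(
    hidden_states: List[Vector],
    input_ids: List[int],
    pad_token_id: int,
    weight: List[Vector],
    bias: Vector,
) -> Vector:
    """Sequence classification: apply the head to one pooled position.

    Assumption: the sequence is right-padded and the last non-padding
    token summarizes it, since a causal model only attends leftwards.
    """
    last = max(i for i, tok in enumerate(input_ids) if tok != pad_token_id)
    return linear(hidden_states[last], weight, bias)


# Toy example: 4 positions (the last one is padding), hidden size 2, 3 labels.
hidden_states = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [9.0, 9.0]]
input_ids = [5, 6, 7, 0]
pad_token_id = 0
weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
bias = [0.0, 0.0, 0.5]

per_token = token_classification_logits(hidden_states, weight, bias)
per_sequence = sequence_classification_logits(
    hidden_states, input_ids, pad_token_id, weight, bias
)
print(len(per_token), "rows of token logits")
print("sequence logits:", per_sequence)
```

The token head returns one row of logits per input position, while the sequence head returns a single row for the whole input and ignores the padded position.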
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.
Will tag patrickvonplaten and sgugger when ready--still need to write tests first!
Let me know if anything is wrong with this PR!