Add `BloomForSequenceClassification` and `BloomForTokenClassification` classes #17639

haileyschoelkopf · 2022-06-09T18:51:09Z

What does this PR do?

This PR adds two new classes for the BLOOM model, one with a sequence classification head and one with a token classification head. I mentioned this briefly in PR #17474.

We are planning to use the smaller BLOOM models for these tasks downstream in the BigScience Multilingual Modeling WG, so we need these classes implemented.
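
For readers unfamiliar with how a classification head sits on a decoder-only model: a sequence classification head of this kind typically scores every position and then keeps the logits at the last non-padding token of each sequence. Below is a simplified pure-Python sketch of that pooling step. It is not the code from this PR; the helper names `last_non_pad_index` and `pool_last_token` are hypothetical, the pad id `0` is made up for the toy batch, and right padding is assumed.

```python
from typing import List, Sequence


def last_non_pad_index(input_ids: Sequence[int], pad_token_id: int) -> int:
    """Return the index of the last non-padding token (assumes right padding)."""
    n_real = sum(1 for tok in input_ids if tok != pad_token_id)
    if n_real == 0:
        raise ValueError("sequence contains only padding tokens")
    return n_real - 1


def pool_last_token(
    logits: Sequence[Sequence[Sequence[float]]],
    input_ids: Sequence[Sequence[int]],
    pad_token_id: int,
) -> List[List[float]]:
    """Pick, for each sequence, the per-label logits at its last real token.

    `logits` is indexed as [batch][position][label].
    """
    pooled = []
    for seq_logits, seq_ids in zip(logits, input_ids):
        idx = last_non_pad_index(seq_ids, pad_token_id)
        pooled.append(list(seq_logits[idx]))
    return pooled


# Toy batch: two sequences, right-padded with a hypothetical pad id of 0.
input_ids = [[5, 7, 9, 0], [4, 6, 0, 0]]
logits = [
    [[0.1, 0.9], [0.2, 0.8], [0.7, 0.3], [0.0, 0.0]],
    [[0.4, 0.6], [0.9, 0.1], [0.0, 0.0], [0.0, 0.0]],
]
pooled = pool_last_token(logits, input_ids, pad_token_id=0)
```

A token classification head, by contrast, keeps the logits at every position instead of pooling them.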

Before submitting

- This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case). -> NO
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
- Did you write any new necessary tests? -> TODO

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.

I will tag patrickvonplaten and sgugger when this is ready; I still need to write tests first!

Let me know if anything is wrong with this PR!

HuggingFaceDocBuilderDev · 2022-06-09T23:04:27Z

The documentation is not available anymore as the PR was closed or merged.

haileyschoelkopf · 2022-06-10T21:58:54Z

@patrickvonplaten @sgugger

This PR should be ready for review.

Some tests are failing, but I am not sure why. I think I've looked at all the failing ones, and their failures seem unrelated to anything I changed: image segmentation and tokenization tests fail, and the text generation pipeline test test_small_model_pt seems to fail because "sshleifer/tiny-ctrl" cannot be downloaded. None of these relate to files I touched in this PR, as far as I know. Hopefully I'm not mistaken on this!

sgugger

LGTM once all tests pass!
We had an outage on the Hub last Friday and over the weekend. I relaunched all the tests, which should clear the flaky failures :-)

src/transformers/models/bloom/modeling_bloom.py

younesbelkada · 2022-06-14T09:23:15Z

Hi @haileyschoelkopf, I have managed to fix some of the tests that fail on this PR. Can I push directly to this PR in case the tests still fail on your side?

haileyschoelkopf · 2022-06-14T12:08:35Z

Ah, thank you @younesbelkada, you got to fixing the tests before I had time to do it!

I think you can already push to this PR (you're marked as a maintainer, right?), so feel free to do so :)

younesbelkada · 2022-06-14T12:14:40Z

Perfect, thanks! I just pushed my commit.

haileyschoelkopf · 2022-06-14T13:39:44Z

After a minor change to the return type of `BloomForTokenClassification`, all tests pass!

This should be ready to merge now :)

src/transformers/models/bloom/modeling_bloom.py

younesbelkada

I left one very small comment! Otherwise this looks very good to me. Thank you very much for helping us implement these classes.

haileyschoelkopf · 2022-06-14T14:05:26Z

Comments should be resolved now!

younesbelkada · 2022-06-14T14:10:09Z

Waiting for the last lights to turn green 🟢 and then we'll merge!

stefan-it · 2022-06-17T14:55:31Z

Hi @haileyschoelkopf,

thanks for adding this! Do you have any recommendations for a good hyperparameter configuration (batch size, learning rate) for NER? I tried CoNLL-2003 with the bigscience/bloom-350m checkpoint, but results are around 67% on the test set, which is very bad. I used 10 epochs, a learning rate of 5e-06 and a batch size of 4, a setup that worked well with XLM-R Large; the run took 114 minutes on a V100 using fp16.
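
The setup described above can be written down as a small config sketch, which also makes the resulting number of optimizer steps explicit. The class and field names (`NerRunConfig` and its methods) are hypothetical and not tied to any library. The training-set size of 14,041 sentences is an assumption based on the commonly reported size of the CoNLL-2003 English training split, and a single device with no gradient accumulation is assumed.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class NerRunConfig:
    """Hypothetical container for the hyperparameters reported in this comment."""

    model_name: str = "bigscience/bloom-350m"
    num_epochs: int = 10
    learning_rate: float = 5e-06
    batch_size: int = 4
    fp16: bool = True

    def steps_per_epoch(self, num_train_examples: int) -> int:
        # One optimizer step per batch; the last batch may be smaller.
        return math.ceil(num_train_examples / self.batch_size)

    def total_steps(self, num_train_examples: int) -> int:
        return self.num_epochs * self.steps_per_epoch(num_train_examples)


config = NerRunConfig()

# Assumption: size of the CoNLL-2003 English training split.
ASSUMED_TRAIN_SIZE = 14_041
total = config.total_steps(ASSUMED_TRAIN_SIZE)
```

Under these assumptions the run performs 3,511 updates per epoch and 35,110 in total, which is the budget any change to batch size or epoch count would alter.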

younesbelkada · 2022-06-17T14:59:53Z

I don't know if this is related, but bloom-350m's pre-training has not finished yet! You may want to try bloom-1b3, whose pre-training has been completed: https://huggingface.co/bigscience/bloom-1b3

haileyschoelkopf · 2022-06-17T18:04:48Z

Hi @stefan-it, I agree with what @younesbelkada said, bloom-1b3 is worth trying since it's finished pretraining! We haven't actually gotten to NER experiments but I'll let you know if we do end up finding good hyperparams.

stefan-it · 2022-06-17T22:10:17Z

Hi @younesbelkada and @haileyschoelkopf, thanks for the hint! I used the bigscience/bloom-1b3 model with DeepSpeed, and the results after one epoch of fine-tuning (same hyperparameters as mentioned above) are also very bad. So I'm going to tune the hyperparameters. Please let me know if you find a working setup for your NER task(s) 🤗

haileyschoelkopf added 2 commits June 9, 2022 14:27

add new bloom classes

35996fc

(feat) add bloom classification tests; make style

56b5c16

haileyschoelkopf added 2 commits June 10, 2022 15:23

style: change import in test

ccba202

add some typehints to bloom classes

bf546ee

haileyschoelkopf marked this pull request as ready for review June 10, 2022 21:58

sgugger approved these changes Jun 13, 2022

src/transformers/models/bloom/modeling_bloom.py

haileyschoelkopf and others added 6 commits June 13, 2022 11:53

merge main into branch

5656360

fix merge conflicts

8c0ab2c

fix: input checking in bloom seq classification

4134e49

fix tests

8402089

change model class tests

bfc6d49

Merge branch 'main' into bloom_classification_heads

331ad3c

fix few tests

d9fc6fb

- more tests should pass - one test left

haileyschoelkopf added 2 commits June 14, 2022 09:02

make token classifier return hidden states

a41b278

Merge branch 'bloom_classification_heads' of https://github.com/haile…

1d59140

…yschoelkopf/transformers into bloom_classification_heads

haileyschoelkopf requested a review from younesbelkada June 14, 2022 13:40

younesbelkada reviewed Jun 14, 2022

src/transformers/models/bloom/modeling_bloom.py

younesbelkada reviewed Jun 14, 2022

src/transformers/models/bloom/modeling_bloom.py

younesbelkada approved these changes Jun 14, 2022

style: make BLOOM typehints consistent

194996e

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

younesbelkada changed the title ~~[WIP] Add BloomForSequenceClassification and BloomForTokenClassification classes~~ Add BloomForSequenceClassification and BloomForTokenClassification classes Jun 14, 2022

younesbelkada merged commit edb672a into huggingface:main Jun 14, 2022

haileyschoelkopf deleted the bloom_classification_heads branch June 14, 2022 15:18
