[WIP] Add Jukebox model #16875

ArthurZucker · 2022-04-21T13:30:35Z

This is a draft pull request.

What does this PR do?

This PR will progressively add the Jukebox model to the hub.
It is linked to #16870.

Currently planned steps (WIP)

Create template files with transformers-cli add-new-model-like
src/transformers/tokenization_jukebox.py
src/transformers/test_tokenization_jukebox.py
src/transformers/configuration_jukebox.py
src/transformers/modeling_jukebox.py
docs/source/model_doc/jukebox.rst
src/transformers/tokenization_jukebox_fast.py (will most probably use the WordLevel tokenizer). This also requires implementing a converter class: class JukeboxConverter(Converter):
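
For context on the WordLevel choice: a word-level tokenizer maps each whole token to an id through a vocabulary lookup and falls back to an unknown-token id otherwise. The following is only a plain-Python sketch of that idea. The `WordLevelSketch` class, the `<unk>` token, and the vocabulary are hypothetical; this is neither the Jukebox tokenizer nor the `tokenizers` library API.

```python
from typing import Dict, List


class WordLevelSketch:
    """Hypothetical word-level tokenizer: one id per whitespace-separated token."""

    def __init__(self, vocab: Dict[str, int], unk_token: str = "<unk>") -> None:
        if unk_token not in vocab:
            raise ValueError(f"unk_token {unk_token!r} must be in the vocabulary")
        self.vocab = vocab
        self.unk_token = unk_token
        # Reverse mapping, used to turn ids back into tokens.
        self.ids_to_tokens = {idx: tok for tok, idx in vocab.items()}

    def encode(self, text: str) -> List[int]:
        # Tokens missing from the vocabulary map to the unknown-token id.
        unk_id = self.vocab[self.unk_token]
        return [self.vocab.get(tok, unk_id) for tok in text.split()]

    def decode(self, ids: List[int]) -> str:
        # Ids missing from the reverse mapping decode to the unknown token.
        return " ".join(self.ids_to_tokens.get(i, self.unk_token) for i in ids)


# Made-up vocabulary, for illustration only.
vocab = {"<unk>": 0, "rock": 1, "jazz": 2, "pop": 3}
tokenizer = WordLevelSketch(vocab)
ids = tokenizer.encode("rock jazz metal")
```

Here "metal" is not in the made-up vocabulary, so it is encoded with the id of `<unk>`.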

ArthurZucker · 2022-04-27T13:28:32Z

The tokenizer and its corresponding test should be done. They still lack a detailed description, and probably something about the init arguments that are not data; I don't remember whether I should create setters for them (@patrickvonplaten, I would love to have your review).

tests/jukebox/test_tokenization_jukebox.py

patrickvonplaten · 2022-05-02T17:20:06Z

Cool, nice to see so much progress here!

Feel free to also add a file that shows how you compare OpenAI's original implementation to the current (HF) one.
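
One possible shape for such a comparison file is sketched below, under assumptions: both implementations are run on the same input, their outputs are flattened to lists of floats, and the largest absolute difference is checked against a tolerance. The helper names `max_abs_diff` and `outputs_match`, the tolerance, and the sample values are hypothetical and do not come from either codebase.

```python
from typing import Sequence


def max_abs_diff(reference: Sequence[float], candidate: Sequence[float]) -> float:
    """Largest element-wise absolute difference between two flat outputs."""
    if len(reference) != len(candidate):
        raise ValueError("outputs must have the same number of elements")
    return max((abs(r - c) for r, c in zip(reference, candidate)), default=0.0)


def outputs_match(
    reference: Sequence[float], candidate: Sequence[float], atol: float = 1e-4
) -> bool:
    """True when every element of candidate is within atol of reference."""
    return max_abs_diff(reference, candidate) <= atol


# Placeholder values standing in for the two implementations' outputs.
original_out = [0.1234, -1.5, 2.0]
ported_out = [0.12341, -1.50002, 2.0]

print("max abs diff:", max_abs_diff(original_out, ported_out))
print("match at atol=1e-4:", outputs_match(original_out, ported_out))
```

Reporting the maximum difference, rather than only pass or fail, makes it easier to see whether a mismatch is numerical noise or a real divergence between the two implementations.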

… in case of overflowing tokens (huggingface#17092) * add get_overflowing_images function to ensure 1-to-1 mapping between samples and images in LayoutLMv2Processor * make style * add test for overflowing_tokens, change assert to ValueError, avoiding unrelated formatting changes * change line length by passing --preview into black

…ggingface#17123) * Add type hints for remaining BigBirdPegasus models Here I added type hints to the BigBirdPegasusForCausalLM class. * Add missing type hints for Data2VecText models Added type hints to the Data2VecTextForCausalLM, Data2VecTextForMaskedLM, Data2VecTextForMultipleChoice, Data2VecTextForQuestionAnswering, Data2VecTextForSequenceClassification, and Data2VecTextForTokenClassification classes.

…ith try-except (huggingface#16578) * rebase and isort * modify cookiecutter init * fix cookiecutter auto imports * fix clean_frameworks_in_init * fix add_model_to_main_init * blackify * replace unnecessary f-strings * update yolos imports * fix roberta import bug * fix yolos missing dependency * fix add_model_like and cookiecutter bug * fix repository consistency error * modify cookiecutter, fix add_new_model_like * remove stale line Co-authored-by: Dom Miketa <dmiketa@exscientia.co.uk>

…huggingface#17130) * ensure mlflow.end_run() is executed at end of training when mlflow.start_run() was executed by the callback * add debug msg * add support for MLFLOW_TAGS, MLFLOW_RUN_ID, and MLFLOW_NESTED_RUN * update to support python 3.6+ * Validate env variables using ENV_VARS_TRUE_VALUES * Empty-Commit

patrickvonplaten · 2022-05-30T12:25:16Z

What happened to the git commit history here?

ArthurZucker · 2022-05-30T12:41:03Z

I rebased instead of merging 🤕. I will create a new PR to replace this one.

ArthurZucker · 2022-06-22T17:45:13Z

See followup in #17826

ArthurZucker linked an issue Apr 21, 2022 that may be closed by this pull request

OpenAI's Jukebox for music generation #16870

Closed

2 tasks

ArthurZucker self-assigned this Apr 26, 2022

ArthurZucker added the WIP label Apr 26, 2022

ArthurZucker added this to In progress in New model additions via automation Apr 26, 2022

ArthurZucker requested a review from patrickvonplaten April 27, 2022 13:26

patrickvonplaten reviewed Apr 27, 2022

tests/jukebox/test_tokenization_jukebox.py (outdated, resolved)

patrickvonplaten reviewed Apr 27, 2022

tests/jukebox/test_tokenization_jukebox.py (outdated, resolved)

ydshieh and others added 21 commits May 6, 2022 07:45

Fix self-push CI report path in cat (huggingface#17111)

351cdbd

* fix report cat path * fix report cat path Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

Added BigBirdPegasus onnx config (huggingface#17104)

215e068

* Add onnx configuration for bigbird-pegasus * Modify docs

fix hp bug

3ff2ed2

update test file to loead dummy weights for testing

5f84cf1

templat convert file

7335fbf

make style

f7c375e

update tokenizer and paths + quality

199e032

update tokenizer to correct config

045ee7b

cleqn init

9faad5f

split single_gpu and multi_gpu (huggingface#17083)

3212afa

* split single_gpu and multi_gpu * update needs in send_result Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>

add mobilebert onnx configs (huggingface#17029)

dc3645d

* update docs of length_penalty * Revert "update docs of length_penalty" This reverts commit 466bf48. * add mobilebert onnx config * address suggestions * Update auto.mdx * Update __init__.py * Update features.py

PyTorch FSDP integration in Trainer (huggingface#17136)

05fc176

* PyTorch FSDP integration in Trainer * reformatting make style and make quality are now compliant. * Updating dependency check * Trigger CI Co-authored-by: Sylvain Gugger <Sylvain.gugger@gmail.com>

Fix quality and repo consistency

7783fa6

Add the auto_find_batch_size capability from Accelerate into Trainer (h…

2fbb237

…uggingface#17068) Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> - Adds auto_batch_size finder - Moves training loop to an inner training loop

Fix all docs for accelerate install directions (huggingface#17145)

d719bcd

LogSumExp trick question_answering pipeline. (huggingface#17143)

6d80c92

* LogSumExp trick `question_answering` pipeline. * Adding a failing test.

Debugged different outputa

a9e8517

ArthurZucker and others added 8 commits May 30, 2022 08:45

make style

7b8da50

make quality

a8b124f

fix duplicate

61d78b6

Merge branch 'add_jukebox' of https://github.com/ArthurZucker/transfo…

37c3f9a

…rmers into add_jukebox

style and re-ordering

02f2eda

moved jukebox test

d767f79

fixup and copies

a15b851

udpate

1c9d346

ArthurZucker and others added 13 commits May 30, 2022 15:39

update scripts

573f0af

remove unused and wrong import

e185653

update

b5ac1da

update test

910cf3f

begin gpu support

97e8162

update device

8fd494e

update

e836d18

style

23a64d4

update test

68eef31

updatex

72f4210

update tests parameters

106d179

test from the notebook

a8f87b7

Merge Main

be6a271

ArthurZucker mentioned this pull request Jun 22, 2022

Add Jukebox model (replaces #16875) #17826

Merged

8 tasks

ArthurZucker closed this Jun 22, 2022

ArthurZucker removed this from In progress in New model additions Nov 8, 2022

ArthurZucker added a commit that referenced this pull request Nov 10, 2022

Add Jukebox model (replaces #16875) (#17826)

61a51f5

amyeroberts pushed a commit to amyeroberts/transformers that referenced this pull request Nov 14, 2022

Add Jukebox model (replaces huggingface#16875) (huggingface#17826)

fa969e6

mpierrau pushed a commit to mpierrau/transformers that referenced this pull request Dec 15, 2022

Add Jukebox model (replaces huggingface#16875) (huggingface#17826)

95a0fcd
