The BLOOM model and its various versions were proposed through the BigScience Workshop. BigScience is inspired by other open-science initiatives in which researchers pool their time and resources to collectively achieve a higher impact. The architecture of BLOOM is essentially similar to that of GPT-3 (an auto-regressive model trained for next-token prediction), but it has been trained on 46 different languages as well as code. Several smaller versions of the model have been trained on the same dataset. BLOOM is available in the following versions:
- bloom-350m
- bloom-760m
- bloom-1b3
- bloom-2b5
- bloom-6b3
- bloom (175B parameters)
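Since BLOOM is an auto-regressive causal language model, any of the checkpoints above can be used for text generation through the standard `AutoModelForCausalLM` API. A minimal sketch, assuming one of the smaller checkpoints is available on the Hub under the `bigscience` organization (note that checkpoint names on the Hub may differ slightly from the list above):

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

# Assumed checkpoint name; swap in any of the sizes listed above.
# The 175B model requires substantial hardware and is not suitable for a quick test.
checkpoint = "bigscience/bloom-560m"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)

# Tokenize a prompt and generate a greedy continuation.
inputs = tokenizer("BLOOM is a multilingual language model that", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=20, do_sample=False)
text = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(text)
```

Greedy decoding (`do_sample=False`) is used here for reproducibility; sampling parameters such as `temperature` and `top_p` can be passed to `generate` for more varied output.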
[[autodoc]] BloomConfig - all
[[autodoc]] BloomModel - forward
[[autodoc]] BloomTokenizerFast - all
[[autodoc]] BloomForCausalLM - forward