model has multi checkpoint file can't be loaded to train #934

Closed
apachemycat opened this issue May 4, 2024 · 2 comments
Comments

@apachemycat

ls /models/meta-Llama-3-8B
LICENSE
README.md
USE_POLICY.md
config.json
generation_config.json
ggml-model-f16.gguf
model-00001-of-00004.safetensors
model-00002-of-00004.safetensors
model-00003-of-00004.safetensors
model-00004-of-00004.safetensors
model.safetensors.index.json
special_tokens_map.json
tokenizer.json
tokenizer_config.json


checkpointer:
  _component_: torchtune.utils.FullModelMetaCheckpointer
  checkpoint_dir: /models/meta-Llama-3-8B
  checkpoint_files: [
    model-00001-of-00004.safetensors,
    model-00002-of-00004.safetensors,
    model-00003-of-00004.safetensors,
    model-00004-of-00004.safetensors
  ]
  recipe_checkpoint: null
  output_dir: /tmp/Meta-Llama-3-8B/
  model_type: LLAMA3
resume_from_checkpoint: False


ValueError: Currently we only support reading from a single TorchTune checkpoint file. Got 4 files instead.

@kartikayk
Contributor

Thanks for opening this issue!

safetensors is a format from Hugging Face (HF), and these files usually contain HF-formatted checkpoints. To make this work, you need to update the checkpointer component in the config from FullModelMetaCheckpointer to FullModelHFCheckpointer. This will load the checkpoint and do the conversions needed to correctly interpret the weight tensors. You can read this deep dive to better understand checkpoint formats and how torchtune deals with them.
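
For reference, here is a rough sketch of what the updated checkpointer section might look like. It is the config from the original report with only the component name swapped. The `torchtune.utils` module path is an assumption carried over from that config, and it may differ in other torchtune versions. All paths and file names are the reporter's own.

```yaml
checkpointer:
  # Only this line changes: Meta checkpointer -> HF checkpointer.
  # Assumption: same module path (torchtune.utils) as in the original config.
  _component_: torchtune.utils.FullModelHFCheckpointer
  checkpoint_dir: /models/meta-Llama-3-8B
  # The four safetensors shards listed in the original report.
  checkpoint_files: [
    model-00001-of-00004.safetensors,
    model-00002-of-00004.safetensors,
    model-00003-of-00004.safetensors,
    model-00004-of-00004.safetensors
  ]
  recipe_checkpoint: null
  output_dir: /tmp/Meta-Llama-3-8B/
  model_type: LLAMA3
resume_from_checkpoint: False
```

The Meta checkpointer expects a single checkpoint file, which is why it raised the ValueError on four files. The HF checkpointer is the one intended for multi-file safetensors checkpoints like this.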

@kartikayk
Contributor

@apachemycat let me know if this is still an issue and I'd be happy to reopen this! If not, I'm closing this for now.
