How can I find all the checkpoints and merge them manually? (LoRA) #922
Great job, guys, on this awesome tool. I have just started using it and am loving it already. I have one question: I am fine-tuning for 6 epochs and want to store each checkpoint separately. Later I would like to evaluate each checkpoint; how can I do that?
Comments
Hi @monk1337, thanks for the issue! Glad to hear you're finding the library useful. To clarify, are you interested in storing just the LoRA weights from the end of each epoch so that you can compare evaluations across different epochs? To give a bit more info: we write two checkpoints to your output directory at the end of each epoch, the full (merged) model weights and the adapter (LoRA-only) weights for that epoch. For evaluation, we also have an integration with EleutherAI's eval harness, so you are welcome to use that if you like. If you want more details on how to do this you can check out this section of our end-to-end tutorial. Let me know if this makes sense or if there's something else you're looking for here; happy to address any follow-ups you may have.
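For example, here is a minimal sketch of how you might enumerate the per-epoch LoRA checkpoints for later comparison. The adapter_*.pt naming is an assumption about your checkpointer output, not a guaranteed file name, so adjust the glob to whatever actually lands in your output directory:

```python
from pathlib import Path

import torch

# Hypothetical output directory from the fine-tuning run.
output_dir = Path("output")

# One adapter checkpoint per epoch; sorting keeps them in epoch order.
for ckpt in sorted(output_dir.glob("adapter_*.pt")):
    # Assumes each checkpoint is a flat {name: tensor} state dict.
    state_dict = torch.load(ckpt, map_location="cpu")
    n_params = sum(t.numel() for t in state_dict.values())
    print(f"{ckpt.name}: {len(state_dict)} tensors, {n_params:,} LoRA parameters")
```

From here you can point your evaluation of choice (e.g. the eval harness mentioned above) at each epoch's checkpoint in turn.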
@ebsmothers Thank you for your detailed reply. I have one follow-up question: how can I convert this merged-model folder, which contains multiple .pt files, into the Hugging Face format?
Hi @monk1337, this is a good question. The format we output should generally adhere to the same format as the inputs (i.e. the logic for distributing weights across files should line up exactly), so in this case the format should still match HF. The main difference is that, as you pointed out, we write out .pt files rather than safetensors.
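If you do want safetensors, here is a rough sketch of a one-off conversion. The directory path and the *.pt glob are assumptions about your particular run, and this skips the model.safetensors.index.json bookkeeping that sharded HF repos usually carry, so treat it as a starting point rather than a drop-in tool:

```python
from pathlib import Path

import torch
from safetensors.torch import save_file

# Hypothetical path to the folder holding the merged-model .pt shards.
ckpt_dir = Path("output")

for pt_file in sorted(ckpt_dir.glob("*.pt")):
    # Each shard is assumed to be a flat {name: tensor} state dict.
    state_dict = torch.load(pt_file, map_location="cpu")
    # safetensors refuses non-contiguous tensors, so normalize first.
    state_dict = {k: v.contiguous() for k, v in state_dict.items()}
    save_file(state_dict, str(pt_file.with_suffix(".safetensors")))
    print(f"wrote {pt_file.with_suffix('.safetensors').name}")
```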