
feat: Jan can load large model with multiple gguf files #2898

Open
hahuyhoang411 opened this issue May 14, 2024 · 1 comment
Assignees
Labels
P1: important (Important feature / fix) · roadmap: Cortex (Cortex, Cortex llama cpp, core extensions) · type: feature request (A new feature)

Comments

@hahuyhoang411 (Contributor)

Problem
Jan only supports loading one GGUF model file at a time.

Success Criteria
Jan can merge split GGUF files into a single file and load the resulting model for the user.

Additional context
Approach
https://www.reddit.com/r/LocalLLaMA/comments/1cf6n18/how_to_use_merge_70b_split_model_ggufpart1of2/
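As a sketch of one piece of this, the loader first needs to recognize a sharded model and collect all of its parts. Below is a hypothetical helper (not part of Jan's codebase) that assumes llama.cpp's `-NNNNN-of-NNNNN.gguf` shard naming convention; the function name and behavior are illustrative only:

```python
import re
from pathlib import Path

# llama.cpp names shards like "model-00001-of-00002.gguf"
SHARD_RE = re.compile(r"^(?P<base>.+)-(?P<idx>\d{5})-of-(?P<total>\d{5})\.gguf$")

def find_shards(first_file: str) -> list[str]:
    """Given any path, return the ordered list of shard paths if the file
    follows the split-GGUF naming convention, or the path itself otherwise.

    Raises FileNotFoundError if an expected shard is missing on disk.
    """
    path = Path(first_file)
    m = SHARD_RE.match(path.name)
    if m is None:
        # Not a split model: treat as a single-file GGUF
        return [str(path)]
    base, total = m.group("base"), int(m.group("total"))
    shards = [
        path.with_name(f"{base}-{i:05d}-of-{total:05d}.gguf")
        for i in range(1, total + 1)
    ]
    missing = [s.name for s in shards if not s.exists()]
    if missing:
        raise FileNotFoundError(f"missing shards: {missing}")
    return [str(s) for s in shards]
```

With the shard list in hand, Jan could either pass the first shard to a llama.cpp build that understands splits natively, or concatenate/merge the parts first, as the Reddit thread above describes.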

@hahuyhoang411 hahuyhoang411 added the type: feature request A new feature label May 14, 2024
@Van-QA Van-QA added this to the v. Ochazuke milestone May 14, 2024
@SwiftIllusion

Would also appreciate this, as I have run into the same limitation when trying to use the larger split models here: https://huggingface.co/MaziyarPanahi/WizardLM-2-8x22B-GGUF#load-sharded-model, where it was specifically mentioned to load them as sharded files and not combine them.
Another Reddit thread mentions this as well: https://www.reddit.com/r/LocalLLaMA/comments/1c2dfv6/loading_multipart_gguf_files_in/. It references a fix for text-generation-webui in oobabooga/text-generation-webui@e158299 (just for context; the steps to make Jan compatible may naturally differ).

@0xSage 0xSage added the P1: important Important feature / fix label May 22, 2024
@Van-QA Van-QA added the roadmap: Cortex Cortex, Cortex llama cpp, core extensions label May 31, 2024
Projects
Status: Planned
Development

No branches or pull requests

5 participants