Issues: databricks/megablocks
Issues list
Cloning input x in megablocks.layers.glu.SparseGLU leads to different SDD outputs
#115 opened May 28, 2024 by cmsflash
support amd/rocm
enhancement
New feature or request
help wanted
Extra attention is needed
#97 opened Mar 21, 2024 by ehartford
selective router precision
question
Further information is requested
#91 opened Jan 14, 2024 by 152334H
Does this framework support SFT?
question
Further information is requested
#90 opened Jan 12, 2024 by banksy23
RuntimeError: Triton Error [CUDA]: invalid argument
question
Further information is requested
#88 opened Jan 10, 2024 by noob-ctrl
different load_balancing_loss with different pipeline_parallel_size
question
Further information is requested
#85 opened Jan 5, 2024 by bozheng-hit
How to integrate to transformers-based mixtral
question
Further information is requested
#84 opened Jan 3, 2024 by nxphi47
ParallelDroplessMLP initialises self.mlp twice
enhancement
New feature or request
help wanted
Extra attention is needed
#83 opened Jan 1, 2024 by 152334H
Why the second matrix of the mlp layer has the same shape of the first one?
question
Further information is requested
#81 opened Dec 29, 2023 by gouchangjiang
[BUG] Optimizer Weights Not Reloaded When Training with bf16 Pretrained Weights
bug
Something isn't working
#80 opened Dec 26, 2023 by RookieHong
Script for Full Fine-Tuning of Mixtral
question
Further information is requested
#68 opened Dec 20, 2023 by alpayariyak