This file contains a high-level description of changes that were merged into the Inseq main branch since the last release. Refer to the releases page for an exhaustive overview of changes introduced at each release.
- Added new models `DbrxForCausalLM`, `OlmoForCausalLM`, `Phi3ForCausalLM`, `Qwen2MoeForCausalLM` to the model config.
- Fix the issue in the attention implementation from #268 where non-terminal positions in the tensor were set to NaN if they were 0s (#269).
- Fix the pad token in cases where it is not specified by default in the loaded model (e.g. for Qwen models) (#269).
- Fix the bug reported in #266 making `value_zeroing` unusable for SDPA attention. This enables using the method on models using SDPA attention as default (e.g. `GemmaForCausalLM`) without passing `model_kwargs={'attn_implementation': 'eager'}` (#267).
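For intuition on the attention masking fix (#269) above, here is a minimal, hypothetical sketch of the corrected behavior; the helper below is an illustration, not Inseq's actual implementation. Only positions outside the attended range should be masked to NaN, while genuine zero attention weights at valid positions must be preserved.

```python
import math

NAN = float("nan")

def mask_future_positions(weights, valid_len):
    # Hypothetical illustration of the fix: only positions beyond the
    # valid (attended) range become NaN; zeros inside the valid range
    # are legitimate attention weights and are kept as-is.
    return [w if i < valid_len else NAN for i, w in enumerate(weights)]

row = [0.5, 0.0, 0.5, 0.0, 0.0]
masked = mask_future_positions(row, valid_len=3)
# The zero at index 1 stays 0.0; only indices 3 and 4 become NaN.
```

The buggy behavior described in #268 would instead have turned the zero at a valid position into NaN, corrupting downstream attribution scores.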