Add ALTI+ implementation #200

Open

gsarti opened this issue Jul 10, 2023 · 2 comments
Labels
enhancement (New feature or request)

Comments

@gsarti
Member

gsarti commented Jul 10, 2023

Description

The ALTI+ method is an extension of ALTI for encoder-decoder (and by extension, decoder-only) models.

Authors: @gegallego @javiferran

Implementation notes:

  • The current implementation extracts the input features for the key, query, and value projections and computes the intermediate steps of the Kobayashi refactoring to obtain the transformed vectors used in the final ALTI computation (see the first sketch after this list).

  • The computation of attention layer outputs is carried out up to the resultant (i.e. the actual output of the attention layer) in order to check that the result matches the original output of the attention layer forward pass. This is done purely for sanity checking, but it is not especially heavy computationally, so it can be preserved (e.g. raise an error if the outputs do not match, signaling that the model may not be supported; see the second sketch below).

  • Focusing on GPT-2 as an example model, the per-head attention weights and outputs (i.e. the matmul of weights and value vectors) are returned here, so they can be extracted with a hook and used to compute the transformed vectors needed for ALTI (the first sketch below shows one way to do this).

  • Pre- and post-layer-norm models are handled differently because the transformed vectors are the final outputs of the attention block, so the layer norm needs to be included regardless of its position. In the Kobayashi decomposition of the attention layer, the bias components of both the layer norm and the output projection need to be kept separate, so we need to check whether this is possible out of the box or whether it needs to be computed in an ad-hoc hook.

  • If we are interested in the output vectors before the bias is added, we can extract the bias vector alongside the output of the attention module and subtract the former from the latter (see the third sketch below).

  • To aggregate ALTI+ scores into overall importance scores, we will use the extended rollout implementation currently being developed for the Value Zeroing attribution method in #173 (a simplified rollout is sketched below).
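
The extraction described in the first and third bullets could be prototyped with forward hooks. Below is a minimal, hypothetical sketch assuming the Hugging Face GPT-2 implementation, where the attention block's output projection is the `c_proj` Conv1D: its input is the concatenation of the per-head context vectors (attention weights @ values), so a pre-hook on it recovers the per-head outputs without modifying model code.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

model = GPT2LMHeadModel.from_pretrained("gpt2").eval()
per_head_ctx = {}  # layer index -> (batch, n_head, seq, head_dim)

def make_hook(layer_idx, n_head):
    def hook(module, args):
        hidden = args[0]  # merged heads: (batch, seq, n_embd)
        bsz, seq, n_embd = hidden.shape
        head_dim = n_embd // n_head
        # Undo merge_heads to recover the per-head weighted value vectors
        per_head_ctx[layer_idx] = (
            hidden.view(bsz, seq, n_head, head_dim).permute(0, 2, 1, 3)
        )
    return hook

handles = [
    block.attn.c_proj.register_forward_pre_hook(make_hook(i, model.config.n_head))
    for i, block in enumerate(model.transformer.h)
]

tok = GPT2TokenizerFast.from_pretrained("gpt2")
enc = tok("The ALTI+ method", return_tensors="pt")
with torch.no_grad():
    out = model(**enc, output_attentions=True)
# out.attentions[i]: per-head attention weights for layer i
# per_head_ctx[i]: matching per-head context vectors (weights @ values)
for h in handles:
    h.remove()
```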
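For the sanity check in the second bullet, the per-head context captured above can be pushed back through the output projection and compared against the module's observed output; a mismatch would signal an unsupported architecture. A hypothetical helper, assuming GPT-2's Conv1D projection (weight stored as (in_features, out_features)):

```python
import torch

def check_attention_reconstruction(per_head_ctx, attn_module, attn_output, atol=1e-5):
    """Recompute the attention block output from captured per-head context
    vectors and verify it matches the observed forward-pass output."""
    bsz, n_head, seq, head_dim = per_head_ctx.shape
    merged = per_head_ctx.permute(0, 2, 1, 3).reshape(bsz, seq, n_head * head_dim)
    # GPT-2's Conv1D computes x @ weight + bias, with weight of shape (in, out)
    recomputed = merged @ attn_module.c_proj.weight + attn_module.c_proj.bias
    if not torch.allclose(recomputed, attn_output, atol=atol):
        raise RuntimeError(
            "Recomputed attention output does not match the forward pass: "
            "this model may not be supported."
        )
```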
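For the pre-bias output vectors from the fifth bullet, no extra hook should be needed: the output projection bias is a model parameter, so it can simply be subtracted from the attention module's output (again assuming GPT-2 naming):

```python
# attn_output: output of block.attn for one layer, shape (batch, seq, n_embd)
pre_bias_output = attn_output - block.attn.c_proj.bias
```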
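Finally, pending the extended rollout from #173, a plain rollout in the spirit of Abnar & Zuidema (2020) is enough to sketch the aggregation step: per-layer contribution matrices are composed by matrix multiplication to obtain end-to-end token importances. This is a simplified stand-in, not the #173 implementation:

```python
import torch

def rollout(layer_scores: torch.Tensor) -> torch.Tensor:
    """Aggregate per-layer contribution matrices into overall importances.

    layer_scores: (n_layers, seq, seq), each row summing to 1
    (e.g. layer-wise ALTI contribution matrices).
    """
    joint = layer_scores[0]
    for layer in layer_scores[1:]:
        joint = layer @ joint  # compose contributions across layers
    return joint
```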

Reference implementation: mt-upc/transformer-contributions-nmt

@gsarti added the enhancement label on Jul 10, 2023
@gsarti mentioned this issue on Aug 14, 2023
@frankdarkluo

frankdarkluo commented Apr 11, 2024

Is this ALTI+ implementation integrated yet?

@gsarti
Member Author

gsarti commented Apr 11, 2024

Hi @frankdarkluo, sadly not yet! But there is a WIP PR for it here: #217
