[Summary] Add internals-based feature attribution methods #108

Open
gsarti opened this issue Nov 30, 2021 · 15 comments
Labels
enhancement (New feature or request) · help wanted (Extra attention is needed) · summary (Summarizes multiple sub-tasks)

Comments

gsarti (Member) commented Nov 30, 2021

🚀 Feature Request

The following is a non-exhaustive list of attention-based feature attribution methods that could be added to the library:

| Method name | Source | Code implementation | Status |
|---|---|---|---|
| Last-Layer Attention | Jain and Wallace '19 | successar/AttentionExplanation | |
| Aggregated Attention | Jain and Wallace '19 | successar/AttentionExplanation | |
| Attention Flow | Abnar and Zuidema '20 | samiraabnar/attention_flow | |
| Attention Rollout | Abnar and Zuidema '20 | samiraabnar/attention_flow | |
| Attention with Values Norm (Attn-N) | Kobayashi et al. '20 | gorokoba560/norm-analysis-of-transformer | |
| Attention with Residual Norm (AttnRes-N) | Kobayashi et al. '20 | gorokoba560/norm-analysis-of-transformer | |
| Attention with Attention Block Norm (AttnResLn-N or LnAttnRes-N) | Kobayashi et al. '21 | gorokoba560/norm-analysis-of-transformer | |
| Attention-driven Relevance Propagation | Chefer et al. '21 | hila-chefer/Transformer-MM-Explainability | |
| ALTI+ | Ferrando et al. '22 | mt-upc/transformer-contributions-nmt | |
| GlobEnc | Modarressi et al. '22 | mohsenfayyaz/globenc | |
| Attention with Attention Block + FFN Norm (AttnResLnFF-N or LnAttnResFF-N) | Kobayashi et al. '23 | - | |
| Attention x Transformer Block Norm | Kobayashi et al. '23 | - | |
| Logit | Ferrando et al. '23 | mt-upc/logit-explanations | |
| ALTI-Logit | Ferrando et al. '23 | mt-upc/logit-explanations | |
| DecompX | Modarressi et al. '23 | mohsenfayyaz/DecompX | |

Notes:

  1. Add the possibility to scale attention weights by the norms of the value vectors, which has been shown to be effective for alignment and encoder models (Ferrando and Costa-jussà '21, Treviso et al. '21); see the first sketch after this list.
  2. The ALTI+ technique extends the ALTI method by Ferrando et al. '22 (paper, code) to encoder-decoder architectures. It was recently used by the Facebook team to detect hallucinated toxicity by inspecting how much toxic keywords attend to the source (NLLB paper, Figure 31).
  3. Attention Flow is very expensive to compute, but it has proven SHAP guarantees for same-layer attribution, which is not the case for Rollout or other methods. Flow and Rollout should be implemented as propagation methods rather than stand-alone approaches, since they are used to propagate most attention-based attributions across layers; see the rollout sketch after this list.
  4. GlobEnc corresponds roughly to Attention x Transformer Block Norm but ignores the FFN part, which in the latter is incorporated via a localized application of Integrated Gradients with 0-valued baselines (the authors' default).
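
For note 1, here is a minimal PyTorch sketch of scaling attention weights by value-vector norms in the spirit of Kobayashi et al. '20's Attn-N. This is not Inseq's actual API; the tensor names, shapes, and head-averaging step are illustrative assumptions:

```python
import torch

def value_norm_attention(attn_weights: torch.Tensor, values: torch.Tensor) -> torch.Tensor:
    """Score source tokens by || alpha_ij * v_j || instead of the raw weight alpha_ij.

    attn_weights: (n_heads, tgt_len, src_len) attention probabilities of one layer
    values:       (n_heads, src_len, d_head) value vectors of the same layer
    returns:      (tgt_len, src_len) norm-based scores, averaged over heads
    """
    # alpha_ij * v_j for every head / target / source position
    weighted = attn_weights.unsqueeze(-1) * values.unsqueeze(1)  # (heads, tgt, src, d_head)
    # vector norm of each weighted value, then average across heads (a simplifying assumption)
    return weighted.norm(dim=-1).mean(dim=0)

# toy usage with random tensors
n_heads, tgt_len, src_len, d_head = 8, 5, 7, 64
attn = torch.softmax(torch.randn(n_heads, tgt_len, src_len), dim=-1)
vals = torch.randn(n_heads, src_len, d_head)
scores = value_norm_attention(attn, vals)  # shape (5, 7)
```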
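
For note 3, a minimal sketch of Attention Rollout as a propagation step over per-layer attention maps, assuming head-averaged attention and the 0.5/0.5 residual mixing described by Abnar and Zuidema '20. Function and tensor names are hypothetical, not part of the library:

```python
import torch

def attention_rollout(per_layer_attn: list[torch.Tensor]) -> torch.Tensor:
    """Roll attention out across layers, bottom to top.

    per_layer_attn: list of (n_heads, seq_len, seq_len) self-attention maps
    returns:        (seq_len, seq_len) rolled-out token-to-token attribution
    """
    seq_len = per_layer_attn[0].size(-1)
    rollout = torch.eye(seq_len)
    for layer_attn in per_layer_attn:
        a = layer_attn.mean(dim=0)               # average over heads: (seq, seq)
        a = 0.5 * a + 0.5 * torch.eye(seq_len)   # account for the residual connection
        a = a / a.sum(dim=-1, keepdim=True)      # re-normalize rows
        rollout = a @ rollout                    # propagate through this layer
    return rollout
```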
@gsarti gsarti added the enhancement (New feature or request), help wanted (Extra attention is needed), and good first issue (Good for newcomers) labels Nov 30, 2021
@gsarti gsarti added this to the v1.0 milestone Nov 30, 2021
@gsarti gsarti added the summary (Summarizes multiple sub-tasks) label Dec 1, 2021
@gsarti gsarti removed the good first issue (Good for newcomers) label Apr 8, 2022
@lsickert lsickert self-assigned this Oct 25, 2022
@lsickert lsickert linked a pull request Nov 26, 2022 that will close this issue
@gsarti gsarti removed a link to a pull request Jan 14, 2023
@gsarti gsarti pinned this issue Jan 24, 2023
@lsickert lsickert removed their assignment Feb 22, 2023
@gsarti gsarti removed this from the Demo Paper Release milestone May 8, 2023
@gsarti gsarti changed the title from "[Summary] Add attention-based feature attribution methods" to "[Summary] Add internals-based feature attribution methods" Jun 6, 2023