[Summary] Add internals-based feature attribution methods #108

Open
gsarti opened this issue Nov 30, 2021 · 15 comments
Labels
enhancement (New feature or request) · help wanted (Extra attention is needed) · summary (Summarizes multiple sub-tasks)

Comments

gsarti (Member) commented Nov 30, 2021

🚀 Feature Request

The following is a non-exhaustive list of attention-based feature attribution methods that could be added to the library:

| Method name | Source | Code implementation | Status |
|---|---|---|---|
| Last-Layer Attention | Jain and Wallace '19 | successar/AttentionExplanation | |
| Aggregated Attention | Jain and Wallace '19 | successar/AttentionExplanation | |
| Attention Flow | Abnar and Zuidema '20 | samiraabnar/attention_flow | |
| Attention Rollout | Abnar and Zuidema '20 | samiraabnar/attention_flow | |
| Attention with Values Norm (Attn-N) | Kobayashi et al. '20 | gorokoba560/norm-analysis-of-transformer | |
| Attention with Residual Norm (AttnRes-N) | Kobayashi et al. '20 | gorokoba560/norm-analysis-of-transformer | |
| Attention with Attention Block Norm (AttnResLn-N or LnAttnRes-N) | Kobayashi et al. '21 | gorokoba560/norm-analysis-of-transformer | |
| Attention-driven Relevance Propagation | Chefer et al. '21 | hila-chefer/Transformer-MM-Explainability | |
| ALTI+ | Ferrando et al. '22 | mt-upc/transformer-contributions-nmt | |
| GlobEnc | Modarressi et al. '22 | mohsenfayyaz/globenc | |
| Attention with Attention Block + FFN Norm (AttnResLnFF-N or LnAttnResFF-N) | Kobayashi et al. '23 | - | |
| Attention x Transformer Block Norm | Kobayashi et al. '23 | - | |
| Logit | Ferrando et al. '23 | mt-upc/logit-explanations | |
| ALTI-Logit | Ferrando et al. '23 | mt-upc/logit-explanations | |
| DecompX | Modarressi et al. '23 | mohsenfayyaz/DecompX | |

Notes:

  1. Add the possibility to scale attention weights by the norms of the value vectors, which has been shown to be effective for alignment and encoder models (Ferrando and Costa-jussà '21, Treviso et al. '21); see the first sketch after this list.
  2. The ALTI+ technique extends the ALTI method by Ferrando et al. '22 (paper, code) to encoder-decoder architectures. It was recently used by the Facebook team to detect hallucinated toxicity by inspecting how much toxic keywords attend to the source (NLLB paper, Figure 31).
  3. Attention Flow is very expensive to compute, but it has proven SHAP guarantees for same-layer attribution, which is not the case for Rollout or other methods. Flow and Rollout should be implemented as propagation methods rather than stand-alone approaches, since they are used to propagate most attention-based attributions across layers; see the rollout sketch after this list.
  4. GlobEnc corresponds roughly to Attention x Transformer Block Norm but ignores the FFN part, which in the latter is incorporated via a localized application of Integrated Gradients with 0-valued baselines (the authors' default).
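
For note 1, here is a minimal PyTorch sketch of scaling attention weights by value-vector norms in the spirit of Kobayashi et al. '20's Attn-N. This is not Inseq's actual API; the tensor names, shapes, and head-averaging step are illustrative assumptions:

```python
import torch

def value_norm_attention(attn_weights: torch.Tensor, values: torch.Tensor) -> torch.Tensor:
    """Score source tokens by || alpha_ij * v_j || instead of the raw weight alpha_ij.

    attn_weights: (n_heads, tgt_len, src_len) attention probabilities of one layer
    values:       (n_heads, src_len, d_head) value vectors of the same layer
    returns:      (tgt_len, src_len) norm-based scores, averaged over heads
    """
    # alpha_ij * v_j for every head / target / source position
    weighted = attn_weights.unsqueeze(-1) * values.unsqueeze(1)  # (heads, tgt, src, d_head)
    # vector norm of each weighted value, then average across heads (a simplifying assumption)
    return weighted.norm(dim=-1).mean(dim=0)

# toy usage with random tensors
n_heads, tgt_len, src_len, d_head = 8, 5, 7, 64
attn = torch.softmax(torch.randn(n_heads, tgt_len, src_len), dim=-1)
vals = torch.randn(n_heads, src_len, d_head)
scores = value_norm_attention(attn, vals)  # shape (5, 7)
```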
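
For note 3, a minimal sketch of Attention Rollout as a propagation step over per-layer attention maps, assuming head-averaged attention and the 0.5/0.5 residual mixing described by Abnar and Zuidema '20. Function and tensor names are hypothetical, not part of the library:

```python
import torch

def attention_rollout(per_layer_attn: list[torch.Tensor]) -> torch.Tensor:
    """Roll attention out across layers, bottom to top.

    per_layer_attn: list of (n_heads, seq_len, seq_len) self-attention maps
    returns:        (seq_len, seq_len) rolled-out token-to-token attribution
    """
    seq_len = per_layer_attn[0].size(-1)
    rollout = torch.eye(seq_len)
    for layer_attn in per_layer_attn:
        a = layer_attn.mean(dim=0)               # average over heads: (seq, seq)
        a = 0.5 * a + 0.5 * torch.eye(seq_len)   # account for the residual connection
        a = a / a.sum(dim=-1, keepdim=True)      # re-normalize rows
        rollout = a @ rollout                    # propagate through this layer
    return rollout
```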
@gsarti gsarti added the enhancement (New feature or request), help wanted (Extra attention is needed), and good first issue (Good for newcomers) labels Nov 30, 2021
@gsarti gsarti added this to the v1.0 milestone Nov 30, 2021
@gsarti gsarti added the summary (Summarizes multiple sub-tasks) label Dec 1, 2021
@gsarti gsarti removed the good first issue (Good for newcomers) label Apr 8, 2022
@lsickert lsickert self-assigned this Oct 25, 2022
@lsickert lsickert linked a pull request Nov 26, 2022 that will close this issue
@gsarti gsarti removed a link to a pull request Jan 14, 2023
@gsarti gsarti pinned this issue Jan 24, 2023
@lsickert lsickert removed their assignment Feb 22, 2023
@gsarti gsarti removed this from the Demo Paper Release milestone May 8, 2023
@gsarti gsarti changed the title from "[Summary] Add attention-based feature attribution methods" to "[Summary] Add internals-based feature attribution methods" Jun 6, 2023