
Basic Attention attribution #148

Merged: 35 commits merged into main on Jan 16, 2023

Conversation

@lsickert (Collaborator) commented Nov 23, 2022

Description

This PR adds the base class for attribution methods based on attention, as well as two basic attention attribution methods (aggregated attention and last-layer attention).

It also includes a small fix for the rounding of outputs in the CLI table view.

It reverts the previous upgrade of PyTorch to ^1.13.0 because of an issue with installing the dependency on certain platforms such as OSX (see the related PyTorch issues: issue1, issue2).

Related Issue

#108

Type of Change

  • 📚 Examples / docs / tutorials / dependencies update
  • 🔧 Bug fix (non-breaking change which fixes an issue)
  • 🥂 Improvement (non-breaking change which improves an existing feature)
  • 🚀 New feature (non-breaking change which adds functionality)
  • 💥 Breaking change (fix or feature that would cause existing functionality to change)
  • 🔐 Security fix

Checklist

  • I've read the CODE_OF_CONDUCT.md document.
  • I've read the CONTRIBUTING.md guide.
  • I've updated the code style using make codestyle.
  • I've written tests for all new methods and classes that I created.
  • I've written the docstring in Google format for all the methods and classes that I used.

@github-actions bot left a comment

Hello @lsickert, thank you for submitting a PR! We will respond as soon as possible.

@lsickert changed the title from "Basic Attention attribution" to "[WIP] Basic Attention attribution" on Nov 23, 2022
@lsickert linked an issue on Nov 26, 2022 that may be closed by this pull request
@gsarti (Member) commented Dec 5, 2022

Note: it's good to have the summary issue linked here, but we don't want to close it just yet! :)

@lsickert added the "enhancement" (New feature or request) label on Dec 12, 2022
@lsickert (Collaborator, Author) commented Jan 2, 2023

@gsarti I was working on the decoder-only models and came across some inconsistencies between the different models. For example, both GPT and Transformer-XL only include attentions in their forward-pass output, whereas GPT2 includes both attentions and cross_attentions. For now I only use the attentions parameter to generate the attributions, since it is present in all decoder-only models, but I am not sure whether we should also use the cross-attentions in the models where they are present.

@gsarti (Member) commented Jan 2, 2023

Hi @lsickert, good question! The cross-attentions are defined for GPT2 and other decoder-only models to support their usage as components of an encoder-decoder via the EncoderDecoderModel abstraction in 🤗 transformers. If the model is loaded as decoder-only, it should only have regular self-attention, so you can assume those are the only attentions we are interested in for that case!
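
For reference, here is a minimal sketch of where the two fields live in the 🤗 transformers output; the model, prompt, and print statements are purely illustrative and not part of this PR:

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("Hello world", return_tensors="pt")
outputs = model(**inputs, output_attentions=True)

# Self-attention weights: one tensor per layer,
# each of shape (batch, num_heads, seq_len, seq_len).
print(len(outputs.attentions), outputs.attentions[0].shape)

# cross_attentions is only populated when the model runs as the decoder of an
# encoder-decoder (e.g. inside EncoderDecoderModel); in a plain decoder-only
# forward pass it is absent or None.
print(getattr(outputs, "cross_attentions", None))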

@lsickert (Collaborator, Author) commented Jan 2, 2023

> Hi @lsickert, good question! The cross-attentions are defined for GPT2 and other decoder-only models to support their usage as components of an encoder-decoder via the EncoderDecoderModel abstraction in 🤗 transformers. If the model is loaded as decoder-only, it should only have regular self-attention, so you can assume those are the only attentions we are interested in for that case!

Ah, perfect. Yes, I assumed something like that but was not entirely sure. I think the decoder-only support for the basic attention functions is now done. I still need to write tests tomorrow and finish up the docstrings and other small things, but apart from that I think the branch is ready for merging.

@gsarti (Member) commented Jan 3, 2023

Also, some usage issues I identified:

  1. Using last_layer_attention with encoder-decoder models produces the following error, which does not occur for aggregated_attention:
out = model.attribute("The cafeteria had 23 apples. They used 20 for lunch. How many apples do they have left?")

RuntimeError: stack expects each tensor to be equal size, but got [1] at entry 0 and [1, 21] at entry 1. This looks to me like the error you were getting when you just started working on the attention attribution methods.

  2. Running attribution with decoder-only models using any attention method produces the following error:
/usr/local/lib/python3.8/dist-packages/inseq/data/attribution.py in <listcomp>(.0)
    115         sources = None
    116         if attr.source_attributions is not None:
--> 117             sources = [drop_padding(attr.source[seq_id], pad_id) for seq_id in range(num_sequences)]
    118         targets = [
    119             drop_padding([a.target[seq_id][0] for a in attributions], pad_id) for seq_id in range(num_sequences)

TypeError: 'NoneType' object is not subscriptable

Is it possible that you are not setting source attributions to None in the decoder-only case?

  3. If I use facebook/wmt19-en-de for a translation with aggregated_attention (which works for other enc-dec models), I get forward() missing 1 required positional argument: 'input_ids'. I believe this is due to a problem with _extract_forward_pass_args, which should be solved anyway when we drop the unnecessary method to conform to the approach used for step scores (see review above).

@lsickert (Collaborator, Author) commented Jan 3, 2023

> Also, some usage issues I identified:
>
> 1. Using `last_layer_attention` with encoder-decoder models produces the following error, which does not occur for `aggregated_attention`:
>
> out = model.attribute("The cafeteria had 23 apples. They used 20 for lunch. How many apples do they have left?")
>
> RuntimeError: stack expects each tensor to be equal size, but got [1] at entry 0 and [1, 21] at entry 1. This looks to me like the error you were getting when you just started working on the attention attribution methods.
>
> 2. Running attribution with decoder-only models using any attention method produces the following error:
>
> /usr/local/lib/python3.8/dist-packages/inseq/data/attribution.py in <listcomp>(.0)
>     115         sources = None
>     116         if attr.source_attributions is not None:
> --> 117             sources = [drop_padding(attr.source[seq_id], pad_id) for seq_id in range(num_sequences)]
>     118         targets = [
>     119             drop_padding([a.target[seq_id][0] for a in attributions], pad_id) for seq_id in range(num_sequences)
>
> TypeError: 'NoneType' object is not subscriptable
>
> Is it possible that you are not setting source attributions to None in the decoder-only case?
>
> 3. If I use `facebook/wmt19-en-de` for a translation with `aggregated_attention` (which works for other enc-dec models), I get `forward() missing 1 required positional argument: 'input_ids'`. I believe this is due to a problem with `_extract_forward_pass_args`, which should be solved anyway when we drop the unnecessary method to conform to the approach used for step scores (see review above).

Yes, the second point was my bad and should be fixed already. I did not notice that the changes to attention_attribution.py were not yet staged. I will take a look at the other points.

@lsickert (Collaborator, Author) commented Jan 4, 2023

The first error is fixed by making sure the dimensional size of the tensors stays the same for all tokens. Very interesting behavior, though, since I remember specifically having to add the torch.squeeze operation for last_layer_attention to work. Maybe something changed in torch 1.13 that made this unnecessary.
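
To illustrate the shape mismatch behind that RuntimeError (the shapes below are made up for demonstration and are not taken from the actual attribution code):

import torch

step_0 = torch.rand(1, 1)    # attention row at the first generation step
step_1 = torch.rand(1, 21)   # a later step attending over 21 positions

# An extra squeeze collapses step_0 to shape [1] while step_1 stays [1, 21]:
# torch.stack([step_0.squeeze(0), step_1])
# RuntimeError: stack expects each tensor to be equal size, but got [1] at entry 0 and [1, 21] at entry 1

# Keeping the dimensionality identical across steps (here by padding the shorter row) lets the stack succeed.
padded_0 = torch.nn.functional.pad(step_0, (0, step_1.size(-1) - step_0.size(-1)))
stacked = torch.stack([padded_0, step_1])  # shape: [2, 1, 21]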

@lsickert (Collaborator, Author) commented Jan 9, 2023

@gsarti I think all open points should be addressed now. Please feel free to test already while I clean up a bit and work on updating and creating the docstrings tomorrow.

@gsarti (Member) commented Jan 10, 2023

Thank you for the update! After giving it some more thought, I decided to opt for a single centralized class for basic attention attribution. The decision was mainly driven by the wish to avoid confusion about which class to use and to allow more flexibility in the choice of heads and layers for aggregation.

The Attention method found in the last commit improves upon the previous classes by enabling the choice of a single element (a single int), a range (as (start_idx, end_idx)) or a set of custom valid indices (as [idx_1, idx_2, ...]) for both attention heads and model layers. Moreover, the aggregation procedure has been centralized, and users can now define custom aggregation functions beyond the default ones.

Example of default usage:

import inseq

model = inseq.load_model("facebook/wmt19-en-de", "attention")
out = model.attribute("The developer argued with the designer because her idea cannot be implemented.")

The default behavior is set to minimize unnecessary parameter definitions. In the default case above, the result is the average across all attention heads of the final layer. Here's a more complex usage:

import inseq

model = inseq.load_model("facebook/wmt19-en-de", "attention")
out = model.attribute(
    "The developer argued with the designer because her idea cannot be implemented.",
    layers=(0, 5),
    heads=[0, 2, 5, 7],
    aggregate_heads_fn="max",
)

In the case above, the outcome is a matrix of maximum attention weights of heads 0, 2, 5 and 7 after averaging their weights across the first 5 layers of the model.
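
For the custom user-defined aggregation functions mentioned above, here is a hypothetical sketch; the exact callable signature is not documented in this thread, so the example assumes the function receives the selected heads' weights as a tensor and reduces the head dimension:

import inseq
import torch

def median_heads(attention: torch.Tensor, dim: int = 0) -> torch.Tensor:
    # Hypothetical aggregator: use the median over heads instead of
    # built-in string options such as "max".
    return attention.median(dim=dim).values

model = inseq.load_model("facebook/wmt19-en-de", "attention")
out = model.attribute(
    "The developer argued with the designer because her idea cannot be implemented.",
    heads=[0, 2, 5, 7],
    aggregate_heads_fn=median_heads,
)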

Remaining todos:

  • Document AttentionAttribution more comprehensively in the docs and docstrings.
  • Add tests for AttentionAttribution, minimally for one enc-dec and one dec-only model, testing multiple aggregation strategies if possible.

@gsarti mentioned this pull request on Jan 12, 2023
@gsarti (Member) commented Jan 14, 2023

Added some tests for attention attribution, fixed the typing issue of FullAttentionOutput (we were passing it as a tuple to _aggregate_layers but calling torch.stack on it before that, so the stack was moved inside the function), and added some further checks to the aggregation. We should be good for the merge now!
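
As a rough illustration of that refactor (the function name _aggregate_layers comes from this thread, but the signature and shapes below are illustrative, not the actual inseq internals):

from typing import Callable, Tuple
import torch

def _aggregate_layers(
    attentions: Tuple[torch.Tensor, ...],
    aggregate_fn: Callable[[torch.Tensor, int], torch.Tensor] = lambda t, dim: t.mean(dim),
) -> torch.Tensor:
    # The forward pass returns one attention tensor per layer, e.g. each of
    # shape (batch, heads, seq_len, seq_len). Stacking the raw tuple inside the
    # function lets callers pass the model output directly instead of pre-stacking it.
    stacked = torch.stack(attentions, dim=0)  # (layers, batch, heads, seq, seq)
    return aggregate_fn(stacked, 0)           # reduce the layer dimension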

@lsickert (Collaborator, Author) commented

@gsarti I think we were working on the same remaining points just now.

I got a bit confused by the torch.stack call outside of the aggregation function, which is why I edited the typing incorrectly, but I also agree that it is better to call it inside the function and just pass the raw model output in.

I am currently still finishing a test specifically for those aggregation functions (outside of the normal pipeline), but after that I would also agree that we are good to go.

@gsarti changed the title from "[WIP] Basic Attention attribution" to "Basic Attention attribution" on Jan 16, 2023
@gsarti (Member) commented Jan 16, 2023

@lsickert feel free to merge as soon as CI is passing! 🎉

@lsickert merged commit 7ed9d79 into main on Jan 16, 2023
@lsickert deleted the attention-attribution branch on January 16, 2023 at 15:04