Add custom attribution baseline #123

Open · gsarti opened this issue Mar 7, 2022 · 0 comments
Labels: enhancement (New feature or request), good first issue (Good for newcomers)

gsarti (Member) commented Mar 7, 2022

🚀 Feature Request

Add an optional `baselines` field to the `attribute` method of `AttributionModel`. If not specified, `baselines` defaults to `None`, preserving the current behavior of using UNK tokens as a "no-information" baseline for attribution methods that require one (e.g. Integrated Gradients, DeepLIFT). The argument can take one of the following values (a hedged usage sketch follows the list):

  • `str`: the baseline is an alternative text. In this case, the text needs to be encoded and embedded inside `FeatureAttribution.prepare` to fill the `baseline_ids` and `baseline_embeds` fields of the `Batch` class. For now, only strings matching the original input length after tokenization are supported.

  • `Sequence[int]`: the baseline is a list of input ids, which are embedded as described above. Again, the length must match that of the original input ids.

  • `torch.Tensor`: baseline embeddings are passed explicitly, e.g. to allow baselines that do not match the original input shape and could be derived by averaging the embeddings of different spans. In this case, the `baseline_embeds` field of `Batch` is populated directly (after checking that the shape is consistent with the input embeddings), and the `baseline_ids` field is filled with a special id (e.g. -1) to mark that ids were not provided. Important: this modality should raise a `ValueError` when used in combination with a layer method, since layer methods that require a baseline pass baseline ids, not baseline embeddings, as inputs to the `forward_func` used for attribution.

  • tuple of the previous types: to specify both source and target baselines when using `attribute_target=True`, the input is a tuple of values of the previous types. The same procedure is applied separately to define the source and target baselines, except for the encoding step, which requires the `tokenizer.as_target_tokenizer()` context manager to encode target-side strings.

  • list (or tuple of lists) of the previous types: when multiple baselines are specified, we return the expected attribution score (i.e. the average across baselines) by computing attributions for every available baseline and averaging the final results. See Section 2.2 of Erion et al. 2020 for details, and the averaging sketch below.
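
The sketch below illustrates how the proposed values could look at the call site. Note that the `baselines` argument does not exist yet: the model name, token ids, and tensor shapes are purely illustrative, and everything besides `load_model`/`attribute` follows the proposal in this issue rather than the current API.

```python
import torch

import inseq

# Hypothetical usage of the proposed `baselines` argument.
model = inseq.load_model("Helsinki-NLP/opus-mt-en-it", "integrated_gradients")
text = "The cat sat on the mat."

# Default: None preserves the current UNK-token "no-information" baseline.
out = model.attribute(text)

# str: an alternative text, tokenized to the same length as the input.
out = model.attribute(text, baselines="The dog lay on the rug.")

# Sequence[int]: explicit input ids (illustrative values), embedded internally.
out = model.attribute(text, baselines=[57, 812, 3, 44, 9, 101, 0])

# torch.Tensor: precomputed baseline embeddings (batch x seq_len x hidden_size,
# shapes illustrative). Should raise ValueError for layer methods.
out = model.attribute(text, baselines=torch.zeros(1, 8, 512))

# tuple: separate source and target baselines with attribute_target=True.
out = model.attribute(
    text,
    attribute_target=True,
    baselines=("The dog lay on the rug.", "Il cane giaceva sul tappeto."),
)

# list of baselines: attributions are computed once per baseline and averaged.
out = model.attribute(
    text,
    baselines=["The dog lay on the rug.", "A man ran in the park."],
)
```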
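
For the last bullet, a minimal sketch of the intended averaging over multiple baselines, assuming a per-baseline attribution function that returns score tensors of identical shape (both names here are hypothetical):

```python
from typing import Callable, Sequence

import torch

def expected_attribution(
    attribution_fn: Callable[[object], torch.Tensor],
    baselines: Sequence[object],
) -> torch.Tensor:
    """Approximate the expected attribution (Erion et al. 2020, Section 2.2)
    by averaging the scores obtained with each available baseline."""
    scores = torch.stack([attribution_fn(baseline) for baseline in baselines])
    return scores.mean(dim=0)
```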

🔈 Motivation

When working on minimal pairs, we might be interested in quantifying the contribution of specific words in the source or in the target prefix not only in absolute terms, using a "no-information" baseline, but also as the relative effect between the words composing the pair. Supporting a custom baseline would enable this type of comparison.

🛠 Notes

  • It will be important to validate whether the hooked method actually makes use of a baseline via its `use_baseline` attribute, and to raise a warning that the custom baseline value will be ignored when it does not (see the sketch after this list).

  • Since `baselines` will support all input types (`str`, ids, embeddings), this would be the right time to enable the same support for the input of the `attribute` function. This could be achieved with an extra `attribution_input` field, set to `None` by default, that substitutes `input_texts` in the call to `prepare_and_attribute` and is set to `input_texts` when not specified.
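
A rough sketch of the validation and input-resolution logic described in these notes: the `use_baseline` and layer-method concepts follow the wording of this issue, while the helper itself, its signature, and the `is_layer_method` attribute are illustrative assumptions.

```python
import warnings

import torch

def resolve_attribution_inputs(method, baselines, input_texts, attribution_input=None):
    """Hypothetical pre-attribution checks for the proposed `baselines` argument."""
    # Warn when the hooked method does not consume a baseline at all.
    if baselines is not None and not getattr(method, "use_baseline", False):
        warnings.warn(
            f"{type(method).__name__} does not use a baseline: "
            "the custom `baselines` value will be ignored."
        )
    # Layer methods feed baseline ids, not embeddings, to their forward_func,
    # so explicit embedding baselines cannot be combined with them.
    if isinstance(baselines, torch.Tensor) and getattr(method, "is_layer_method", False):
        raise ValueError("Embedding baselines are not supported for layer methods.")
    # attribution_input substitutes input_texts and falls back to it by default.
    return attribution_input if attribution_input is not None else input_texts
```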

gsarti added the enhancement and good first issue labels on Mar 7, 2022
gsarti added this to the v1.0 milestone on Mar 7, 2022
gsarti removed the good first issue label on Apr 8, 2022
gsarti added the good first issue label on Dec 13, 2022
gsarti modified the milestones: Demo Paper Release, v0.5 on May 8, 2023