
Fixing question-answering with long contexts #13873

Merged
Narsil merged 5 commits into huggingface:master from fix_bc_question on Oct 5, 2021

Conversation

Contributor

@Narsil Narsil commented Oct 5, 2021

Fixes #13811


Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@Narsil Narsil changed the title from "Fixing question-answering with long contexts # What does this PR do?" to "Fixing question-answering with long contexts" on Oct 5, 2021
Collaborator

@sgugger sgugger left a comment


Thanks for fixing, nice new tests!

@@ -330,26 +338,41 @@ def preprocess(self, example, padding="do_not_pad", doc_stride=128, max_question
qas_id=None,
)
)
return {"features": features, "example": example}

shallow = []
Collaborator

Why shallow for the name?

Contributor Author

There could be a better name. I just tried to disassemble the feature before passing it on to _forward, because that way it automatically benefits from the no_grad and to(device) handling in forward (otherwise _forward would have to take care of it itself).

Is reusing features good enough? I didn't find a better name.
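
To make the point above concrete, here is a minimal sketch in the spirit of, but not identical to, the transformers Pipeline code, showing why it is convenient to let forward unpack the feature dict: the no_grad and device handling then live in one place. The class name TinyQAPipeline and the dummy_model below are hypothetical and only for illustration.

import torch

class TinyQAPipeline:
    """Minimal sketch (not the actual transformers implementation) of the
    forward/_forward split: forward handles no_grad and device placement,
    so _forward only has to deal with the model call itself."""

    def __init__(self, model, device="cpu"):
        self.model = model
        self.device = torch.device(device)

    def forward(self, model_inputs):
        # Anything passed through here automatically benefits from
        # torch.no_grad() and .to(device); _forward stays device-agnostic.
        with torch.no_grad():
            model_inputs = {
                k: v.to(self.device) if isinstance(v, torch.Tensor) else v
                for k, v in model_inputs.items()
            }
            return self._forward(model_inputs)

    def _forward(self, model_inputs):
        # Pure model logic: non-tensor entries (e.g. the original example)
        # are simply not fed to the model.
        tensors = {k: v for k, v in model_inputs.items() if isinstance(v, torch.Tensor)}
        return self.model(**tensors)


# Hypothetical usage with a dummy "model" standing in for a real QA model:
dummy_model = lambda input_ids: {"start_logits": input_ids.float(), "end_logits": input_ids.float()}
pipe = TinyQAPipeline(dummy_model)
outputs = pipe.forward({"input_ids": torch.tensor([[0, 5, 7, 2]]), "example": "kept out of the model call"})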

Collaborator

I like features personally (and I use example/sample when I want to make the distinction: one sample gives several features).
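
For readers unfamiliar with the example/feature distinction mentioned above, here is a short, hedged sketch: a single (question, long context) sample is split by the tokenizer into several overlapping features. The checkpoint name is only an example; any fast QA tokenizer should behave the same way.

from transformers import AutoTokenizer

# Illustrative only: the checkpoint name is just an example of a QA tokenizer.
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-cased-distilled-squad")

question = "What does doc_stride control?"
context = "A very long context sentence. " * 400  # long enough to overflow max_length

# One sample (question + long context) becomes several overlapping features.
encoded = tokenizer(
    question,
    context,
    truncation="only_second",       # only truncate the context, never the question
    max_length=384,
    stride=128,                     # overlap between consecutive features (doc_stride)
    return_overflowing_tokens=True,
    return_offsets_mapping=True,
)

print(len(encoded["input_ids"]))    # > 1: several features for a single sample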

@Narsil Narsil merged commit 0ddadbf into huggingface:master Oct 5, 2021
@Narsil Narsil deleted the fix_bc_question branch October 5, 2021 14:09
stas00 pushed a commit to stas00/transformers that referenced this pull request Oct 12, 2021
* Tmp.

* Fixing BC for question answering with long context.

* Capping model_max_length to avoid tf overflow.

* Bad workaround bugged roberta.

* Fixing name.
lapisfluvialis pushed a commit to lapisfluvialis/transformers that referenced this pull request Oct 27, 2021
* Tmp.

* Fixing BC for question answering with long context.

* Capping model_max_length to avoid tf overflow.

* Bad workaround bugged roberta.

* Fixing name.
Albertobegue pushed a commit to Albertobegue/transformers that referenced this pull request Jan 27, 2022
* Tmp.

* Fixing BC for question answering with long context.

* Capping model_max_length to avoid tf overflow.

* Bad workaround bugged roberta.

* Fixing name.
@vikramtharakan

I'm having a similar issue: I used a QA model from transformers, tweaked the model in Label Studio by making some annotations, and then tried to load the model back again. The model is pulling out the correct answers, but handle_impossible_answer doesn't seem to be working, because it gives an answer for every question even when the question is irrelevant. What's even weirder is that it doesn't do this in Label Studio's interface, so handle_impossible_answer seems to work on that side. The contexts I pass through the pipeline are also quite long.

I made sure I have the most up-to-date transformers, and the models were saved with save_pretrained.

from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline

model_path = "PATH"
model = AutoModelForQuestionAnswering.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)
QA = pipeline('question-answering', model=model, tokenizer=tokenizer)

# Other code...
answers = QA(question=questions, context=context, top_k=1, max_answer_len=32, handle_impossible_answer=True)

The answers I get out of here include an answer for every question, even when the question shouldn't be answered. I've tried loading the models multiple ways; the only way I got handle_impossible_answer to work as intended was when the tokenizer wasn't the AutoTokenizer I used here, but then the answers it gave me were complete garbage. Has anybody else run into this issue with AutoTokenizer?

@Narsil
Contributor Author

Narsil commented Feb 2, 2022

The only way I can think the tokenizer could be involved is if it doesn't include EOS/BOS tokens.

For the null answer to come out, the score has to be highest when the start and end logits both point at the first token (which should be a BOS/CLS token). Did you fine-tune your model on such null answers? Maybe the fine-tuning simply made that output impossible for the model to produce.
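
To make the scoring argument concrete, here is a rough numpy sketch of the idea behind handle_impossible_answer (not the exact pipeline code): the null answer corresponds to the (start=CLS, end=CLS) pair, and it only wins if its probability mass beats the best real span. The toy logits below are made up for illustration.

import numpy as np

def softmax(x):
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

# Toy logits for one feature; index 0 is the CLS/BOS position.
start_logits = np.array([2.0, 0.1, 3.5, 0.2, 0.1])
end_logits   = np.array([1.5, 0.2, 0.1, 3.0, 0.3])

start_probs, end_probs = softmax(start_logits), softmax(end_logits)

# Null score: probability mass on (start=CLS, end=CLS).
null_score = start_probs[0] * end_probs[0]

# Best real span: start <= end, excluding the CLS position.
best_span_score = max(
    start_probs[i] * end_probs[j]
    for i in range(1, len(start_probs))
    for j in range(i, len(end_probs))
)

# If the model was never fine-tuned to put mass on the CLS pair,
# null_score will rarely beat best_span_score, so an answer always comes out.
print("null:", null_score, "best span:", best_span_score)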


Successfully merging this pull request may close these issues.

AttributeError when running question answering models for result
3 participants