DPR AutoModel loading incorrect architecture for DPRContextEncoders #13670
Comments
Unfortunately, …

Ok, that kind of makes sense 🙃 Is there an easy way to change that or …

To the best of my knowledge, this would be a major change to the auto factory because the mapping file defines all …

@joshdevins - could you check whether the PR linked above solves the issue?
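The constraint described above — the auto factory's mapping file ties each config class to exactly one model class — can be sketched roughly like this. This is a simplified, hypothetical stand-in for the real mapping inside transformers, not its actual implementation; the class names only mirror the DPR case:

```python
# Hypothetical sketch of an AutoModel-style factory: each config class maps
# to exactly ONE model class, so a DPR-style config can only ever resolve to
# one of the two DPR encoder architectures, whatever the checkpoint declares.

class DPRConfig:  # stand-in for the real transformers.DPRConfig
    architectures = ["DPRContextEncoder"]  # what the checkpoint's config says

class DPRQuestionEncoder:  # stand-in model classes
    pass

class DPRContextEncoder:
    pass

# One-to-one mapping: config class -> single model class.
MODEL_MAPPING = {DPRConfig: DPRQuestionEncoder}

def auto_model_for(config):
    """Resolve a model class the AutoModel way: by config type only."""
    return MODEL_MAPPING[type(config)]

resolved = auto_model_for(DPRConfig())
print(resolved.__name__)  # DPRQuestionEncoder -- config.architectures is ignored
```

Under this one-to-one scheme, dispatching to `DPRContextEncoder` would require either a second mapping or inspecting `config.architectures`, which is why the change is not trivial.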
@patrickvonplaten Sorry, I realise now that there are two problems. Your PR fixes the problem that they didn't implement …

```python
import torch
import transformers

model_id = "facebook/dpr-ctx_encoder-single-nq-base"
tokenizer = transformers.AutoTokenizer.from_pretrained(model_id)
input_ids = tokenizer("This is an example sentence.", return_tensors="pt")["input_ids"]

auto_model = transformers.AutoModel.from_pretrained(model_id)
context_model = transformers.DPRContextEncoder.from_pretrained(model_id)

auto_output = auto_model(input_ids)
context_output = context_model(input_ids)
```

```
> type(auto_model)
transformers.models.dpr.modeling_dpr.DPRQuestionEncoder
> type(context_model)
transformers.models.dpr.modeling_dpr.DPRContextEncoder
> torch.all(torch.eq(auto_output["pooler_output"], context_output["pooler_output"]))
tensor(False)
```
Note that my workaround is basically this 🤷

```python
config = AutoConfig.from_pretrained(model_id)
getattr(transformers, config.architectures[0]).from_pretrained(model_id)
```
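The workaround above generalizes to a small helper that dispatches on `config.architectures` instead of the config class. Here is a self-contained sketch of that dispatch logic; the `registry` dict and stand-in classes are hypothetical substitutes for the `transformers` module namespace and real model classes, which the actual workaround uses via `getattr(transformers, ...)`:

```python
# Sketch of the architectures-based workaround: look up the class named in
# config.architectures[0] rather than relying on AutoModel's config mapping.

class DPRContextEncoder:  # stand-in model classes
    @classmethod
    def from_pretrained(cls, model_id):
        return cls()  # stand-in for real checkpoint loading

class DPRQuestionEncoder:
    @classmethod
    def from_pretrained(cls, model_id):
        return cls()

# Stand-in for the `transformers` module namespace used by getattr(...).
registry = {
    "DPRContextEncoder": DPRContextEncoder,
    "DPRQuestionEncoder": DPRQuestionEncoder,
}

class Config:  # stand-in for AutoConfig.from_pretrained(model_id)
    architectures = ["DPRContextEncoder"]

def load_by_architecture(config, model_id, registry):
    """Mimics getattr(transformers, config.architectures[0]).from_pretrained(...)."""
    cls = registry[config.architectures[0]]
    return cls.from_pretrained(model_id)

model = load_by_architecture(Config(), "facebook/dpr-ctx_encoder-single-nq-base", registry)
print(type(model).__name__)  # DPRContextEncoder
```

Because the lookup keys on the architecture string stored in the checkpoint's config, it picks the context encoder even though the config class alone cannot distinguish the two.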
@joshdevins - ah yeah, I think we can't really do anything about the second problem the way it is implemented now... maybe it might make sense to implement a …
I guess that makes sense. I wonder if this is the only model that has this scenario? It seems the way …
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
In preparation for an 8.0 release, this updates PyTorch NLP dependencies to more recent and latest minor versions. Amongst other things, this introduces a fix from transformers that is helpful for text embedding tasks with certain DPR models. See: huggingface/transformers#13670
Co-authored-by: Seth Michael Larson <seth.larson@elastic.co>
Environment info

transformers version: 4.10.2

Who can help

Model type dpr: @LysandreJik @patrickvonplaten @lhoestq

Information

Model I am using:

To reproduce

Loading a DPR context encoder (`DPRContextEncoder`) using `AutoModel.from_pretrained` is actually loading `DPRQuestionEncoder` instead, and later fails.

Steps to reproduce the behavior:

```python
AutoModel.from_pretrained('facebook/dpr-ctx_encoder-single-nq-base')
```

Note in the above that it's trying to use the `DPRQuestionEncoder` even though the config for this context encoder is correct and points to `architecture=DPRContextEncoder`. Using `DPRContextEncoder.from_pretrained` explicitly works just fine, so it looks like the problem is somewhere in `AutoModel`.

```python
DPRContextEncoder.from_pretrained('facebook/dpr-ctx_encoder-single-nq-base')
```

Expected behavior

Using `AutoModel.from_pretrained` should pick the correct architecture for a `DPRContextEncoder`.