fix loading from pretrained for sharded model with `torch_dtype="auto"` #18061

Merged
merged 1 commit on Jul 27, 2022

Conversation

NouamaneTazi
Member

Fixes the following script, which failed because `resolved_archive_file` is a list for sharded models while `load_state_dict` expects a path to a single file:

model = AutoModelForCausalLM.from_pretrained("bigscience/bloom", torch_dtype="auto")
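
As a quick sanity check once loading works, the inferred dtype can be inspected through the model's dtype property (the printed value depends on how the checkpoint was saved; bfloat16 is only an example):

from transformers import AutoModelForCausalLM

# torch_dtype="auto" asks from_pretrained to infer the dtype from the checkpoint
# itself instead of defaulting to float32.
model = AutoModelForCausalLM.from_pretrained("bigscience/bloom", torch_dtype="auto")
print(model.dtype)  # e.g. torch.bfloat16 if that is the dtype the shards were saved in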

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Jul 7, 2022

The documentation is not available anymore as the PR was closed or merged.

@LysandreJik
Member

LysandreJik commented Jul 11, 2022

Hey @NouamaneTazi, do you have a code example that failed before and that doesn't fail anymore with your PR?

@NouamaneTazi
Member Author

NouamaneTazi commented Jul 11, 2022

Yes @LysandreJik, the script I provided did fail for me when I tried it:

model = AutoModelForCausalLM.from_pretrained("bigscience/bloom", torch_dtype="auto") # this should fail for any sharded models

The issue was that `load_state_dict` expects a `str` or an `os.PathLike`, while `resolved_archive_file` is a list for sharded models.
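
For context, a minimal sketch of the `torch_dtype="auto"` handling after the patch (simplified; the guard condition is an assumption, only the else branch mirrors the code shown in the traceback further down this thread):

# Sketch of the relevant branch in modeling_utils.py, not the exact source.
if not is_sharded:  # assumed guard for a single-file checkpoint
    torch_dtype = get_state_dict_dtype(state_dict)
else:
    # resolved_archive_file is a list of shard paths for sharded checkpoints,
    # so load only the first shard to infer the dtype.
    one_state_dict = load_state_dict(resolved_archive_file[0])
    torch_dtype = get_state_dict_dtype(one_state_dict)
    del one_state_dict  # free CPU memory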

@LysandreJik
Member

LysandreJik commented Jul 12, 2022

Understood! It's a bit hard to play with such a large model, so I'm reproducing with lysandre/test-bert-sharded. However, it seems that it doesn't entirely fix the issue:

>>> from transformers import AutoModelForCausalLM
>>> model = AutoModelForCausalLM.from_pretrained("lysandre/test-bert-sharded", torch_dtype="auto")

File ~/Workspaces/Python/transformers/src/transformers/models/auto/auto_factory.py:446, in _BaseAutoModelClass.from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
    444 elif type(config) in cls._model_mapping.keys():
    445     model_class = _get_model_class(config, cls._model_mapping)
--> 446     return model_class.from_pretrained(pretrained_model_name_or_path, *model_args, config=config, **kwargs)
    447 raise ValueError(
    448     f"Unrecognized configuration class {config.__class__} for this kind of AutoModel: {cls.__name__}.\n"
    449     f"Model type should be one of {', '.join(c.__name__ for c in cls._model_mapping.keys())}."
    450 )

File ~/Workspaces/Python/transformers/src/transformers/modeling_utils.py:2040, in PreTrainedModel.from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs)
   2038     torch_dtype = get_state_dict_dtype(state_dict)
   2039 else:
-> 2040     one_state_dict = load_state_dict(resolved_archive_file)
   2041     torch_dtype = get_state_dict_dtype(one_state_dict)
   2042     del one_state_dict  # free CPU memory

File ~/Workspaces/Python/transformers/src/transformers/modeling_utils.py:359, in load_state_dict(checkpoint_file)
    357 except Exception as e:
    358     try:
--> 359         with open(checkpoint_file) as f:
    360             if f.read().startswith("version"):
    361                 raise OSError(
    362                     "You seem to have cloned a repository without having git-lfs installed. Please install "
    363                     "git-lfs and run `git lfs install` followed by `git lfs pull` in the folder "
    364                     "you cloned."
    365                 )

TypeError: expected str, bytes or os.PathLike object, not list

@NouamaneTazi
Member Author

This is exactly the error I got before the fix, and from your traceback it seems that the patch wasn't applied.
You have:

-> 2040     one_state_dict = load_state_dict(resolved_archive_file)

when it should be:

-> 2040     one_state_dict = load_state_dict(resolved_archive_file[0])

On my side, testing with the patch did succeed in loading your model.

@LysandreJik
Member

Ah, great catch; I had too many patch-1 branches locally. Your patch seems to work, pinging @sgugger for additional verification.

Collaborator

@sgugger left a comment

Yes, this works. Note that this is not recommended in terms of speed, as you load a shard and then discard it immediately; it's more efficient to just set the torch dtype to the value you want.
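
A short illustration of that recommendation (a sketch; bfloat16 is just an example, pass whichever dtype the checkpoint was actually saved in):

import torch
from transformers import AutoModelForCausalLM

# Passing the dtype explicitly avoids loading (and discarding) a shard just to
# infer it, which is what torch_dtype="auto" has to do for sharded checkpoints.
model = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom",
    torch_dtype=torch.bfloat16,  # example dtype, not necessarily the checkpoint's
)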

@NouamaneTazi
Member Author

Should I raise a warning when this method is used, @sgugger?

@sgugger
Collaborator

sgugger commented Jul 12, 2022

I don't think Stas will like the extra warning, so I'd say no ;-)
