Add fastai upstream and downstream capacities for fastai>=2.4 and fastcore>=1.3.27 versions #678

Merged
merged 87 commits into main from add-fastai-mixin on Apr 26, 2022
Changes from 5 commits
Commits (87)
f237986
added a fastai mixin for upstream and downstream tasks with fastai l…
omarespejel Oct 17, 2021
16a10a2
Added check for right fastai version
omarespejel Oct 17, 2021
a0398d9
Added additional documentation
omarespejel Oct 17, 2021
0e65896
Add support for fastai>=2.4 versions and fastcore>=1.3.27
omarespejel Feb 10, 2022
aea8b8e
Merge branch 'main' into add-fastai-mixin
omarespejel Feb 10, 2022
940c68d
Update docstrings, save_fastai_learner, push_to_hub_fastai, and chang…
omarespejel Feb 11, 2022
4b246f4
Merge branch 'add-fastai-mixin' of https://github.com/omarespejel/hug…
omarespejel Feb 11, 2022
807e7bb
trim trailing whitespaces
omarespejel Feb 11, 2022
44493cf
Import fastai_utils.py in huggingface_hub/__init__.py
omarespejel Feb 13, 2022
a7253ed
Add check_fastai_fastcore_versions function in fastai_utils.py
omarespejel Feb 13, 2022
5904986
Eliminate imports of libraries not used in fastai_utils.py
omarespejel Feb 13, 2022
18b4498
Add isort and black format to fastai_utils.py
omarespejel Feb 13, 2022
37b6b51
Change pickle_protocol argument from kwargs to explicit
omarespejel Feb 15, 2022
1c9cff5
Change kwargs arguments in the function from_pretrained_fastai to exp…
omarespejel Feb 15, 2022
2a40a8d
Simplify push_to_hub_fastai function, particularly the repo_id argument
omarespejel Feb 15, 2022
1a4012c
Eliminate search for a pickle document in from_pretrained_fastai func…
omarespejel Feb 16, 2022
a9ba5d2
Simplify push_to_hub_fastai and correct bug in from_pretrained_fastai
omarespejel Feb 16, 2022
68f2006
Add pickle.DEFAULT_PROTOCOL for get adequate protocol when exporting …
omarespejel Feb 17, 2022
97742cb
Allow to load only models in the Hub when using from_pretrained_fastai
omarespejel Feb 17, 2022
7ba1fdd
Eliminate cache_dir from from_pretrained_fastai for simplification
omarespejel Feb 17, 2022
bf7a033
Correct nit picks in push_to_hub_fastai
omarespejel Feb 17, 2022
d3da12a
Update with nits
omarespejel Feb 17, 2022
3a66991
Apply isort
omarespejel Feb 17, 2022
679187c
Replace config.json for pyproject.toml to check for fastai and fastco…
omarespejel Feb 21, 2022
24d3ddb
Isort imports
omarespejel Feb 21, 2022
b0eeb35
Make pyproject.toml automatically filled with fastai, fastcore, and p…
omarespejel Feb 22, 2022
2fb9400
add check_fastai_fastcore_pyproject_versions function to know the fas…
omarespejel Feb 24, 2022
c443bb8
Change library tomlkit for toml
omarespejel Feb 24, 2022
42b01c3
Add extras[fastai] with the toml library
omarespejel Feb 25, 2022
fb886e9
Change the way the token is asked in def push_to_hub_fastai(
omarespejel Feb 25, 2022
effac9f
Eliminate logger from imports
omarespejel Feb 25, 2022
db1cbe9
Fix nits
omarespejel Feb 25, 2022
c9ccff0
Remove typing.Union from imports
omarespejel Feb 25, 2022
19fad26
add fastai integration tests
omarespejel Feb 28, 2022
f72ac27
Import fastai in setup.py
omarespejel Mar 1, 2022
fd63e5a
Import toml inside check_fastai_fastcore_pyproject_versions function
omarespejel Mar 1, 2022
1435d2c
Merge branch 'main' of https://github.com/huggingface/huggingface_hub…
omarespejel Mar 1, 2022
2ee4c56
Merge branch 'huggingface:main' into add-fastai-mixin
omarespejel Mar 1, 2022
bd1c675
Merge branch 'add-fastai-mixin' of https://github.com/omarespejel/hug…
omarespejel Mar 1, 2022
2c38e7b
Nits in fastai_utils.py
omarespejel Mar 1, 2022
4013c79
Add build_fastai to python-tests.yml
omarespejel Mar 1, 2022
5e840b9
Add fastcore import to setup.py
omarespejel Mar 1, 2022
33061cb
Add require_fastai_fastcore() to skip tests
omarespejel Mar 2, 2022
6680065
Nits and documentation of raised errors improved
omarespejel Mar 4, 2022
8ce5894
Merge main branch changes
Mar 22, 2022
988d6cd
add strategy for fastai in python-tests.yml
Mar 22, 2022
d3585a3
Eliminate organization from push_to_hub_fastai
Mar 23, 2022
5699b43
Add python 3.7-3.10 to python-tests.yml
omarespejel Mar 31, 2022
8041b60
Resolve conflicts
omarespejel Mar 31, 2022
8587464
Change python version in tests to 3.9
omarespejel Mar 31, 2022
421aab3
Fix tests
omarespejel Apr 7, 2022
b2e7fa7
Fix conflict in config.py due to order of tf packages
omarespejel Apr 7, 2022
264537b
Merge branch 'main' into add-fastai-mixin
omarespejel Apr 7, 2022
a8e32a3
Fix delete_repo function in test_fastai_integration
omarespejel Apr 7, 2022
b061df8
Merge branch 'add-fastai-mixin' of https://github.com/omarespejel/hug…
omarespejel Apr 7, 2022
f297f7c
Isort test_fastai_integration
omarespejel Apr 7, 2022
de7368b
Replace the repo_id name for model_id
omarespejel Apr 11, 2022
3cfe4d5
Make fastai and fastcore versions flexible in setup.py
omarespejel Apr 11, 2022
050904a
Confirm fastai supports python 3.10 in python-tests-yml
omarespejel Apr 12, 2022
f3b2000
Fix docs in fastai_utils.py
omarespejel Apr 12, 2022
2d12300
Change the name of DummyModel for dummy_model
omarespejel Apr 12, 2022
edeee68
Handle pickling errors when exporting a fastai.Learner
omarespejel Apr 12, 2022
028c80a
Change name of internal functions in fastai_utils.py
omarespejel Apr 12, 2022
1bc2c41
black style to fastai_utils.py
omarespejel Apr 12, 2022
26c8d15
Eliminate unnecessary comments from fastai_utils.py
omarespejel Apr 12, 2022
a7de8c4
Add capacity to load a local fastai.Learner to from_pretrained_keras
omarespejel Apr 12, 2022
022c571
black format fastai_utils.py
omarespejel Apr 12, 2022
5dafb25
Come back to Python 3.9 instead of 3.10
omarespejel Apr 13, 2022
9a06a67
Change name of save_fastai_learner to _save_pretrained_fastai in…
omarespejel Apr 13, 2022
004d318
Change the name to _save_pretrained_fastai in __init__.py
omarespejel Apr 13, 2022
947d58b
Fix nits in test_fastai_integration.py
omarespejel Apr 13, 2022
d6ee8bf
Add fastai integration to docs
omarespejel Apr 13, 2022
18d0d27
Fix nits
omarespejel Apr 21, 2022
7e63f46
Fix wording
omarespejel Apr 21, 2022
2512ead
Allow _save_pretrained_fastai to directly export the model in save_di…
omarespejel Apr 21, 2022
8633c0e
black fastai_utils.py
omarespejel Apr 21, 2022
363cb81
Make the requirement of having a pyproject.toml optional
omarespejel Apr 21, 2022
ac0ef8a
Add warnings if the pyproject.toml does not contain a "build-system" …
omarespejel Apr 25, 2022
2b28f4b
Change try-excepts for ifs in the warnings checking the fastai and fa…
omarespejel Apr 25, 2022
88f4cd0
Fix nits in documentation
omarespejel Apr 25, 2022
d0268a9
Move errors in _check_fastai_fastcore_pyproject_versions to condition…
omarespejel Apr 25, 2022
5ffd543
Misc improvements
osanseviero Apr 26, 2022
f36093e
Move the versions checks for fastai and fastcore to "else"'s
omarespejel Apr 26, 2022
cd4648a
Move the versions checks for fastai and fastcore to "else"'s
omarespejel Apr 26, 2022
1795ce8
Merge branch 'add-fastai-mixin' of https://github.com/omarespejel/hug…
omarespejel Apr 26, 2022
f9dd4b0
Merge branch 'main' of https://github.com/huggingface/huggingface_hub…
omarespejel Apr 26, 2022
465e0fa
Black reformat
omarespejel Apr 26, 2022
314 changes: 314 additions & 0 deletions src/huggingface_hub/fastai_mixin.py
@@ -0,0 +1,314 @@
###########################################################################################################
# Easily store and download `fastai>=2.4` models into the HF Hub.
#
# Goal:
# (1) Add upstream support: push a fastai learner to the HF Hub. See `save_fastai_learner` and `push_to_hub_fastai`.
# (2) Add downstream support: download a fastai learner from the hub. See `from_pretrained_fastai`.
#
# Limitations and next steps:
# - Possibly go from storing/downloading a `fastai.learner` to saving the weights directly into the Hub.
# - Examine whether it is worth implementing `fastai <2.4` versions.
###########################################################################################################
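# A minimal end-to-end sketch of the intended workflow (hedged: `learn` and
# "username/my-fastai-model" are placeholders, and both helpers are assumed to be
# re-exported from `huggingface_hub`):
#
#   >>> from huggingface_hub import push_to_hub_fastai, from_pretrained_fastai
#   >>> push_to_hub_fastai(learner=learn, repo_path_or_name="username/my-fastai-model")
#   >>> reloaded_learner = from_pretrained_fastai("username/my-fastai-model")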

import json
import logging
import os
import packaging.version
from pathlib import Path
from typing import Any, Dict, Optional, Union

# TODO - add to huggingface_hub.constants the constant FASTAI_LEARNER_NAME: the same name for all the .pkl models pushed to the hub.
from huggingface_hub import ModelHubMixin
from huggingface_hub.constants import CONFIG_NAME

from huggingface_hub.file_download import (
    get_fastai_version,
    get_fastcore_version,
)
from huggingface_hub.hf_api import HfFolder, HfApi
from huggingface_hub.repository import Repository
from huggingface_hub.snapshot_download import snapshot_download

# Verify if we are using the right fastai version.
if packaging.version.Version(get_fastai_version()) < packaging.version.Version("2.4"):
    raise ImportError(
        f"`push_to_hub_fastai` and `from_pretrained_fastai` require a fastai>=2.4 version, but you are using fastai version {get_fastai_version()} which is incompatible. Run, for example, `pip install fastai==2.5.1`."
    )

# Verify if we are using the right fastcore version.
if packaging.version.Version(get_fastcore_version()) < packaging.version.Version(
    "1.3.27"
):
    raise ImportError(
        f"`push_to_hub_fastai` and `from_pretrained_fastai` require a fastcore>=1.3.27 version, but you are using fastcore version {get_fastcore_version()} which is incompatible. Run, for example, `pip install fastcore==1.3.27`."
    )

logger = logging.getLogger(__name__)

# Verify availability of `load_learner`.
try:
    from fastai.learner import load_learner
except ImportError as error:
    logger.error(
        error.__class__.__name__
        + f": `push_to_hub_fastai` and `from_pretrained_fastai` require a fastai>=2.4 version, but you are using fastai version {get_fastai_version()} which is incompatible. Run, for example, `pip install fastai==2.5.1`."
    )

# Define template for an auto-generated README.md
README_TEMPLATE = """---
tags:
- fastai
---

# Amazing!

Congratulations on hosting your fastai model on the 🤗Hub!

# Some next steps
1. Fill out this model card with more information ([documentation here](https://huggingface.co/docs/hub/model-repos))!

2. Create a demo in Gradio or Streamlit using 🤗Spaces ([documentation here](https://huggingface.co/docs/hub/spaces)).

3. Join our fastai community on the Hugging Face Discord!

Greetings fellow fastlearner 🤝!

"""

# Define template for an auto-generated config with fastai and fastcore versions
CONFIG_TEMPLATE = dict(
    fastai_version=get_fastai_version(), fastcore_version=get_fastcore_version()
)
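# For illustration only: with fastai 2.5.1 and fastcore 1.3.27 installed this evaluates to
# {"fastai_version": "2.5.1", "fastcore_version": "1.3.27"}; the actual values depend on the local environment.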


def _create_model_card(repo_dir: Path):
"""Creates a model card for the repository.

repo_dir:
Specify directory in which you want to create a model card.
"""
readme_path = repo_dir / "README.md"
readme = ""
if readme_path.exists():
with readme_path.open("r", encoding="utf8") as f:
readme = f.read()
else:
readme = README_TEMPLATE
with readme_path.open("w", encoding="utf-8") as f:
f.write(readme)


def save_fastai_learner(
    learner, save_directory: str, config: Optional[Dict[str, Any]] = None
):
    """Saves a fastai learner to save_directory in pickle format. Use this if you're using Learners.

    learner:
        The `fastai.learner` you'd like to save.
    save_directory (:obj:`str`):
        Specify directory in which you want to save the fastai learner.
    config (:obj:`dict`, `optional`):
        Configuration object. Will be uploaded as a .json file. Example: 'https://huggingface.co/espejelomar/fastai-pet-breeds-classification/blob/main/config.json'.

    TODO - Save weights and model structure instead of the built learner.
    """

    # creating path
    os.makedirs(save_directory, exist_ok=True)

    # saving config
    # if user provides config then we update it with the fastai and fastcore versions in CONFIG_TEMPLATE.
    if config:
        if not isinstance(config, dict):
            raise RuntimeError(
                f"Provided config should be a dict. Got: '{type(config)}'"
            )
        path = os.path.join(save_directory, CONFIG_NAME)
        with open(path, "w") as f:
            json.dump({**config, **CONFIG_TEMPLATE}, f)
    else:
        path = os.path.join(save_directory, CONFIG_NAME)
        with open(path, "w") as f:
            json.dump(CONFIG_TEMPLATE, f)

    # creating README.md if none exists
    _create_model_card(Path(save_directory))

    # saving learner
    learner.export(os.path.join(save_directory, "model.pkl"))
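# Example usage (a sketch; `learn` is assumed to be an already-trained `fastai.Learner`,
# "my-fastai-model" is a placeholder directory, and the config dict is arbitrary):
#
#   >>> save_fastai_learner(learn, save_directory="my-fastai-model", config={"task": "image-classification"})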


def from_pretrained_fastai(*args, **kwargs):
    return FastaiModelHubMixin.from_pretrained(*args, **kwargs)


def push_to_hub_fastai(
    learner,
    repo_path_or_name: Optional[str] = None,
    repo_url: Optional[str] = None,
    commit_message: Optional[str] = "Add model",
    organization: Optional[str] = None,
    private: Optional[bool] = None,
    api_endpoint: Optional[str] = None,
    use_auth_token: Optional[Union[bool, str]] = True,
    git_user: Optional[str] = None,
    git_email: Optional[str] = None,
    config: Optional[dict] = None,
):
"""
Upload learner checkpoint files to the 🤗 Model Hub while synchronizing a local clone of the repo in
omarespejel marked this conversation as resolved.
Show resolved Hide resolved
:obj:`repo_path_or_name`.

Parameters:
model:
The `fastai.learner' you'd like to push to the hub.
omarespejel marked this conversation as resolved.
Show resolved Hide resolved
omarespejel marked this conversation as resolved.
Show resolved Hide resolved
repo_path_or_name (:obj:`str`, `optional`):
Can either be a repository name for your model or tokenizer in the Hub or a path to a local folder (in
omarespejel marked this conversation as resolved.
Show resolved Hide resolved
which case the repository will have the name of that local folder). If not specified, will default to
the name given by :obj:`repo_url` and a local directory with that name will be created.
repo_url (:obj:`str`, `optional`):
Specify this in case you want to push to an existing repository in the hub. If unspecified, a new
repository will be created in your namespace (unless you specify an :obj:`organization`) with
:obj:`repo_name`.
commit_message (:obj:`str`, `optional`):
Message to commit while pushing. Will default to :obj:`"add model"`.
organization (:obj:`str`, `optional`):
Organization in which you want to push your model or tokenizer (you must be a member of this
organization).
private (:obj:`bool`, `optional`):
Whether or not the repository created should be private (requires a paying subscription).
api_endpoint (:obj:`str`, `optional`):
The API endpoint to use when pushing the model to the hub.
use_auth_token (:obj:`bool` or :obj:`str`, `optional`):
The token to use as HTTP bearer authorization for remote files. If :obj:`True`, will use the token
generated when running :obj:`transformers-cli login` (stored in :obj:`~/.huggingface`). Will default to
:obj:`True`.
git_user (``str``, `optional`):
will override the ``git config user.name`` for committing and pushing files to the hub.
git_email (``str``, `optional`):
will override the ``git config user.email`` for committing and pushing files to the hub.
config (:obj:`dict`, `optional`):
Configuration object to be saved alongside the model weights.

Returns:
The url of the commit of your model in the given repository.
"""

    if repo_path_or_name is None and repo_url is None:
        raise ValueError("You need to specify a `repo_path_or_name` or a `repo_url`.")

    if isinstance(use_auth_token, bool) and use_auth_token:
        token = HfFolder.get_token()
    elif isinstance(use_auth_token, str):
        token = use_auth_token
    else:
        token = None

    if token is None:
        raise ValueError(
            "You must login to the Hugging Face hub on this computer by typing `huggingface-cli login` and "
            "entering your credentials to use `use_auth_token=True`. Alternatively, you can pass your own "
            "token as the `use_auth_token` argument."
        )

    if repo_path_or_name is None:
        repo_path_or_name = repo_url.split("/")[-1]

    # If no URL is passed and there's no path to a directory containing files, create a repo
    if repo_url is None and not os.path.exists(repo_path_or_name):
        repo_name = Path(repo_path_or_name).name
        repo_url = HfApi(endpoint=api_endpoint).create_repo(
            token,
            repo_name,
            organization=organization,
            private=private,
            repo_type=None,
            exist_ok=True,
        )

    repo = Repository(
        repo_path_or_name,
        clone_from=repo_url,
        use_auth_token=use_auth_token,
        git_user=git_user,
        git_email=git_email,
    )
    repo.git_pull(rebase=True)

    if config:
        save_fastai_learner(learner, repo_path_or_name, config=config)
    else:
        save_fastai_learner(learner, repo_path_or_name)

    # Commit and push!
    repo.git_add(auto_lfs_track=True)
    repo.git_commit(commit_message)
    return repo.git_push()
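# Example usage (a sketch; `learn` and "username/my-fastai-model" are placeholders, and a
# valid token from `huggingface-cli login` is assumed to be available):
#
#   >>> push_to_hub_fastai(learner=learn, repo_path_or_name="username/my-fastai-model")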


class FastaiModelHubMixin(ModelHubMixin):
    def __init__(self, *args, **kwargs):
        """
        Mixin class to implement model download and upload from fastai learners.

        # Downloading Learner from hf-hub:
        Example::

            >>> from huggingface_hub import from_pretrained_fastai
            >>> model = from_pretrained_fastai("username/mymodel@main")

        # TODO - Define if proceeding with a class (FastaiModelHubMixin) would be ideal for fastai and proceed with implementation
        # otherwise, proceed with just functions.

        """

    def _save_pretrained(self, save_directory):
        save_fastai_learner(self, save_directory)

    @classmethod
    def _from_pretrained(
        cls,
        model_id,
        revision,
        cache_dir,
        force_download,
        proxies,
        resume_download,
        local_files_only,
        use_auth_token,
        **model_kwargs,
    ):
        """Here we download and load the `fastai.learner` so both the mixin and the functional API (`from_pretrained_fastai`) stay in sync.

        TODO - Some args above aren't used since we are calling snapshot_download instead of hf_hub_download.
        """

        # TODO - Figure out what to do about these config values. Config is not going to be needed to load model
        cfg = model_kwargs.pop("config", None)

        # Root is either a local filepath matching model_id or a cached snapshot
        if not os.path.isdir(model_id):
            storage_folder = snapshot_download(
                repo_id=model_id, revision=revision, cache_dir=cache_dir
            )
        else:
            storage_folder = model_id

        # Using the pickle document in the downloaded list
        docs = os.listdir(storage_folder)
        pickle = None
        for doc in docs:
            if doc.endswith(".pkl"):
                pickle = doc
                break
        if pickle is None:
            raise ValueError(f"No `.pkl` file found in {storage_folder}.")

        logger.info(
            f"Using `fastai.learner` stored in {os.path.join(storage_folder, pickle)}."
        )
        print(f"Using `fastai.learner` stored in {os.path.join(storage_folder, pickle)}.")

        model = load_learner(os.path.join(storage_folder, pickle))

        # For now, we add a new attribute, config, to store the config loaded from the hub/a local dir.
        model.config = cfg

        return model
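# Example usage (a sketch; "espejelomar/fastai-pet-breeds-classification" is an illustrative
# repo id and the prediction call assumes the learner was exported with its data pipeline):
#
#   >>> from huggingface_hub import from_pretrained_fastai
#   >>> learn = from_pretrained_fastai("espejelomar/fastai-pet-breeds-classification")
#   >>> learn.predict("path/to/image.jpg")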
36 changes: 36 additions & 0 deletions src/huggingface_hub/file_download.py
@@ -70,6 +70,22 @@
except importlib_metadata.PackageNotFoundError:
    pass

_fastai_version = "N/A"
_fastai_available = False
try:
    _fastai_version: str = importlib_metadata.version("fastai")
    _fastai_available = True
except importlib_metadata.PackageNotFoundError:
    pass

_fastcore_version = "N/A"
_fastcore_available = False
try:
    _fastcore_version: str = importlib_metadata.version("fastcore")
    _fastcore_available = True
except importlib_metadata.PackageNotFoundError:
    pass


def is_torch_available():
    return _torch_available
@@ -79,6 +95,22 @@ def is_tf_available():
    return _tf_available


def is_fastai_available():
    return _fastai_available


def get_fastai_version():
    return _fastai_version


def is_fastcore_available():
    return _fastcore_available


def get_fastcore_version():
    return _fastcore_version
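
# For illustration (values depend on the local environment; fastai 2.5.1 and fastcore 1.3.27
# are assumed to be installed here):
#
#   >>> from huggingface_hub.file_download import is_fastai_available, get_fastai_version
#   >>> is_fastai_available()
#   True
#   >>> get_fastai_version()
#   '2.5.1'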


def hf_hub_url(
    repo_id: str,
    filename: str,
@@ -181,6 +213,10 @@ def http_user_agent(
ua += f"; torch/{_torch_version}"
if is_tf_available():
ua += f"; tensorflow/{_tf_version}"
if is_fastai_available():
ua += f"; fastai/{_fastai_version}"
if is_fastcore_available():
ua += f"; fastcore/{_fastcore_version}"
if isinstance(user_agent, dict):
ua += "; " + "; ".join(f"{k}/{v}" for k, v in user_agent.items())
elif isinstance(user_agent, str):