Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace toml by stdlib counterparts for reading, evaluate options for writing #3690

Closed
astrojuanlu opened this issue Mar 8, 2024 · 4 comments

Comments

@astrojuanlu
Copy link
Member

astrojuanlu commented Mar 8, 2024

We use toml in exactly 3 modules, mostly for reading. In 1 place we are using it for writing

with open(file_path, "w") as file:
toml.dump(data, file)

TOML capabilities in Python are evolving and it would be good to adapt. On the other hand, toml, the library we use, was last updated in 2020 and is basically abandoned.

First attempt: #2569 (comment)

Comes from: kedro-org/kedro-plugins#333 (comment)

Loading

Since Python 3.11, the standard library includes a native TOML library for reading files, https://docs.python.org/3/library/tomllib.html, which according to PEP 680 is based on https://pypi.org/project/tomli/

Therefore we could replace toml in our dependencies by something like

tomli ; python_version < '3.11'

and remove that when we drop support for Python 3.10 (October 2026). This would save 1 dependency for Python >= 3.11.

Writing

After a long discussion it was decided to not include TOML writing capabilities in the standard library. There are 3 options for it:

It's likely that we want style-preserving capabilities, since our kedro new hooks modify stuff from the user pyproject.toml. I have no idea what are the guarantees that current toml provides.

@astrojuanlu
Copy link
Member Author

astrojuanlu commented Mar 8, 2024

Notice that currently users of all versions of Python are forced to install a TOML-writing dependency just for kedro new, and this proposal doesn't alleviate that. See #3659 for a further exploration of this issue.

@DimedS
Copy link
Contributor

DimedS commented Mar 8, 2024

Thank you for the detailed analysis, @astrojuanlu .

Therefore we could replace toml in our dependencies by something like

tomli ; python_version < '3.11'

and remove that when we drop support for Python 3.10 (October 2026). This would save 1 dependency for Python >= 3.11.

I agree with that. However, does this mean we need to adjust all instances of our reading code like that?:

if sys.version_info >= (3, 11):
    with open(toml_file_path, 'rb') as f:
        toml_data = tomllib.load(f)
else:
    with open(toml_file_path, 'rb') as f:
        toml_data = tomli.load(f)

It's likely that we want style-preserving capabilities, since our kedro new hooks modify stuff from the user pyproject.toml. I have no idea what are the guarantees that current toml provides.

It looks like we only update pyproject.toml when executing the kedro new command, and this file is sourced from our protected starters repository. Therefore, the need for style preservation in pyproject.toml is not a significant concern at present. However, I also note that tomlkit is a more popular library, potentially implying it's more feature-rich but heavier. Given our limited requirements, tomli-w might suffice for our purposes.

@astrojuanlu
Copy link
Member Author

astrojuanlu commented Mar 8, 2024

However, does this mean we need to adjust all instances of our reading code like that?:

We could do this instead:

try:
    import tomllib  # stdlib
except ImportError:  # old Python versions
    import tomli as tomllib

and avoid changing the code itself

@merelcht
Copy link
Member

Discussed this issue in grooming (11/3/2024) and we decided to close it for the time being because toml isn't causing any issues right now and the proposed changed don't hugely improve dependencies for users and introduce some "complexity" with different python versions and reading vs writing.

@merelcht merelcht closed this as not planned Won't fix, can't repro, duplicate, stale Mar 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

No branches or pull requests

3 participants