Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A metadata module with a data class for core metadata #518

Merged
merged 13 commits into from Jun 17, 2022

Conversation

brettcannon
Copy link
Member

Part of #383.

@brettcannon brettcannon mentioned this pull request Mar 8, 2022
@brettcannon brettcannon linked an issue Mar 9, 2022 that may be closed by this pull request
packaging/metadata.py Outdated Show resolved Hide resolved
packaging/metadata.py Outdated Show resolved Hide resolved
packaging/metadata.py Outdated Show resolved Hide resolved
packaging/metadata.py Outdated Show resolved Hide resolved
@pfmoore
Copy link
Member

pfmoore commented May 2, 2022

Some notes that I came up with when trying to write from_dict and to_dict methods:

  1. The subtle difference between the JSON-compatible form and the field/argument names here is a pain. Specifically, having to continually remember which is classifier (singular) and which is classifiers (plural) is tedious and error prone. Having said that, using the plural forms in the class feels correct, and the pain is only in the implementation, so I think it's the right call.
  2. The Metadata class doesn't have a metadata_version attribute, so it's not possible to round-trip between the dictionary form and the class form without losing information. This is something I've needed (read metadata, convert to dict form and then to a class, modify the class and write back out). So I'd recommend adding a metadata_version attribute. Callers can preserve the version manually, so I guess it's not a deal-breaker, but it feels awkward.
  3. The best way to convert from Author-Email to a list of _NameAndEmail tuples seems to be email.utils.getaddresses. This returns "" for a missing name, rather than None (which _NameAndEmail implies). Maybe don't bother allowing for None as the name? On the other hand, email.utils.formataddr (which would be used for the inverse transformation) will accept None, so there's no real harm allowing it.

@brettcannon
Copy link
Member Author

  1. The subtle difference between the JSON-compatible form and the field/argument names here is a pain. Specifically, having to continually remember which is classifier (singular) and which is classifiers (plural) is tedious and error prone. Having said that, using the plural forms in the class feels correct, and the pain is only in the implementation, so I think it's the right call.

That's my feeling as well.

2. The Metadata class doesn't have a metadata_version attribute, so it's not possible to round-trip between the dictionary form and the class form without losing information. This is something I've needed (read metadata, convert to dict form and then to a class, modify the class and write back out). So I'd recommend adding a metadata_version attribute. Callers can preserve the version manually, so I guess it's not a deal-breaker, but it feels awkward.

The idea is you will specify what metadata version to emit, and so you could round-trip if you knew what version you were parsing. But if you would find it useful to simply record it but not actually use it to write out then I'm okay with that in the name of data preservation.

But if that's the case, then the question is what to do for 2.0? Make what gets recorded a string or coerce the version to 2.1 and thus record the enum value?

3. The best way to convert from Author-Email to a list of _NameAndEmail tuples seems to be email.utils.getaddresses. This returns "" for a missing name, rather than None (which _NameAndEmail implies). Maybe don't bother allowing for None as the name? On the other hand, email.utils.formataddr (which would be used for the inverse transformation) will accept None, so there's no real harm allowing it.

I'm good either way: None or tell folks to specify the empty string for a lack of value since not will still do the right thing. Probably #383 (comment) covers the latest key details on my thoughts about methods on this class.

@brettcannon brettcannon marked this pull request as ready for review June 13, 2022 04:29
@brettcannon
Copy link
Member Author

I got a request to commit the data class for those that want to start using the class now and use methods for parsing and emitting later when they get implemented. As such, I am marking this PR as ready to review. There are no docs as of yet, but I will write those either here if this PR gets held up or as a later PR.

@brettcannon brettcannon changed the title A bare-bones metadata module A metadata module with a data class for core metadata Jun 15, 2022
@brettcannon
Copy link
Member Author

I have now added documentation.

@brettcannon
Copy link
Member Author

@pradyunsg since this API has been discussed for so long, I'm going to assume this is uncontroversial and I can merge it in the next couple of days. Obviously let me know if you disagree.

@pradyunsg
Copy link
Member

Feel free to. No need to hold up on me. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add a metadata API
3 participants