Add escaping to license field #2640

Merged
merged 1 commit into from May 9, 2021


cdce8p (Contributor) commented Apr 15, 2021

Summary of changes

Fix handling of multiline license strings by adding RFC 822 escaping.

The license is allowed to span multiple lines, see the Metadata spec. Currently, setuptools doesn't automatically escape such input, which results in a PKG-INFO file that importlib.metadata doesn't parse correctly.

# example setup.cfg
[metadata]
license = This is a pretty 
    long license
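
To see where the embedded newline comes from, here is a minimal sketch that reads the same snippet with the standard library's `configparser`. It assumes the file is read by a configparser-style parser; the variable names are illustrative only.

```python
from configparser import ConfigParser

# The example setup.cfg from above, built line by line to keep indentation explicit.
setup_cfg = "\n".join([
    "[metadata]",
    "license = This is a pretty",
    "    long license",
])

parser = ConfigParser()
parser.read_string(setup_cfg)

# The indented continuation line is joined to the first line with "\n",
# and the surrounding whitespace on each line is stripped.
license_value = parser["metadata"]["license"]
print(repr(license_value))  # 'This is a pretty\nlong license'
```

The value handed to the metadata writer therefore contains a bare newline with no indentation after it.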

Current results

# PKG-INFO
...
Author: ...
License: This is a pretty
long license
Description: This is the package description
        that also spans multiple lines
...
from importlib.metadata import metadata
metadata('custom-pkg')['License']      # -> 'This is a pretty'
metadata('custom-pkg')['Description']  # -> 'long license\nDescription: This ...\n        that ... lines'

With change

# PKG-INFO
...
Author: ...
License: This is a pretty
        long license
Description: This is the package description
        that also spans multiple lines
...
# with change
from importlib.metadata import metadata
metadata('custom-pkg')['License']      # -> 'This is a pretty\n        long license'
metadata('custom-pkg')['Description']  # -> 'This ...\n        that ... lines'

Note that importlib.metadata doesn't remove the escaping. Neither for Description nor License.
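
The difference between the two PKG-INFO variants can be reproduced with the standard library's `email` parser alone. This is a rough sketch, not the code from this PR: `escape_continuation_lines` and `build_pkg_info` are hypothetical helpers, the 8-space indent is taken from the "With change" output above, and a one-line `Summary` field stands in for the fields that follow `License`.

```python
from email import message_from_string


def escape_continuation_lines(value: str) -> str:
    """Indent every line after the first by 8 spaces (hypothetical helper)."""
    return ("\n" + 8 * " ").join(value.split("\n"))


def build_pkg_info(license_text: str) -> str:
    """Assemble a tiny PKG-INFO-like document (illustrative, not a full file)."""
    return "\n".join([
        "Metadata-Version: 2.1",
        "Name: custom-pkg",
        f"License: {license_text}",
        "Summary: demo",
        "",
    ])


raw_license = "This is a pretty\nlong license"

unescaped = message_from_string(build_pkg_info(raw_license))
escaped = message_from_string(build_pkg_info(escape_continuation_lines(raw_license)))

# Without escaping, the header block ends at the unindented second line,
# so everything after it is treated as the message body.
print(repr(unescaped["License"]))  # 'This is a pretty'
print(repr(unescaped["Summary"]))  # None

# With escaping, the second line is a continuation of the License header.
print(repr(escaped["License"]))    # 'This is a pretty\n        long license'
print(repr(escaped["Summary"]))    # 'demo'
```

In the unescaped case the fields after `License` are not lost, but they end up in the body (`unescaped.get_payload()`) rather than being parsed as headers.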


jaraco (Member) commented May 9, 2021

> Note that importlib.metadata doesn't remove the escaping. Neither for Description nor License.

In importlib.metadata 4, it does remove the line continuations.

@EvgenKo423

> The license is allowed to span over multiple lines, see Metadata spec.

  1. The core metadata specification refers to RFC 822 for its formatting rules. The reference linked at the bottom of that spec describes a convenience representation called "folding", under which a CRLF followed by any number of spaces is equivalent to a single space. That is, the example License field does NOT have a multi-line value (per the RFC, there can be no such thing)!
    To work around this limitation, the Description field has to use a | character to preserve line breaks.

  2. distutils documentation says that the license key is a

    ‘short string’
          A single line of text, not more than 200 characters.

    which is consistent with the metadata spec, considering the above.

Conclusions (as far as I understood):

  1. The license keyword should not be multi-line;
  2. The License field of an RFC 822-encoded message can be folded across multiple lines, like any other field, but should still unfold to a single line.
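
The unfolding rule from point 1 can be sketched in a few lines. This follows the reading given in this comment (a line break plus the whitespace after it collapses to one space); `unfold` is a hypothetical helper, not an API of setuptools or importlib.metadata.

```python
import re


def unfold(value: str) -> str:
    """Collapse each line break and the whitespace that follows it into one space."""
    return re.sub(r"\r?\n[ \t]+", " ", value)


folded = "This is a pretty\n        long license"
print(unfold(folded))  # This is a pretty long license
```

Under this reading, the folded License value from the PR description and a one-line `This is a pretty long license` are the same field value.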

jaraco (Member) commented May 19, 2022

Yes, I agree. license should not be multiline.
