Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Backwards incompatible change to DNSName #3951

Merged
merged 8 commits into from Oct 11, 2017

Conversation

reaperhulk
Copy link
Member

During this release cycle we decided to officially deprecate passing U-labels to our GeneralName constructors. At first we tried changing this in a purely backwards compatible way, but get_values_for_type made that untenable. This PR modifies DNSName to take three different types. U-label strings (which raises a deprecation warning), A-label strings (the new preferred type), and bytes (which are assumed to be decodable to unicode strings). The latter, while supported, is primarily intended for use by our parser and allows us to return the actual encoded data in a certificate even if it has not been properly encoded to A-label before the certificate is created. (Of course, if the certificate contains invalid utf8 sequences this will still fail, but let's handle one catastrophic failure at a time).

Note that this change is only backwards incompatible in the case where the user is passing a string that contains an internationalized domain name U-label. In that scenario the user will pass a string containing unicode characters and receive back a string encoded to A-label when calling value.

refs #3943

Once URI and RFC822Name are complete we'll also need to update the changelog.

raise TypeError("value must be bytes")

self._bytes_value = value
elif isinstance(value, six.binary_type):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we accepting bytes here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't accept bytes then if a cert has actual unicode characters in it we will (incorrectly) encode it to an A-label. With a bytes path we can preserve the exact data from the cert (provided it is a valid UTF-8 sequence).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, is it better to accept bytes, or to have the internal creation path do n = DNSName.__new__(); n._value = value, hacky, but doesn't expand our public API.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with bypassing the constructor. I wasn't planning on documenting the bytes path anyway (callers should pass strings).

Only nuance then is to make sure we have a test that allows a parsed unicode generalname to properly be encoded. We may already have that test but I'll look.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can probably find you a garbage cert from a public CA if you need it.

with pytest.warns(utils.DeprecatedIn21):
name = x509.DNSName(u"*.\xf5\xe4\xf6\xfc.example.com")
assert name.bytes_value == b"*.xn--4ca7aey.example.com"
name = x509.DNSName(u".xn--4ca7aey.example.com")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you drop the * fromt his test?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That'd be a mistake.

During this release cycle we decided to officially deprecate passing
U-labels to our GeneralName constructors. At first we tried changing
this in a purely backwards compatible way but get_values_for_type made
that untenable. This PR modifies DNSName to take three different types.
U-label strings (which raises a deprecation warning), A-label strings
(the new preferred type), and bytes (which are assumed to be decodable
to unicode strings). The latter, while supported, is primarily intended
for use by our parser and allows us to return the actual encoded data in
a certificate even if it has not been properly encoded to A-label before
the certificate is created. (Of course, if the certificate contains
invalid utf8 sequences this will still fail, but let's handle one
catastrophic failure at a time).
@reaperhulk reaperhulk force-pushed the x509-dnsname-backwards-incompatible branch from 0be8659 to 0eb3d67 Compare October 10, 2017 02:55
@alex alex merged commit ed32105 into pyca:master Oct 11, 2017
@reaperhulk reaperhulk deleted the x509-dnsname-backwards-incompatible branch October 11, 2017 00:12
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 12, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants