New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Backwards incompatible change to DNSName #3951
Backwards incompatible change to DNSName #3951
Conversation
raise TypeError("value must be bytes") | ||
|
||
self._bytes_value = value | ||
elif isinstance(value, six.binary_type): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we accepting bytes
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we don't accept bytes then if a cert has actual unicode characters in it we will (incorrectly) encode it to an A-label. With a bytes path we can preserve the exact data from the cert (provided it is a valid UTF-8 sequence).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, is it better to accept bytes, or to have the internal creation path do n = DNSName.__new__(); n._value = value
, hacky, but doesn't expand our public API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with bypassing the constructor. I wasn't planning on documenting the bytes path anyway (callers should pass strings).
Only nuance then is to make sure we have a test that allows a parsed unicode generalname to properly be encoded. We may already have that test but I'll look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can probably find you a garbage cert from a public CA if you need it.
tests/x509/test_x509_ext.py
Outdated
with pytest.warns(utils.DeprecatedIn21): | ||
name = x509.DNSName(u"*.\xf5\xe4\xf6\xfc.example.com") | ||
assert name.bytes_value == b"*.xn--4ca7aey.example.com" | ||
name = x509.DNSName(u".xn--4ca7aey.example.com") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did you drop the *
fromt his test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That'd be a mistake.
During this release cycle we decided to officially deprecate passing U-labels to our GeneralName constructors. At first we tried changing this in a purely backwards compatible way but get_values_for_type made that untenable. This PR modifies DNSName to take three different types. U-label strings (which raises a deprecation warning), A-label strings (the new preferred type), and bytes (which are assumed to be decodable to unicode strings). The latter, while supported, is primarily intended for use by our parser and allows us to return the actual encoded data in a certificate even if it has not been properly encoded to A-label before the certificate is created. (Of course, if the certificate contains invalid utf8 sequences this will still fail, but let's handle one catastrophic failure at a time).
0be8659
to
0eb3d67
Compare
During this release cycle we decided to officially deprecate passing U-labels to our GeneralName constructors. At first we tried changing this in a purely backwards compatible way, but get_values_for_type made that untenable. This PR modifies DNSName to take three different types. U-label strings (which raises a deprecation warning), A-label strings (the new preferred type), and bytes (which are assumed to be decodable to unicode strings). The latter, while supported, is primarily intended for use by our parser and allows us to return the actual encoded data in a certificate even if it has not been properly encoded to A-label before the certificate is created. (Of course, if the certificate contains invalid utf8 sequences this will still fail, but let's handle one catastrophic failure at a time).
Note that this change is only backwards incompatible in the case where the user is passing a string that contains an internationalized domain name U-label. In that scenario the user will pass a string containing unicode characters and receive back a string encoded to A-label when calling
value
.refs #3943
Once URI and RFC822Name are complete we'll also need to update the changelog.