Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use idn_to_ascii function instead of idna_convert library #785

Merged
merged 1 commit into from May 30, 2023

Conversation

jtojnar
Copy link
Contributor

@jtojnar jtojnar commented Feb 16, 2023

The idna_convert library is outdated and there is now a native PHP function to convert IDNs to Punycode.
It requires intl extension or a polyfill.

compress_parse_url will append ? and # to the URI even if there is no query string or hash fragment, and insert // even when there is no authority part. To avoid this and the breakage of tests this would cause, let’s call the function only for files having a non-empty authority part of the URI (i.e. non-local files) that contains characters that are not considered printable ASCII (such as bytes of UTF-8 encoded Unicode characters).

In the future, we might leave the IDN support to HTTP library to deal with.

Fixes: #538
Supersedes: #776

$parsed = \SimplePie\Misc::parse_url($link);
$this->link = \SimplePie\Misc::compress_parse_url($parsed['scheme'], $idn->encode($parsed['authority']), $parsed['path'], $parsed['query'], $parsed['fragment']);
if ($parsed['authority'] !== '' && !ctype_print($parsed['authority'])) {
$authority = \idn_to_ascii($parsed['authority'], \IDNA_NONTRANSITIONAL_TO_ASCII, \INTL_IDNA_VARIANT_UTS46);
Copy link
Contributor Author

@jtojnar jtojnar Feb 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like the original code was already broken on authority strings containing non-ASCII characters in userinfo part of the URI, e.g. čížek@háčkyčárky.cz:

  • This is what idna_convert gives: xn--ek-mja1kox@xn--hkyrky-ptac70bc.cz
  • This is what Firefox converts: %C4%8D%C3%AD%C5%BEek@xn--hkyrky-ptac70bc.cz

It probably does not make sense to attempt to fix it when it will likely be corrected with switch to PSR-18.

Edit: Looks like they are deprecated anyway https://www.rfc-editor.org/rfc/rfc7540#section-8.1.2.3

CHANGELOG.md Outdated Show resolved Hide resolved
README.markdown Outdated Show resolved Hide resolved
@@ -1,969 +0,0 @@
<?php
// {{{ license
Copy link
Contributor

@Art4 Art4 Feb 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removing this file might be a breaking change. What do you think about keeping this file and simply triggering a deprecation error?

<?php

declare(strict_types=1);

trigger_error('Using class `idna_convert` from `idn/idna_convert.class.php` is deprecated since SimplePie 1.9.0, the native `idn_to_ascii()` function is used instead.', \E_USER_DEPRECATED);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would argue that this is not a part of API – it is not part of SimplePie, it is not listed in the API docs nor was it documented anywhere else.

I have listed it in the Removed section of changelog so that if someone uses it for themselves, they can just copy it from older version.

@Art4
Copy link
Contributor

Art4 commented Feb 17, 2023

I like this approach. idn_to_ascii() should available in the most PHP environments and will work out of the box without requiring any files. If it is not present, one can define it using the Algo26\IdnaConvert library or using the polyfill symfony/polyfill-intl-idn.

This PR fixes #538.

@Art4 Art4 mentioned this pull request Feb 17, 2023
48 tasks
@Art4 Art4 mentioned this pull request Mar 15, 2023
@jtojnar jtojnar mentioned this pull request Mar 18, 2023
The `idna_convert` library is outdated and there is now a native PHP function to convert IDNs to Punycode.
It requires `intl` extension or a [polyfill](https://github.com/symfony/polyfill-intl-idn).

`compress_parse_url` will append `?` and `#` to the URI even if there is no query string or hash fragment, and insert `//` even when there is no authority part. To avoid this and the breakage of tests this would cause, let’s call the function only for files having a non-empty authority part of the URI (i.e. non-local files) that contains characters that are not considered printable ASCII (such as bytes of UTF-8 encoded Unicode characters).

In the future, we might leave the IDN support to HTTP library to deal with.
@Art4
Copy link
Contributor

Art4 commented May 26, 2023

@mblaney This is ready for merge.

@mblaney mblaney merged commit 06b2687 into simplepie:master May 30, 2023
8 checks passed
@jtojnar jtojnar deleted the idn branch May 30, 2023 12:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bundled idn/idna_convert.class.php can be replaced with mso/idna-convert
3 participants