Issues with UTF-8 encoded href attributes #1142

Open
rustyconover opened this issue Jan 18, 2021 · 1 comment

Comments

@rustyconover
Contributor

Hi Team,

Assume you have a UTF-8 encoded HTML file in which the href attribute of an <a> tag contains an emoji. The emoji is not percent-encoded or otherwise escaped; it is just a well-formed UTF-8 character. For example:

<a href="https://rusty.today/🏈">Test link</a>

Attempting to move that asset throws this error:

URIError: moveAssets transform: URI malformed
    at encodeURIComponent (<anonymous>)
    at String.replace (<anonymous>)
    at AssetGraph.addAsset (/Users/rusty/Development/assetgraph/lib/AssetGraph.js:241:45)
    at Html._vivifyRelation (/Users/rusty/Development/assetgraph/lib/assets/Asset.js:1090:50)
    at /Users/rusty/Development/assetgraph/lib/assets/Asset.js:1108:34
    at Array.map (<anonymous>)
    at Html.get outgoingRelations [as outgoingRelations] (/Users/rusty/Development/assetgraph/lib/assets/Asset.js:1107:73)
    at gatherExternalRelations (/Users/rusty/Development/assetgraph/lib/assets/Asset.js:1180:44)
    at Html.get externalRelations [as externalRelations] (/Users/rusty/Development/assetgraph/lib/assets/Asset.js:1187:7)
    at Html.set url [as url] (/Users/rusty/Development/assetgraph/lib/assets/Asset.js:1022:37)
    at /Users/rusty/Development/assetgraph/lib/util/assetMover.js:21:17
    at moveAssets (/Users/rusty/Development/assetgraph/lib/transforms/moveAssets.js:10:7)
    at Immediate._onImmediate (/Users/rusty/Development/assetgraph/lib/AssetGraph.js:653:25)

The error is thrown by this code in addAsset():

          assetConfig.url = assetConfig.url.replace(
            /[^\x21-\x7f]/g,
            encodeURIComponent
          );

That code mangles the nicely formed emoji character.
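
A minimal sketch of the likely mechanism, assuming the failure is caused by the regex lacking the `u` flag: without it, the pattern matches one UTF-16 code unit at a time, so `encodeURIComponent` receives each half of the emoji's surrogate pair separately and throws on the lone surrogate. The function names `encodeOriginal` and `encodeByCodePoint` are hypothetical and not part of assetgraph; only the regex and the `encodeURIComponent` callback come from the snippet above.

```javascript
const emojiUrl = "https://rusty.today/🏈";

// Pattern as quoted from addAsset(): no `u` flag, so the regex matches
// individual UTF-16 code units and splits the emoji's surrogate pair.
function encodeOriginal(input) {
  return input.replace(/[^\x21-\x7f]/g, encodeURIComponent);
}

// Hypothetical variant: with the `u` flag the regex matches whole code
// points, so the callback receives the complete emoji.
function encodeByCodePoint(input) {
  return input.replace(/[^\x21-\x7f]/gu, (match) => encodeURIComponent(match));
}

let originalError = null;
try {
  encodeOriginal(emojiUrl);
} catch (err) {
  originalError = err;
}

console.log(originalError instanceof URIError); // true
console.log(encodeByCodePoint(emojiUrl)); // https://rusty.today/%F0%9F%8F%88
```

This only illustrates the difference between the two patterns. Input that contains a genuinely unpaired surrogate would still throw with the `u` flag, so a real fix may need extra handling.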

Rusty

@papandreou
Member

To be fair, unencoded emojis are invalid per https://tools.ietf.org/html/rfc3986 because they aren't ASCII. Still, it would probably be good to try to be as forgiving as browsers are.
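
For comparison, here is a small sketch of the browser-style behavior, using the WHATWG `URL` class that is global in browsers and Node.js. It is shown only to illustrate what "forgiving" looks like, not as a suggestion for how assetgraph should implement it; the variable names are illustrative.

```javascript
// The WHATWG URL parser percent-encodes non-ASCII path characters as
// UTF-8 bytes instead of rejecting the input.
const parsed = new URL("https://rusty.today/🏈");

console.log(parsed.pathname); // /%F0%9F%8F%88
console.log(parsed.href); // https://rusty.today/%F0%9F%8F%88

// Input that is already percent-encoded is not encoded a second time.
const encodedHref = new URL("https://rusty.today/%F0%9F%8F%88").href;
console.log(encodedHref === parsed.href); // true
```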
