Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Encode fragmentIdentifier with encodeURI method not encodeURIComponent at stringifyUrl #355

Conversation

hanbin9775
Copy link
Contributor

Fixes: #346

FragmentIdentifier should be encoded with encodeURI method not encodeURIComponent method.
Also added test code about this fix.

@sindresorhus
Copy link
Owner

Are you sure it's correct to use encodeURI? AFAIK, it's meant for complete URLs. Can you link to a source about it being correct to use in this situation?

@hanbin9775
Copy link
Contributor Author

hanbin9775 commented Dec 11, 2022

I did some investigation, and found out something wrong about my PR. Thank you for the feedback!

URL hash setter in Node.js

Node.js implements url.hash setter with this following.

Invalid URL characters included in the value assigned to the hash property are percent-encoded.

Url fragments are percent encoded with following percent-encode set. (As known as fragment percent-encode set)

C0 control percent-encode set and code points U+0020, U+0022, U+003C, U+003E, and U+0060.

C0 control percent-encode set is

code points in range U+0000 to U+001F (inclusive) and all code points greater than U+007E.

You can find list of code points in here.

This is also described in URL Standard. (https://url.spec.whatwg.org/#fragment-state)

encodeURI, encodeURIComponent

The only difference with encodeURI and encodeURIComponent is whether the following character set is encoded: ;/?:@&=+$,#
(https://tc39.es/ecma262/multipage/global-object.html#sec-encode)

Both methods encodes characters which is not the ASCII word characters and "-.!~*'()".

Conclusion: difference with encodeURI/encodeURIComponent and url.hash setter

Found out percent-encode sets are different.
If we express the character set that will display different encoding result, it will be as follows

  • Universal Character Set - (fragment percent-encode set + the ASCII word characters set + -.!~*'() + ;/?:@&=+$,#(only with encodeURI))

For example code point U+005B ( [ ) is not included on fragment percent-encode set but url.hash setter will percent-encode it.

const fragement = '[';
const x = new URL(url);
x.hash = fragment;

console.log(encodeURI(fragment)); // '%5B'
console.log(encodeURIComponent(fragment)); // '%5B'

console.log(url.hash); // '['

So I believe the easiest implement for fixing #355 issue will be like this.

as-is

if (object.fragmentIdentifier) {
	hash = `#${options[encodeFragmentIdentifier] ? encode(object.fragmentIdentifier, options) : object.fragmentIdentifier}`;
}

to-be

if (object.fragmentIdentifier) {
	const newURL = new URL(url);
	newURL.hash = object.fragmentIdentifier;
	hash = options[encodeFragmentIdentifier] ? newURL.hash : `#${object.fragmentIdentifier}`;
}

Or we could create a new function for encoding url fragment. encodeURIFragment

@hanbin9775 hanbin9775 force-pushed the fix/encode-fragmentIdentifier-with-encodeURI-method branch from 1d24ce0 to 12456f9 Compare December 11, 2022 08:33
@hanbin9775 hanbin9775 force-pushed the fix/encode-fragmentIdentifier-with-encodeURI-method branch from 12456f9 to ad639c1 Compare December 11, 2022 08:34
@sindresorhus sindresorhus merged commit 16a7b8f into sindresorhus:main Dec 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

why query-string encode the fragmentIdentifier?
2 participants