Semicolons are erroneously encoded in query params #85

Open
marcelklehr opened this issue Jan 15, 2019 · 4 comments

Comments

@marcelklehr

marcelklehr commented Jan 15, 2019

Hey,

I've had a user report the following normalization:

normalize('https://my.otrs.dom/index.pl?Action=AgentTicketZoom;TicketID=707128') == 'https://my.otrs.dom/index.pl?Action=AgentTicketZoom%3BTicketID%3D707128'

...which according to the user didn't preserve the semantics of the URL.

Checking the RFC, it appears that ; and = belong to the sub-delims non-terminal, which is part of the set of reserved characters that should not be percent-encoded.

Am I missing something?

@sindresorhus
Owner

It's just URL encoded. It doesn't change any semantics of the URL:

const a = 'https://my.otrs.dom/index.pl?Action=AgentTicketZoom;TicketID=707128';
const b = 'https://my.otrs.dom/index.pl?Action=AgentTicketZoom%3BTicketID%3D707128';

new URL(a).searchParams.get('Action') === new URL(b).searchParams.get('Action')
//=> true

@marcelklehr
Author

Mh. I assume this is because the URL implementation simply treats ; as data, which is fine, but it's not canonical.

The above-mentioned RFC says:

  reserved    = gen-delims / sub-delims

  gen-delims  = ":" / "/" / "?" / "#" / "[" / "]" / "@"

  sub-delims  = "!" / "$" / "&" / "'" / "(" / ")"
              / "*" / "+" / "," / ";" / "="

The purpose of reserved characters is to provide a set of delimiting
characters that are distinguishable from other data within a URI.
URIs that differ in the replacement of a reserved character with its
corresponding percent-encoded octet are not equivalent. Percent-
encoding a reserved character, or decoding a percent-encoded octet
that corresponds to a reserved character, will change how the URI is
interpreted by most applications.
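
To make the quoted passage concrete, here is a minimal sketch of how the two URLs could diverge on a server that splits the raw query on ";" as well as "&". The function parseSemicolonQuery is hypothetical and written only for illustration; whether the OTRS backend parses its query string this way is an assumption, not something confirmed in this thread.

```javascript
// Hypothetical parser: treats both "&" and ";" as pair separators and
// splits the raw query BEFORE percent-decoding, so "%3B" stays data.
function parseSemicolonQuery(url) {
  const query = new URL(url).search.slice(1); // drop the leading "?"
  const params = {};
  for (const pair of query.split(/[&;]/)) {
    if (pair === '') continue;
    const eq = pair.indexOf('=');
    const rawName = eq === -1 ? pair : pair.slice(0, eq);
    const rawValue = eq === -1 ? '' : pair.slice(eq + 1);
    params[decodeURIComponent(rawName)] = decodeURIComponent(rawValue);
  }
  return params;
}

const a = 'https://my.otrs.dom/index.pl?Action=AgentTicketZoom;TicketID=707128';
const b = 'https://my.otrs.dom/index.pl?Action=AgentTicketZoom%3BTicketID%3D707128';

const parsedA = parseSemicolonQuery(a);
//=> { Action: 'AgentTicketZoom', TicketID: '707128' }
const parsedB = parseSemicolonQuery(b);
//=> { Action: 'AgentTicketZoom;TicketID=707128' }
```

Under that assumption the literal ";" acts as a delimiter in the first URL, while the encoded "%3B" in the second is plain data, so the TicketID parameter disappears.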

@marcelklehr
Author

marcelklehr commented Jan 16, 2019

Incidentally,

const a = 'https://my.otrs.dom/index.pl?Action=AgentTicketZoom;TicketID=707128';
const b = 'https://my.otrs.dom/index.pl?Action=AgentTicketZoom%3BTicketID%3D707128';

new URL(a).search === new URL(b).search
//=> false

@marcelklehr
Author

Alright, so the URI spec is being superseded by the URL spec, which uses the application/x-www-form-urlencoded format for the query string and that doesn't seem to care about the reserved characters in URIs. Wow.
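
As a rough illustration of that behaviour, the sketch below uses only the standard URL and URLSearchParams APIs. It shows that the parser splits on "&" alone, and that any operation which re-serializes the parameters (sorting is used here as an example) rewrites the query with ";" and the second "=" percent-encoded. That the library reaches its output by this exact route is an assumption.

```javascript
const original = 'https://my.otrs.dom/index.pl?Action=AgentTicketZoom;TicketID=707128';
const url = new URL(original);

// The form-urlencoded parser only splits on "&", so there is a single pair.
const names = [...url.searchParams.keys()];
//=> ['Action']

// Parsing alone leaves the query string untouched.
const before = url.search;
//=> '?Action=AgentTicketZoom;TicketID=707128'

// Mutating searchParams (here: sorting) re-serializes the query with the
// form-urlencoded serializer, which percent-encodes ";" and "=" in values.
url.searchParams.sort();
const after = url.search;
//=> '?Action=AgentTicketZoom%3BTicketID%3D707128'
```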
