URL encoding ambiguities #168

Open
bdewater opened this issue Oct 23, 2022 · 0 comments
bdewater commented Oct 23, 2022

I noticed a discrepancy in how the implementations go about URL encoding.

  • The Python and PHP implementations escape the URL-encoded output with an extra %. This is mentioned nowhere in the specification.
  • The Go and JavaScript implementations do a simple URL encode.
  • The Java implementation does not add the extra % either, but it seems to URL encode twice and reverse the order.

This yields three different outputs for the same traceparent example value of congo=t61rcWkgMzE,rojo=00f067aa0ba902b7:

| Language | URL encoded traceparent |
| --- | --- |
| Python, PHP | congo%%3Dt61rcWkgMzE%%2Crojo%%3D00f067aa0ba902b7 |
| Go, JS | congo%3Dt61rcWkgMzE%2Crojo%3D00f067aa0ba902b7 |
| Java | rojo%253D00f067aa0ba902b7%2Ccongo%253Dt61rcWkgMzE |
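To make the differences concrete, here is a minimal Python sketch that reproduces the three strings from a single input. It only mimics the observed outputs; the variable names are mine, and the transformations (doubling `%`, and encoding each comma-separated part before reversing and encoding again) are my guess at what produces each string, not a reading of the actual implementations.

```python
from urllib.parse import quote

value = "congo=t61rcWkgMzE,rojo=00f067aa0ba902b7"

# Go/JS-style output: one plain percent-encoding pass.
single = quote(value, safe="")

# Python/PHP-style output: the same string with every "%" doubled.
percent_escaped = single.replace("%", "%%")

# Java-style output (assumed mechanism): encode each comma-separated part,
# reverse the order, join with "," and encode the whole string a second time.
parts = [quote(part, safe="") for part in value.split(",")]
double_reversed = quote(",".join(reversed(parts)), safe="")

print(single)
print(percent_escaped)
print(double_reversed)
```

Whatever the real cause in each implementation, the sketch shows that the three outputs are not interchangeable: decoding them once gives three different values.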

Consulting the spec answers some of this, but also raises more questions:

Meta characters such as ' should be escaped with a slash \.

The way this is written implies multiple characters, but only one is given?

  1. URL encode the value e.g. given /param first, that SHOULD become %2Fparam%20first
  2. Escape meta-characters within the raw value; a single quote ' becomes \'

This seems to make the Go/JavaScript implementations correct. What threw me off initially is that my language of choice (Ruby) already encodes ' as %27, so the second point seemed superfluous. Checking the other implementations, I noticed that JavaScript's encodeURIComponent does not behave like Ruby and leaves the ' intact. That raises the question: why use this workaround instead of URL encoding the ' as well?
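The interaction between the two spec steps can be sketched in Python. This assumes the meta-character escape is a backslash before the single quote, which is how I read the spec; `encode_strict`, `encode_js_like`, and `escape_meta` are hypothetical helper names, and `encode_js_like` only approximates encodeURIComponent by leaving `! * ' ( )` unescaped.

```python
from urllib.parse import quote


def encode_strict(value: str) -> str:
    # Percent-encode everything except letters, digits and "_.-~",
    # so a single quote becomes %27 (the behavior described for Ruby).
    return quote(value, safe="")


def encode_js_like(value: str) -> str:
    # Approximation of encodeURIComponent: "!", "*", "'", "(" and ")"
    # are additionally left untouched.
    return quote(value, safe="!*'()")


def escape_meta(value: str) -> str:
    # Step 2 as I read the spec: prefix each single quote with a backslash.
    return value.replace("'", "\\'")


print(encode_strict("/param first"))            # %2Fparam%20first
print(escape_meta(encode_strict("foo'bar")))    # foo%27bar
print(escape_meta(encode_js_like("foo'bar")))   # foo\'bar
```

With a strict encoder the escape step has nothing left to do, while with a JS-like encoder it is the only thing protecting the quote, so the same input foo'bar ends up with two different serialized forms.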

What I'd like to see:

  • A list of known URL encoding edge cases and their correct encoded form(s). Some are already listed in https://google.github.io/sqlcommenter/spec/#key-value-format, but that table contains a copy-paste error on the second row. I think the traceparent example and something like foo'bar would be useful additions.
  • The unit tests and readme files for all implementations in this repository should cover at least this list of cases.