A '@' character in the host part of file URLs #805

Open
hayatoito opened this issue Dec 5, 2023 · 2 comments
Labels
topic: file (Aren't file: URLs the best?), topic: parser

Comments

@hayatoito
Member

hayatoito commented Dec 5, 2023

(Reported in https://crbug.com/1502849)

It appears that Windows uses file URLs with '@' (U+0040) characters in their host parts, such as file://webdavserver.net@ssl/a.pdf.

However, according to my understanding, file://webdavserver.net@ssl/a.pdf is an invalid URL in the URL Standard because '@' is considered a forbidden host code point.

To ensure compatibility with Windows file URLs, should we consider allowing the '@' character in the host part of file URLs?

I'd appreciate hearing opinions of the URL Standard folks on this matter.

annevk added the "topic: parser" and "topic: file" (Aren't file: URLs the best?) labels Dec 5, 2023
@annevk
Member

annevk commented Dec 5, 2023

It seems reasonable to allow, but I wonder if it would be possible for Chromium to determine the complete set of changes needed for it to not have platform-divergent behavior. At least I suspect that making them all at once would allow for an easier rollout.

@karwa
Contributor

karwa commented Dec 5, 2023

A single @ in the authority section (username, password, hostname, port) generally delimits the credentials from the hostname.

Let's take any other URL scheme, e.g. HTTP: http://webdavserver.net@ssl/

  • "webdavserver.net" would be the username
  • "ssl" would be the hostname

Which is clearly not what the reporter wants to happen.
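To illustrate the split described above, here is a minimal sketch using Python's standard `urllib.parse`. This is a generic RFC-style parser, not an implementation of the URL Standard, so it only shows how a typical parser treats a single @ in the authority. The helper name `authority_parts` is made up for this example.

```python
from typing import Optional, Tuple
from urllib.parse import urlsplit


def authority_parts(url: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (username, hostname) as urllib.parse interprets the authority."""
    parts = urlsplit(url)
    return parts.username, parts.hostname


# The same authority under two schemes. urllib.parse splits the authority
# at '@' regardless of scheme, so both are read as credentials + host.
for url in ("http://webdavserver.net@ssl/", "file://webdavserver.net@ssl/a.pdf"):
    username, hostname = authority_parts(url)
    print(f"{url}: username={username!r}, hostname={hostname!r}")
```

In both cases the part before the @ is reported as the username and "ssl" as the hostname, which matches the interpretation in the list above.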

What's more, this has been the accepted interpretation for at least the last 30 years (going back to RFC 1738). I doubt many URL parsers are going to interpret file://webdavserver.net@ssl/ as having a hostname that contains an @ sign, so the output of the URL parser must keep the @ escaped to properly encode its understanding of the URL components. file://webdavserver.net%40ssl/ is the semantically correct form.

I think the actual problem is that hostnames in file URLs cannot contain percent-encoding. I looked into this in depth a while back and found that:

  • UNC server names can contain spaces, which obviously must be escaped. Chromium even has test cases for this (see linked issue below), so you'll probably hit this sooner or later.
  • Windows allows pre-canonicalised paths, which are expressed using UNC syntax with the hostname "?" (e.g. \\?\C\SomePath). These cannot be expressed using file URLs because the hostname would have to be %3F.
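As a rough sketch of what percent-encoded file hosts would look like, the snippet below escapes a few candidate host names with Python's `urllib.parse.quote`. It assumes a parser that tolerates percent-encoding in the host, which the URL Standard currently does not for file URLs. The helper `escape_host` and the server name "my server" are hypothetical, chosen only for illustration.

```python
from urllib.parse import quote, unquote, urlsplit


def escape_host(raw_host: str) -> str:
    """Percent-encode every character outside the unreserved set."""
    return quote(raw_host, safe="")


# Host names taken from the discussion, plus one made-up name with a space.
for raw in ("webdavserver.net@ssl", "my server", "?"):
    print(f"{raw!r} -> {escape_host(raw)!r}")

# With the '@' escaped, a generic parser no longer sees credentials,
# and the original host name can be recovered by decoding.
parts = urlsplit("file://" + escape_host("webdavserver.net@ssl") + "/a.pdf")
print(parts.username, unquote(parts.hostname))
```

The escaped forms are `webdavserver.net%40ssl`, `my%20server`, and `%3F`, the last of which is the hostname the pre-canonicalised path case would need.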

See #599

@whatwg whatwg deleted a comment from Ur100 Feb 5, 2024