Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parsing "Microsoft encoding" is very lenient #60

Open
Carrotman42 opened this issue May 15, 2020 · 2 comments · May be fixed by #61
Open

Parsing "Microsoft encoding" is very lenient #60

Carrotman42 opened this issue May 15, 2020 · 2 comments · May be fixed by #61

Comments

@Carrotman42
Copy link

I was reading through the code for Parse and I noticed that there is no validation on the first and last characters of the uuid-to-parse when the length of the input is 38 characters:

uuid/uuid.go

Lines 38 to 61 in 16ca3ea

// Parse decodes s into a UUID or returns an error. Both the standard UUID
// forms of xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx and
// urn:uuid:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx are decoded as well as the
// Microsoft encoding {xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx} and the raw hex
// encoding: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx.
func Parse(s string) (UUID, error) {
var uuid UUID
switch len(s) {
// xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
case 36:
// urn:uuid:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
case 36 + 9:
if strings.ToLower(s[:9]) != "urn:uuid:" {
return uuid, fmt.Errorf("invalid urn prefix: %q", s[:9])
}
s = s[9:]
// {xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx}
case 36 + 2:
s = s[1:]
// xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
case 32:

I would expect there to be a check that s[0] == '{' && s[37] == '}' rather than simply ignoring those two characters. I am happy to send a PR if requested, it's a simple change. I am just verifying that this loose behavior is not actually desired.

(fwiw, it means that parsing something like a01234567-abcd-cdef-abcd-012345678901a would be parsed without an error, even though it has extra characters at the beginning and end. That's quite unexpected IMO.)

@pborman
Copy link
Collaborator

pborman commented May 19, 2020

Thank you for pointing this out. Yes, this is a bug and should be fixed. Would be happy to have you send in a PR for this.

Carrotman42 added a commit to Carrotman42/uuid that referenced this issue May 19, 2020
It was not intentional to simply ignore these characters, and examples/docs/tests all indicate that curly braces were the intended characters. See google#60
Carrotman42 added a commit to Carrotman42/uuid that referenced this issue May 19, 2020
It was not intentional to simply ignore these characters, and
examples/docs/tests all indicate that curly braces were the intended
characters. I've added some test cases to ensure things work.

Fixes google#60
@sazzer
Copy link

sazzer commented Nov 13, 2020

I've just come across this by virtue of the fact that:

  • "921ef402-cd4c-4be6-8483-a00566a43f60" -> Correctly succeeds
  • " 921ef402-cd4c-4be6-8483-a00566a43f60" -> Correctly fails to parse
  • "921ef402-cd4c-4be6-8483-a00566a43f60 " -> Correctly fails to parse
  • " 921ef402-cd4c-4be6-8483-a00566a43f60 " -> Unexpectedly succeeds!

Turns out it's not that it ignores whitespace if present at both ends, but it's just another version of this bug.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants