Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix recursive anchor/alias stack overflow (#353) #360

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

korylprince
Copy link

As stated in Issue #353, the following yaml causes a stack overflow:

key1: &anchor
  subkey: *anchor
key2: *anchor

Having a self-referential anchor that is also referenced outside the anchor causes the parser (particularly nodeToValue) to recurse infinitely. I'm honestly not sure why a simpler self-referential anchor parses successfully, e.g.

key1: &anchor
  subkey: *anchor

In any case, this can be fixed simply by only visiting (e.g. calling nodeToValue on) alias nodes once. Since the only really useful type for self-referential aliases is a pointer to something, this works out. The first call will marshal onto the type, and the rest of the references will be resolved to the same alias pointer.

This PR does the following:

  • Adds a test case for the failing yaml
  • Adds the above logic by adding an aliasSet set field on Decoder and extra checks in nodeToValue
  • Fixes a wrapped nil error that made this issue even harder to debug (errors.Wrapf(nil, ...) prints as "<nil>" for most formatting flags in fmt.Print, despite the error itself not actually being equal to nil)

@goccy
Copy link
Owner

goccy commented Mar 19, 2023

It is not good to have a stack overflow, but I think this is natural for YAML to occur an error.

@goccy goccy added the reviewed label Mar 19, 2023
@korylprince
Copy link
Author

There's a clear use case and Go analog for recursive yaml: recursive structs:

type Key struct {
    SubKey *Key
}

There's cases of this in the wild, and this library currently is unable to parse files like this without this PR or a similar fix.

I hope you'll reconsider merging this. I'd be happy to make any changes you request to get this merged.

@goccy
Copy link
Owner

goccy commented Mar 22, 2023

I see. I understand that there are already cases where it is used. What I am wondering is whether it is correct to support it as a YAML Spec, and if it is explicitly in the Spec, I would like to support it.

This is because the existence of YAML already in use and a decoder that can parse it does not indicate that it exists in Spec.

@codecov-commenter
Copy link

Codecov Report

Merging #360 (a5ced5d) into master (11ad39b) will increase coverage by 0.02%.
The diff coverage is 83.33%.

📣 This organization is not using Codecov’s GitHub App Integration. We recommend you install it so Codecov can continue to function properly for your repositories. Learn more

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #360      +/-   ##
==========================================
+ Coverage   75.37%   75.40%   +0.02%     
==========================================
  Files          13       13              
  Lines        4545     4550       +5     
==========================================
+ Hits         3426     3431       +5     
  Misses        864      864              
  Partials      255      255              

@korylprince
Copy link
Author

The YAML spec doesn't mention recursiveness at all. Here's the only thing that comes close:

When composing a representation graph from serialized events, an alias event refers to the most recent event in the serialization having the specified anchor. Therefore, anchors need not be unique within a serialization

While it's a bit of a stretch, I could argue that, in the context of parsing yaml nodes to Go structs, "an alias event refers to the most recent event in the serialization having the specified anchor" could be interpreted to mean aliases should all point to the same anchor object (e.g. with a pointer).

From a more practical standpoint, this library already supports some simpler versions of recursion. What's more, surely an uncatchable stack overflow isn't ever the desired behavior over an intuitive behavior (e.g. recursive aliases -> recursive structs).

At the very least it's a safety issue since untrusted yaml could cause the application to crash.

@korylprince
Copy link
Author

I've rebased this PR onto the current master. @goccy is there any chance of getting this (or some form of this) merged?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants