New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
scanner: Find quoted token from context #327
Conversation
To resolve an issue with quoted map keys not correctly parsing, this PR changes how it progresses after scanning a quote token. The old behaviour was: 1. scanner reaches a : 2. pull the characters out of the current scan context buffer and turn it into the map key token 3. set the previous indent column to the beginning of the buffer So next time it gets to another key in the map, it'll check whether the indent was the same. When scanning a quote: 1. Scan from the beginning to the end of the quote sequence 2. Add a token, send back to the lexer, and reset the scan context In this scenario, it reaches the :, but nothing is in the scan context buffer because it just went ahead and added the quote token. The previous indent column doesn't get reset, so when it gets to the next map key it thinks the indent increased, which is invalid yaml. The new behaviour is essentially to not reset the context after scanning a quote, and when reaching a : check the context tokens, and if it's a quote set the previous indent column to that
Currently, test cases like EDIT: Yes, that is the way to escape single quotes in yaml. That's a new one to me! |
When scanning an escaped single quote `''` there was an off by one error because while the local `idx` variable is increased, the context's `idx` is not. This leads to the progress returned being off by one, so the scanner gets stuck in the `ctx.next()` loop forever, unable to progress.
Resolved the |
When we scan escape characters in quoted scalars, the local idx is shifted but not the underlying context. This adds a column skip to ensure the ctx progresses in sync with the local quote scanning.
`scanQuote` will add to the origin buffer for the purpose of creating the quote token, but would never reset it. This caused a problem when nothing would reset the buffer, i.e. when a quoted value was followed by a newline and the next map key.
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## master #327 +/- ##
==========================================
- Coverage 75.93% 75.63% -0.30%
==========================================
Files 13 13
Lines 4305 4359 +54
==========================================
+ Hits 3269 3297 +28
- Misses 799 819 +20
- Partials 237 243 +6 |
Okay, all existing tests appear to be passing. Going to add tests to properly cover all the new stuff and catch this potential regression in the future. |
I had adjusted the `:` scanning logic in a way that didn't cover all cases.
If you can think of any other tests cases I'll be happy to add them. |
Thank you for your contribution !! I read the implementation, but it seemed a bit complicated, so I implemented it too |
Made obsolete by a better solution in #328! |
Closes #301
To resolve an issue with quoted map keys not correctly parsing, this PR changes how it progresses after scanning a quote token.
The old behaviour was:
When scanning a quote:
The new behaviour is essentially to not reset the context after scanning a quote, and when reaching a : check the context tokens, and if it's a quote set the previous indent column to that