Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

follow up on #470->#473: no quotes when not absolutely necessary #484

Closed
wants to merge 3 commits into from

Conversation

GerHobbelt
Copy link

  • code includes round trip tests (object -> yaml.dump -> yaml.safeLoad -> must match object exactly)
  • this code was previously mentioned in the comments in #470 fix #473
  • the # comment and : colon ambiguities have been handled by tiny regex checks as these checks require more context than just the scanned char itself, hence not suitable for inclusion in the isPlainSafe function (or its interface has to change but that's another cost).

Post Postum Ponderings 😄

I did this work on a slightly augmented version of js-yaml and ItWorksForMe® 😉 but I still do wonder if there's a YAML spec situation lurking in the shadows that might trip this up after all...

In other words: can anyone come up with test case(s) that break this aggressive reduction of quoted strings?

diberry and others added 3 commits March 10, 2019 05:56
…ll round trip tests pass.

# Conflicts:
#	test/issues/0470.js
# Conflicts:
#	test/issues/0470.js
&& !isWhitespace(string.charCodeAt(string.length - 1));
&& !isWhitespace(string.charCodeAt(string.length - 1))
&& /\s#/.test(string) === false
&& /:(?:\s|$)/.test(string) === false;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All checks should be done via .charCodeAt() for perfomance reasons.

Also, this lines are not descriptive - difficult to understand.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would comments help (apart from the fact that these are otherwise undesirable, see also my comments further below about the new insights about this)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, comments help in such cases. Or descriptive helper function names (like isWhiteSpace, hasCommentPattern)

But regexps should not be used - see any other part of code, there are no regexps.

@@ -296,7 +287,7 @@ function chooseScalarStyle(string, singleLineOnly, indentPerLevel, lineWidth, te
// in case the end is missing a \n
hasFoldableLine = hasFoldableLine || (shouldTrackWidth &&
(i - previousLineBreak - 1 > lineWidth &&
string[previousLineBreak + 1] !== ' '));
string[previousLineBreak + 1] !== ' '));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think it doesn't worth to change indent here if that's not related to your issue.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was eslint --fix (make fix) in action

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

=> don't use eslint --fix :)

There is local make lint with appropriate config, and it does not report error on this line.

@GerHobbelt
Copy link
Author

In other words: can anyone come up with test case(s) that break this aggressive reduction of quoted strings?

Breaking my own stuff

// JSON format for object to dump:
{"":{"$,":{}}}

as that dumps to

: $,:

and on loading that input, js-yaml barfs a hairball due to the , comma in an unexpected place.

@GerHobbelt
Copy link
Author

After reconsidering the YAML spec and everything, I now believe this #470, #473 approach is severely flawed as you either need very context-sensitive checks to prevent it from producing illegal yaml output (under edge conditions, but see the example in the previous comment: you get breakage anyway and that's very much undesirable).

Instead, if you want fewer quoted strings in your yaml output (like I do too), we'd better augment the js-yaml dumper code to produce literal style or folded style string values, cf. https://yaml.org/spec/1.2/spec.html#id2760844

"Odd"/"irregular" keys should be quoted; at least I don't mind if those are.

In short: this pullreq is to be rejected as it introduces more risk of errors for arbitrary, otherwise legal, yaml inputs and is not the proper way forward as folded style and literal style output is meant for this type of requirement: "no or only few quoted strings wanted in the dumped output".

@puzrin
Copy link
Member

puzrin commented Apr 16, 2019

In short: this pullreq is to be rejected as it introduces more risk of errors for arbitrary, otherwise legal, yaml inputs and is not the proper way forward as folded style and literal style output is meant for this type of requirement: "no or only few quoted strings wanted in the dumped output".

Do you mean flow and block styles for collections?

// JSON format for object to dump:
{"":{"$,":{}}}

as that dumps to

: $,:

Great catch! Probably, it would be nice have such test, to protect from "too aggressive" quotes remove.

@puzrin puzrin mentioned this pull request Apr 16, 2019
@GerHobbelt
Copy link
Author

In short: this pullreq is to be rejected as it introduces more risk of errors for arbitrary, otherwise legal, yaml inputs and is not the proper way forward as folded style and literal style output is meant for this type of requirement: "no or only few quoted strings wanted in the dumped output".

Do you mean flow and block styles for collections?

For block scalars. To be precise: https://yaml.org/spec/1.2/spec.html#id2793652
notably the example there: https://yaml.org/spec/1.2/spec.html#id2793888
and the subsequent sections on clipping, chomping, etc. -- I wasn't aware of these chomping, etc. indicators before, but now I think js-yaml should attempt to produce/use those in the dumper so that block and folded style style can be used for values which do NOT end with a \n LF 😄

Copy-pasta example 8.1 from the linked spec above (layout errors here may happen; everyone should refer to the linked spec for the real deal):

- \| # Empty header↓
 literal 
- >1 # Indentation indicator↓
  ·folded 
- \|+ # Chomping indicator↓
  keep
  
- >1- # Both indicators↓
  ·strip
---

===>

%YAML 1.2
---
!!seq [
  !!str "literal\n",
  !!str "·folded\n",
  !!str "keep\n\n",
  !!str "·strip",
]

Great catch! Probably, it would be nice have such test, to protect from "too aggressive" quotes remove.

Another pullreq will be forthcoming this week for dialing up the fuzzy run count from default 100 to 3000: it only takes a few seconds runtime (granted, on my i7 dev box) but that increased number increases the chance everyone running the test rig will stumble across this and similar hard-to-catch failures, as ultimately it was the fuzzy test rig that produced that edge case right there 👍 -- the only trouble being #483, which simply meant the fuzz box only caught onto it every once in a while 😢 .
Of course, dialing up the run count does not guarantee such results, but when chance is all you've got, I like my chances increased. 😉

@GerHobbelt
Copy link
Author

Off Topic:

I'm looking into the yaml/yaml-test-suite and checking if I can make that one fly for js-yaml: it's loaded with edge cases and nasty stuff that's collected from multiple libs.
I've got a basic rig running as part of make test but it's still 141 errors to check out before it's good to go. 🤔

@perlpunk
Copy link

@GerHobbelt regarding https://github.com/yaml/yaml-test-suite
We are collecting results here: http://matrix.yaml.io/
The script we're using is built into the yaml-editor docker image and can be found here: https://github.com/yaml/yaml-editor/blob/master/sbin/js-yaml-json

Would be cool if you want to start using the test suite. Glad for any questions, comments, improvements =)

@GerHobbelt
Copy link
Author

@perlpunk Seen ya (just now 😄); I saw it and was looking for a fast-track quick-and-dirty approach that works for my "messing around with" js-yaml, so I grabbed your /test/*.tml and took it from there in a few iterations.

Should mention: beautiful stuff over at yours but lemme not pollute this issue any further and take that to a new one, as this is really a spin-off of this one.

@GerHobbelt
Copy link
Author

Discussion continues at #485; this pullreq is rejected as far as I'm concerned as it's the wrong approach, going via reducing isPlainSafe(), which is asking for trouble down the road.

@puzrin: maybe mark this one of mine as rejected? Thanks! 👍

@GerHobbelt GerHobbelt closed this Apr 16, 2019
@puzrin
Copy link
Member

puzrin commented Apr 16, 2019

Another pullreq will be forthcoming this week for dialing up the fuzzy run count from default 100 to 3000: it only takes a few seconds runtime (granted, on my i7 dev box) but that increased number increases the chance everyone running the test rig will stumble across this and similar hard-to-catch failures, as ultimately it was the fuzzy test rig that produced that edge case right there +1 -- the only trouble being #483, which simply meant the fuzz box only caught onto it every once in a while cry .
Of course, dialing up the run count does not guarantee such results, but when chance is all you've got, I like my chances increased. wink

Please, don't increase fuzzer's runs count in PR. Long tests are very annoying. Such things can be added via optional ENV var, but should not become a problem for other devs.

@rlidwka
Copy link
Member

rlidwka commented Dec 10, 2020

This was fixed by db3f529. Please let me know if it broke anything.

I think value in colon5: ":C123" in your example doesn't need to be quoted, maybe I've missed something.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants