Preserve order of keys on load and dump #68

Open
shoogle opened this issue Jul 20, 2019 · 2 comments

shoogle commented Jul 20, 2019

Ordinary YAML does not mandate any particular ordering of keys. This means that when a user creates a YAML file, such as this:

key1: First value
key2: Second value
key3: Third value

It is perfectly valid for a YAML dumper to spit out this:

key2: Second value
key1: First value
key3: Third value

Not only does this reordering make a large file more difficult to read and understand, it also creates a false diff:

+key2: Second value
 key1: First value
-key2: Second value
 key3: Third value

It is easy to see how this could obscure a genuine change. Also, if the file is under version control then it will increase the size of a code repository for no good reason.
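The false diff can be reproduced with a small sketch. It uses only Python's standard `difflib`; the `parse` and `changed_lines` helpers are hypothetical names made up for this illustration, and `parse` is a toy stand-in for a YAML loader that only handles flat `key: value` lines.

```python
import difflib


def parse(lines):
    """Toy stand-in for a YAML loader: flat 'key: value' lines to a dict."""
    return dict(line.split(": ", 1) for line in lines)


def changed_lines(before, after):
    """Return only the added/removed lines of a unified diff."""
    diff = difflib.unified_diff(before, after, lineterm="")
    return [
        line
        for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]


original = [
    "key1: First value",
    "key2: Second value",
    "key3: Third value",
]
# The same mapping, as a dumper that ignores key order might write it.
reordered = [
    "key2: Second value",
    "key1: First value",
    "key3: Third value",
]

# The data is identical once parsed...
same_data = parse(original) == parse(reordered)
# ...but a line-based diff still reports a change.
false_diff = changed_lines(original, reordered)
```

Here `same_data` is true while `false_diff` is non-empty, which is the mismatch described above: nothing changed in the data, yet the diff shows a change.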

Possibly the StrictYAML parser already preserves ordering (I don't know, I haven't tried it yet). However, my argument is that this should be more than just an implementation detail. I believe that this should be an actual feature of the StrictYAML specification, either as a MUST or at least STRONGLY RECOMMENDED, and brought to the attention of implementors with a suitable example and justification.

P.S. Thank you for inventing StrictYAML!

Author

shoogle commented Jul 20, 2019

It may interest you to learn that PyYAML, the official YAML parser, now provides an option to preserve key order during loading and dumping. The issue was discussed in yaml/pyyaml#110, where I gave the following argument in favour of preservation:

Since the [standard YAML] spec doesn't guarantee an order, that means any order is valid. PyYAML could return dict keys in any arbitrary order [...] and it would still be perfectly consistent with the YAML specification.

In practice, the only ordering that makes any sense is the order in which [the keys] were created, because if they are returned in a different order then the information about which was created first is lost forever. If the user requires any other form of ordering (alphabetical, etc.), then he/she is able to sort the dict themselves after it has been returned in creation order. However, if the dict is not returned in creation order then the user can never put it back in creation order (except by a lucky guess).

Even the original YAML specification concedes that dumpers have to choose an ordering; it just says that people shouldn't rely on them choosing the same ordering. However, it creates problems if dumpers choose different orderings, so it ends up being better just to mandate one order as being correct, and the only ordering that it makes sense to use is the order that the user has already chosen.
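The point about creation order being unrecoverable can be shown with a short sketch. It assumes Python 3.7 or later, where plain dicts keep insertion order; the key names are invented for the example.

```python
import itertools

# Keys in the order the user wrote them (dicts keep insertion order in 3.7+).
created = {"zebra": 1, "apple": 2, "mango": 3}

# Going from creation order to any other order is easy: just sort.
alphabetical = dict(sorted(created.items()))

# Going back is not possible: given only the sorted keys, every
# permutation is an equally plausible candidate for the creation order.
candidates = list(itertools.permutations(alphabetical))
```

With three keys there are already six candidate orderings and nothing in `alphabetical` says which one the user chose, so a loader that discards creation order discards information that cannot be rebuilt afterwards.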

Owner

crdoconnor commented Jul 22, 2019 via email
