Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade yaml.v2 to yaml.v3, improve testing and bugfixes #61

Closed
wants to merge 1 commit into from

Conversation

inteon
Copy link
Member

@inteon inteon commented Sep 11, 2021

fixes: #58
fixes: #47
fixes: #45
fixes: #18

Thorough refactor of this repo, improves tests & tries to solve the most common problems (by upgrading to yaml.v3 and fixing some bugs).
JSONObjectToYAMLObject was removed, but this function does not seem to be used a lot (please correct me if I'm wrong).

This might have to be release as v2 of this go library.

@k8s-ci-robot
Copy link

Thanks for your pull request. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please follow instructions at https://git.k8s.io/community/CLA.md#the-contributor-license-agreement to sign the CLA.

It may take a couple minutes for the CLA signature to be fully registered; after that, please reply here with a new comment and we'll verify. Thanks.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. label Sep 11, 2021
@k8s-ci-robot
Copy link

Welcome @inteon!

It looks like this is your first PR to kubernetes-sigs/yaml 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/yaml has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: inteon
To complete the pull request process, please assign dims after the PR has been reviewed.
You can assign the PR to them by writing /assign @dims in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. and removed cncf-cla: no Indicates the PR's author has not signed the CNCF CLA. labels Sep 11, 2021
@lavalamp
Copy link

I think @liggitt is a better person to ask.

@luxas
Copy link

luxas commented Sep 16, 2021

Discussing with @inteon on the Kubernetes slack now (https://kubernetes.slack.com/archives/C0EG7JC6T/p1631778930193700?thread_ts=1631728530.180800&cid=C0EG7JC6T); there is a need to redesign this library more or less completely in a spec-driven way, i.e. write a KEP for what a v2 of this library should look like.

Furthermore, to be consistent with the k8s API machinery, we also need to depend on the same JSON logic from here https://github.com/kubernetes/apimachinery/blob/master/pkg/runtime/serializer/json/json.go#L113-L184, and hence I think creating a sigs.k8s.io/json encoding/json drop-in would be a prerequisite for a v2 of this library.

This again (sigs.k8s.io/json) would need its own KEP, due to a multitude of needs from kubernetes/enhancements#2082, the defaulting KEP, and kubernetes/enhancements#2970, the "strict" validation KEP (work in kubernetes/kubernetes#104433).

I have listed some issues needing resolution in https://github.com/luxas/deklarative/issues (a PoC/MVP repo for this kind of work), have some PoC work of what sigs.k8s.io/yaml v2 could look like in luxas/deklarative#8, and have identified some upstream issues that need fixing in https://github.com/json-iterator/go/issues. But there's more to this.

From the top of my head, these are the specifications:

  • sigs.k8s.io/json
  • sigs.k8s.io/yaml
    • Use YAML 1.2 (e.g. no implicit typing)
    • Most of the specifications of the above json package; hence use sigs.k8s.io/json for marshalling and unmarshalling Go types, to always be consistent with the JSON logic.
    • Disallow all YAML features not allowed in JSON, or convert (?), e.g. non-string keys.
    • Allow configurable sequence indentation level (problem in kyaml, ref yaml.v3 fork maintenance story kustomize#4033), and auto-detection.
    • Return struct errors
    • Make sure "framing" is supported, such that a stream of YAML documents can be split up individually

We need to make sure we agree on a specific subset of the YAML + JSON union that is considered "supported" for Kubernetes, and what is out of scope.

@liggitt
Copy link

liggitt commented Sep 16, 2021

kubernetes/kubernetes#105030 Is relevant to the json bits. It's more driven by an immediate desire to drop json-iterator because of correctness and compatibility reasons.

@luxas
Copy link

luxas commented Sep 16, 2021

Oh, wow, I had no idea that was going on, thanks for the heads up @liggitt, will definitely take a look 👍!

@liggitt
Copy link

liggitt commented Sep 16, 2021

That PR is heavily oriented toward:

  1. Replacing existing use of json-iterator (we only have two configurations we use, case-sensitive and strict)
  2. Avoiding exposing additional API surface area
  3. Using the standard library code as much as possible for compatibility and correctness reasons
  4. Limiting additions to things we think we can reasonably upstream, and eventually collapse back onto the standard decoder

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 22, 2021
@luxas luxas mentioned this pull request Sep 23, 2021
@luxas
Copy link

luxas commented Oct 21, 2021

A heads-up that https://github.com/kubernetes-sigs/json is now available to use (thanks @liggitt!!)

It satisfies the above criteria as it:

  • Is case-sensitive
  • Preserves numbers
  • Collects/returns strict errors completely independently (more like "warnings") from the "normal" decode errors
  • Allows reporting duplicate field / unknown field policy invalidations
  • Supports returning the invalid byte offset for syntax errors; which allows a higher-level library (maybe like this one to report the surrounding context)

Meanwhile, k8s has moved on to using json.Marshal (kubernetes/kubernetes#105466), not json-iter, to fix the json.RawMessage issue described. That is just a note, no action item needed, as this library already uses json.Marshal, good that we now have consistency.

@inteon would you be up for rebasing this PR to use this new json library?

@ncdc
Copy link

ncdc commented Oct 21, 2021

/uncc

@k8s-ci-robot k8s-ci-robot removed the request for review from ncdc October 21, 2021 11:51
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 21, 2021
@inteon
Copy link
Member Author

inteon commented Oct 21, 2021

@luxas I updated the PR, so it now uses the new json library!
Furthermore, I changed the API to match the json library API, eg. added an option to disallow duplicate fields (both DisallowUnknownFields and DisallowDuplicateFields should be supported).
Please let me know what you think, and feel free to suggest changes or add some extra testcases.

Copy link

@liggitt liggitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's a lot of changes going on here that make this a bit hard to review as a piece... is it possible to do each of the following independently:

  1. use sigs.k8s.io/json
  2. switch to yaml.v3
  3. make additions to the surface area exposed by this library to add new features

unmarshal.go Outdated Show resolved Hide resolved
go.mod Outdated Show resolved Hide resolved
marshal.go Outdated Show resolved Hide resolved
@inteon
Copy link
Member Author

inteon commented Oct 23, 2021

@liggitt thanks for the first review, I made some additional changes based on your feedback. Next step will be to split this PR into multiple smaller PRs. Feel free to provide additional feedback. I'll let you know once I've split this PR into multiple smaller PRs, but this might take some time as I'm quite busy right now.

Copy link

@luxas luxas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot @inteon for your great work 🥇 💯!

This is not a very straightforward change; it needs to be reasoned about quite deeply. I have added some comments on areas where we should still decide a way forward, but it probably requires some kind of broader SIG API Machinery and kyaml review and approval.

I don't really know if I've got the time to write a KEP around this right now, but it'd be great to formalize these things. Meanwhile, maybe we can discuss where we want to go in this PR?

marshal.go Outdated Show resolved Hide resolved
unmarshal.go Outdated Show resolved Hide resolved
unmarshal.go Outdated Show resolved Hide resolved
unmarshal.go Outdated Show resolved Hide resolved
unmarshal.go Outdated Show resolved Hide resolved
unmarshal.go Outdated Show resolved Hide resolved
unmarshal.go Outdated Show resolved Hide resolved
.github/workflows/go.yml Outdated Show resolved Hide resolved
marshal.go Outdated Show resolved Hide resolved
marshal.go Outdated Show resolved Hide resolved
@luxas
Copy link

luxas commented Oct 25, 2021

cc @laverya @fejta @sttts @neolit123 @dims from #26.

As pointed out by @inteon in #26 (comment):

gnostic now also uses yaml.v3: google/gnostic#194

it hasn't been released yet, however, I guess it'll come in a 0.6.x

@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Nov 3, 2021
@inteon inteon force-pushed the master branch 2 times, most recently from 84a7f62 to 34069b9 Compare November 3, 2021 20:48
@inteon
Copy link
Member Author

inteon commented Nov 3, 2021

@luxas thanks for the review, I tried to fix all of your comments & rebased the PR on #65.
Please let me know if there is still something that should be changed.

@k8s-ci-robot k8s-ci-robot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Nov 4, 2021
@inteon inteon requested a review from luxas November 4, 2021 16:03
Copy link

@luxas luxas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for rebasing and upgrading! This starts looking really good 🤩

btw, was there a reason for dropping earlier versions of Go in the CI?

yaml.go Outdated Show resolved Hide resolved
yaml.go Outdated Show resolved Hide resolved
yaml.go Outdated
Comment on lines 78 to 96
if jsonTarget.Kind() != reflect.Ptr || jsonTarget.IsNil() {
return nil, fmt.Errorf("provided object is not a valid pointer")
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was the remove / addition of this code snippet intended?

Copy link
Member Author

@inteon inteon Nov 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adding this check should be a solution for #30

yaml.go Outdated Show resolved Hide resolved
@@ -336,7 +348,7 @@ func convertToJSONableObject(yamlObj interface{}, jsonTarget *reflect.Value) (in
case int64:
s = strconv.FormatInt(typedVal, 10)
case float64:
s = strconv.FormatFloat(typedVal, 'g', -1, 32)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this need to merge to the current master branch too?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was a reference to float32 vs float64 (go-yaml/yaml#83) in the original code.

yaml.go Outdated Show resolved Hide resolved
yaml.go Show resolved Hide resolved
yaml.go Outdated Show resolved Hide resolved
yaml.go Outdated Show resolved Hide resolved
"strconv"
"testing"

"github.com/davecgh/go-spew/spew"
yaml "gopkg.in/yaml.v2"
)

/* Test helper functions */
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the JSONToYAML and Marshal tests I would expect the sequence indentation style to have changed to wide instead of compact, due to an upstream change. Is that true? As discussed earlier, kustomize carries a fork of go-yaml for that reason, to provide customizability.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That upstream change blocks migration to yaml.v3. We don't want any externally visible change in serialization behavior.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the information, I'll probably create a fork of go-yaml in this repo too & create an upstream PR (if no PR is present yet).

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we're open to forking the yaml repo at this point. The kustomize fork is intended to be temporary and they are pursuing options to bring the V3 library back into a compatible state

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the fixes kustomize needs to land in yaml.v3 to restore compatibility are either in go-yaml/yaml#756 or go-yaml/yaml#753

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, but can't we add a temporary fork here too? Seems like the kyaml team has been trying for a long time to get this option added. Would be great if we could not block this PR until upstream has fixed the issue.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, it's not urgent enough to move off yaml.v2 to take on the added uncertainty / maintenance cost of a fork

Copy link
Member Author

@inteon inteon Nov 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could use yaml.v3 for unmarshalling and yaml.v2 for marshalling; that would fix all problems, only downside is having 2 deps

@inteon
Copy link
Member Author

inteon commented Nov 5, 2021

Thanks for the review; looks like we're getting close to the "ideal" solution.
I think the biggest challenge that is still pending is the "sequence indentation style" and planning how to get this merged (maybe in different stages).

PS: the reason I removed the tests for go-versions below 1.17, is because sigs.k8s.io/json only supports that version

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 8, 2021
@k8s-ci-robot
Copy link

@inteon: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 9, 2022
@inteon
Copy link
Member Author

inteon commented Mar 9, 2022

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 9, 2022
@willfindlay
Copy link

Any updates on this? My team needs a fix for #58 :(

@dims
Copy link
Member

dims commented Aug 9, 2022

@inteon doesn't look like there is a possibility of this landing given the time and the situation has not changed ( #61 (comment) )

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
9 participants