Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Skip schemas that don't have CEL rules in NewValidator #111483

Merged
merged 1 commit into from
Jul 28, 2022

Conversation

jpbetz
Copy link
Contributor

@jpbetz jpbetz commented Jul 28, 2022

What type of PR is this?

/kind bug

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #111158 (comment)

Structural schemas are considered valid even if the root object does not explicitly set Type: "Object". This wasn't noticed until the feature gate was enabled by default for CEL and k8s.io/kubernetes/test/integration/apiserver/apply: TestApplyCRDUnhandledSchema started failing consistently because it tests this specific case.

Special notes for your reviewer:

Does this PR introduce a user-facing change?

NONE

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. kind/bug Categorizes issue or PR as related to a bug. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Jul 28, 2022
@k8s-ci-robot k8s-ci-robot added sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Jul 28, 2022
@jpbetz
Copy link
Contributor Author

jpbetz commented Jul 28, 2022

/sig api-machinery
/triage accepted
/priority important-soon

cc @liggitt @cici37

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Jul 28, 2022
@cici37
Copy link
Contributor

cici37 commented Jul 28, 2022

Found the issue in

return validator(s, true, model.SchemaDeclType(s, isResourceRoot), perCallLimit)

Opened a pr against your branch: jpbetz#3

@cici37
Copy link
Contributor

cici37 commented Jul 28, 2022

Notes:
We have existing validation and test coverage for type required in root level:

case rootLevel:
allErrs = append(allErrs, field.Required(fldPath.Child("type"), "must not be empty at the root"))

And root level type must be object:
if lvl == rootLevel && len(s.Type) > 0 && s.Type != "object" {
allErrs = append(allErrs, field.Invalid(fldPath.Child("type"), s.Type, "must be object at the root"))
}

The failing test is to test "type: array" without any items at all is completely valid:

"openAPIV3Schema": {
"properties": {
"TypeFooBar": {
"type": "array"

So how about instead of adding type missing test case which ideally will not be reached, we could directly fix the currently integration test. The proposed PR: #111504

@jpbetz
Copy link
Contributor Author

jpbetz commented Jul 28, 2022

The failing test is to test "type: array" without any items at all is completely valid:

"openAPIV3Schema": {
"properties": {
"TypeFooBar": {
"type": "array"

The problem here is that the root schema has properties but no type, it it had a type it would look like:

"openAPIV3Schema": { 
 "type": "object",
  "properties": { 
    "TypeFooBar": { 
      "type": "array" 

@k8s-ci-robot k8s-ci-robot added the do-not-merge/contains-merge-commits Indicates a PR which contains merge commits. label Jul 28, 2022
@cici37
Copy link
Contributor

cici37 commented Jul 28, 2022

Thanks for addressing it. I agree we should add handle when type is empty(ideally it should not be hit though). Our existing validation process do validating if items value is missing when type=array or type missing which are bypassed by this integration test since it's calling etcdclient.Put directly(That is why it's been hit).

The problem here is that the root schema has properties but no type, it it had a type it would look like:

"openAPIV3Schema": { 
 "type": "object",
  "properties": { 
    "TypeFooBar": { 
      "type": "array" 

@jpbetz
Copy link
Contributor Author

jpbetz commented Jul 28, 2022

this integration test since it's calling etcdclient.Put directly(That is why it's been hit).

Good eye. Okay, I feel slightly better about this not being a general class of problems now. Thanks for digging in!

@liggitt
Copy link
Member

liggitt commented Jul 28, 2022

this integration test since it's calling etcdclient.Put directly(That is why it's been hit).

Good eye. Okay, I feel slightly better about this not being a general class of problems now. Thanks for digging in!

That is simulating CRDs created via v1beta1, which allowed missing type fields. Those CRDs can be updated via v1, so it's still a scenario that can be encountered.

@jpbetz
Copy link
Contributor Author

jpbetz commented Jul 28, 2022

That is simulating CRDs created via v1beta1, which allowed missing type fields. Those CRDs can be updated via v1, so it's still a scenario that can be encountered.

Oh right. Okay, so our options are:

  1. Do another structural schema validation check when loading CRDs from storage and if it fails, don't add a CEL validator at all, eliminating the need to support v1beta1 schemas (which can't have CEL rules anyway)
  2. Try to harden the code against all valid v1beta1 OpenAPIv3 schemas

(1) would need to be handled both for validation rule registration and default checking. (2) requires test coverage for all the valid v1beta1 schema differences

I'm inclined to work toward (1). Any thoughts?

@liggitt
Copy link
Member

liggitt commented Jul 28, 2022

  • eliminating the need to support v1beta1 schemas (which can't have CEL rules anyway)

the scenario is:

  1. create in antiquity past via v1beta1
  2. fixup enough to pass structural validation and add cel rules via v1

am I reading correctly that there are things that pass structural validation that CEL still chokes on?

@jpbetz jpbetz force-pushed the fix-missing-root-object-type branch from 27c7f60 to 7acb288 Compare July 28, 2022 17:45
@k8s-ci-robot k8s-ci-robot removed the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Jul 28, 2022
@k8s-ci-robot k8s-ci-robot added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. and removed do-not-merge/contains-merge-commits Indicates a PR which contains merge commits. labels Jul 28, 2022
@jpbetz
Copy link
Contributor Author

jpbetz commented Jul 28, 2022

the scenario is:

  1. create in antiquity past via v1beta1
  2. fixup enough to pass structural validation and add cel rules via v1

am I reading correctly that there are things that pass structural validation that CEL still chokes on?

No, CEL is only choking on things that fail structural validation. I've rewritten this fix to do another structural validation check whenever constructing a cel.NewValidator, we already have the structural validation check check on the CRD create/update operations, but when loading CRDs from disk and preparing for CR handling we didn't do another structural schema validation check, so this PR now adds that.

@jpbetz
Copy link
Contributor Author

jpbetz commented Jul 28, 2022

I've manually verified that TestApplyCRDUnhandledSchema passes with this fix with the feature gate enabled.

@liggitt
Copy link
Member

liggitt commented Jul 28, 2022

  1. When writing a CRD, do we ensure any CRD that contains CEL rules passes structural validation?
  2. Do we limit ourselves to calling cel.NewValidator to CRDs that contain CEL rules?

If both 1 and 2 are true, I'm not sure we need the additional structural validation check

@cici37
Copy link
Contributor

cici37 commented Jul 28, 2022

For 1, yes. Though we do all the validation(structural + cel validation) and save all possible errs back at once. Later we could possible improve it by skipping cel check if other validation err already identified(especially type err).
For 2, yes we use cel.NewValidator to validation cel rules.
I agree with Jordan on don't need another structural schema validation check for CR handling.

  1. When writing a CRD, do we ensure any CRD that contains CEL rules passes structural validation?
  2. Do we limit ourselves to calling cel.NewValidator to CRDs that contain CEL rules?

If both 1 and 2 are true, I'm not sure we need the additional structural validation check

@jpbetz
Copy link
Contributor Author

jpbetz commented Jul 28, 2022

  1. When writing a CRD, do we ensure any CRD that contains CEL rules passes structural validation?
  2. Do we limit ourselves to calling cel.NewValidator to CRDs that contain CEL rules?

If both 1 and 2 are true, I'm not sure we need the additional structural validation check

(2) is false, currently at least. It would be most optimal for for CRDs with no CEL rules to have a dedicated routine that checks for validation rules. cel.NewValidator COULD be modified to check as it goes, but that would be a more involved code change that would only benefitt CRDs with CEL rules, and I suspect by not all that much.

@jpbetz jpbetz force-pushed the fix-missing-root-object-type branch from 7acb288 to 2c4f514 Compare July 28, 2022 18:30
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Jul 28, 2022
@jpbetz
Copy link
Contributor Author

jpbetz commented Jul 28, 2022

Switched this to use a "does the structural schema contain cel validation rules?" check.

Co-authored-by: Cici Huang <cicih@google.com>
@jpbetz
Copy link
Contributor Author

jpbetz commented Jul 28, 2022

I've also opened #111519 to guard the validation codepath. I started looking at it more closely because of this PR. I had been planning to do a change in that code to hide CEL errors if there are schema errors anyway..

@liggitt
Copy link
Member

liggitt commented Jul 28, 2022

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 28, 2022
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jpbetz, liggitt

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 28, 2022
@jpbetz
Copy link
Contributor Author

jpbetz commented Jul 28, 2022

/retest

@liggitt liggitt changed the title Fix CEL validation to always include type=object at root Skip schemas that don't have CEL rules in NewValidator Jul 28, 2022
@cici37
Copy link
Contributor

cici37 commented Jul 28, 2022

/test pull-kubernetes-e2e-gce-ubuntu-containerd

@k8s-ci-robot k8s-ci-robot merged commit f3d90ae into kubernetes:master Jul 28, 2022
@k8s-ci-robot k8s-ci-robot added this to the v1.25 milestone Jul 28, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note-none Denotes a PR that doesn't merit a release note. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants