Add CEL runtime cost into CR validation #108482

Merged
merged 1 commit into from Mar 14, 2022

cici37
Contributor

@cici37 cici37 commented Mar 3, 2022

What type of PR is this?

/kind feature

What this PR does / why we need it:

This is part of #107573. Based on PR: google/cel-go#494. Add runtime cost calculation of CEL into CR validation.

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Added CEL runtime cost calculation to CustomResource validation. CustomResource validation will fail if the runtime cost exceeds the budget.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Mar 3, 2022
@k8s-ci-robot k8s-ci-robot requested review from dchen1107, lavalamp and a team March 3, 2022 09:51
@k8s-ci-robot k8s-ci-robot added area/code-generation area/dependency Issues or PRs related to dependency changes sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Mar 3, 2022
@leilajal
Contributor

leilajal commented Mar 3, 2022

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. labels Mar 3, 2022
@fedebongio
Contributor

/cc @jpbetz

@k8s-ci-robot k8s-ci-robot requested a review from jpbetz March 3, 2022 17:46
@k8s-ci-robot k8s-ci-robot added sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Mar 3, 2022
@cici37 cici37 changed the title [WIP]Add CEL runtime cost into CR validation Add CEL runtime cost into CR validation Mar 3, 2022
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 3, 2022
@liggitt
Member

liggitt commented Mar 11, 2022

other than needing testing of the string-prefix-detection branch (strings.HasPrefix(err.Error(), "operation cancelled: actual cost limit exceeded")), this looks good. can go ahead and squash down.

once there's a test in place that fails if that prefix match doesn't catch a cost-exceeded error, this lgtm

@cici37 cici37 force-pushed the vendorCEL branch 3 times, most recently from adf92b3 to 5420655 Compare March 11, 2022 18:48
@cici37
Contributor Author

cici37 commented Mar 11, 2022

can go ahead and squash down

The test is in place for the string-prefix-detection branch (strings.HasPrefix(err.Error(), "operation cancelled: actual cost limit exceeded")). It would still pass without the prefix match because every error returned from CEL validation is caught and returned. The only difference for this specific error is that we wrap it with the message "call cost exceeds limit for rule". I have updated the message comparison in the test, and it now works as expected. Thank you for your patience.
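To illustrate why the test has to compare messages, here is a minimal sketch of the prefix-detection branch. It is not the code from this PR: `wrapRuleError` and `sampleCostErr` are hypothetical names, and only the prefix string and the wrapping message are taken from the comments above.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// costExceededPrefix is the message prefix quoted in the review comments.
const costExceededPrefix = "operation cancelled: actual cost limit exceeded"

// sampleCostErr and sampleOtherErr are hypothetical stand-ins for errors
// returned by CEL evaluation.
var (
	sampleCostErr  = errors.New(costExceededPrefix + ": budget used up")
	sampleOtherErr = errors.New("no such key: foo")
)

// wrapRuleError is a hypothetical helper. It rewrites cost-limit errors into a
// rule-specific message and returns every other error unchanged.
func wrapRuleError(rule string, err error) error {
	if err == nil {
		return nil
	}
	if strings.HasPrefix(err.Error(), costExceededPrefix) {
		return fmt.Errorf("call cost exceeds limit for rule: %s", rule)
	}
	return err
}

func main() {
	fmt.Println(wrapRuleError("self.x > 0", sampleCostErr))
	fmt.Println(wrapRuleError("self.x > 0", sampleOtherErr))
}
```

A test that only checks `err != nil` passes for both inputs, so it cannot tell whether the prefix branch ran. Comparing against the wrapped message is what makes the test fail when the prefix match misses.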

@cici37
Contributor Author

cici37 commented Mar 11, 2022

/test pull-kubernetes-e2e-kind-ipv6

@cici37 cici37 force-pushed the vendorCEL branch 2 times, most recently from e300bd8 to 50b9a0d Compare March 11, 2022 19:57
Member

@liggitt liggitt left a comment

Non-blocking for this PR, but make sure there's a follow-up item to make the construction and detection of "budget exceeded" errors consistent, using helper constructor/detector functions. Right now we're returning slightly different error messages in four places and detecting them with strings.Contains in at least 2-3 places.
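One possible shape for such helpers, sketched with the standard library only. The names `errBudgetExceeded`, `newBudgetExceededError`, and `isBudgetExceeded` are assumptions for illustration and do not come from this PR or from cel-go.

```go
package main

import (
	"errors"
	"fmt"
)

// errBudgetExceeded is a hypothetical sentinel shared by every call site.
var errBudgetExceeded = errors.New("validation cost budget exceeded")

// newBudgetExceededError builds the error in one place so the wording
// cannot drift between call sites.
func newBudgetExceededError(rule string) error {
	return fmt.Errorf("rule %q: %w", rule, errBudgetExceeded)
}

// isBudgetExceeded detects the error by identity rather than by message text.
func isBudgetExceeded(err error) bool {
	return errors.Is(err, errBudgetExceeded)
}

func main() {
	err := newBudgetExceededError("self.replicas <= 5")
	fmt.Println(err, isBudgetExceeded(err))
	fmt.Println(isBudgetExceeded(errors.New("some other failure")))
}
```

Wrapping with `%w` keeps the rule-specific context in the message while letting `errors.Is` match the sentinel, so callers no longer depend on substring checks.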

t.Errorf("expect err of running out of cost budget but did not find")
}
if meetErr != 3 {
t.Errorf("expect 3 errs of running out of cost budget returned but get %v errs", meetErr)
Member

this is surprising... once we exceed our budget, we should stop evaluating further rules, so we should get back at most one "budget exceeded" error

Member

if this requires threading/propagating remainingCost through the default validation function, I'm ok doing this in a follow-up, but it does need to be done
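A rough sketch of what threading the remaining budget through a recursive validator could look like. The `node` type and this `validate` signature are hypothetical simplifications, not the Kubernetes code: rule costs are fixed integers here, whereas real costs are only known after evaluating a rule.

```go
package main

import "fmt"

// node is a hypothetical stand-in for a schema node with validation rules.
type node struct {
	ruleCosts []int64
	children  []*node
}

// validate returns the collected errors and the budget left after this
// subtree. A negative remaining budget tells every caller to stop evaluating.
func validate(n *node, remaining int64) ([]string, int64) {
	var errs []string
	for _, c := range n.ruleCosts {
		remaining -= c
		if remaining < 0 {
			return append(errs, "validation cost budget exceeded"), remaining
		}
	}
	for _, child := range n.children {
		childErrs, left := validate(child, remaining)
		errs = append(errs, childErrs...)
		remaining = left
		if remaining < 0 {
			// Budget is gone: skip the remaining siblings entirely.
			return errs, remaining
		}
	}
	return errs, remaining
}

func main() {
	root := &node{
		ruleCosts: []int64{4},
		children: []*node{
			{ruleCosts: []int64{5}},
			{ruleCosts: []int64{5}},
			{ruleCosts: []int64{5}},
		},
	}
	errs, left := validate(root, 10)
	fmt.Println(len(errs), left)
}
```

Because every level returns as soon as the budget goes negative, the third child is never evaluated and only one "budget exceeded" error is reported.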

Contributor Author

I have updated the code to handle the CEL cost budget exceeded error separately. Would you mind checking the latest commit to see if it makes sense? Thank you.

@cici37 cici37 force-pushed the vendorCEL branch 3 times, most recently from 744967d to 74e3ae2 Compare March 13, 2022 19:35
}

// validate is the recursive step func for the validation. insideMeta is true if s specifies
// TypeMeta or ObjectMeta. The SurroundingObjectFunc f is used to validate defaults of
// TypeMeta or ObjectMeta fields.
func validate(pth *field.Path, s *structuralschema.Structural, rootSchema *structuralschema.Structural, f SurroundingObjectFunc, insideMeta, requirePrunedDefaults bool) (field.ErrorList, error) {
// If CEL validation cost budget exceeded, the error will be saved in returned arg error instead of allErrs. The caller could handle it separately if needed.
Member

If CEL validation cost budget exceeded, the error will be saved in returned arg error instead of allErrs.

Is that accurate? This seems like it is propagating cost-exceeded errors in allErrs:

if remainingCost < 0 {
	return allErrs, nil, remainingCost
}

as an aside, if we did propagate a cost error in err, that would propagate to callers of ValidateDefaults, and would be handled by this block, which has a comment that we never expect to encounter an error:

} else if validationErrors, err := structuraldefaulting.ValidateDefaults(fldPath.Child("openAPIV3Schema"), ss, true, opts.requirePrunedDefaults); err != nil {
	// this should never happen
	allErrs = append(allErrs, field.Invalid(fldPath.Child("openAPIV3Schema"), "", err.Error()))
} else {

Member

I think the code (propagating cost errors in allErrs) is actually what we want, and this godoc should change to match what we're actually doing

Contributor Author

@cici37 cici37 Mar 14, 2022

It was left over from a previous change. I have removed the incorrect comment. Sorry about that.

@liggitt
Member

liggitt commented Mar 14, 2022

/lgtm
/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cici37, liggitt

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm "Looks good to me", indicates that a PR is ready to be merged. labels Mar 14, 2022
@k8s-ci-robot k8s-ci-robot merged commit 866e423 into kubernetes:master Mar 14, 2022
@k8s-ci-robot k8s-ci-robot added this to the v1.24 milestone Mar 14, 2022
@cici37 cici37 deleted the vendorCEL branch March 14, 2022 22:09
Labels
api-review Categorizes an issue or PR as actively needing an API review. approved Indicates a PR has been approved by an approver from all required OWNERS files. area/code-generation area/dependency Issues or PRs related to dependency changes cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
7 participants