if revision is inactive, scale to zero instead of waiting for last pod retention #15161

Merged

Conversation

eddy-oum
Contributor

@eddy-oum eddy-oum commented Apr 25, 2024

Fixes #13812

Proposed Changes

  • if revision is inactive, scale to zero instead of waiting for last pod retention

Release Note


…etention

Signed-off-by: eddy-oum <eddy.oum@kakaocorp.com>

knative-prow bot commented Apr 25, 2024

Welcome @eddy-oum! It looks like this is your first PR to knative/serving 🎉


knative-prow bot commented Apr 25, 2024

Hi @eddy-oum. Thanks for your PR.

I'm waiting for a knative member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@knative-prow knative-prow bot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Apr 25, 2024
@knative-prow knative-prow bot requested review from izabelacg and skonto April 25, 2024 03:02
@dprotaso
Member

/ok-to-test

@knative-prow knative-prow bot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Apr 25, 2024
@dprotaso
Member

/hold

I'm going on vacation but want to review this in detail when I'm back in 2 weeks. I made some changes in this area of the code before and broke some stuff, so I want to be cautious here.

cc @ReToCode @skonto for their reviews in the meantime

@knative-prow knative-prow bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 25, 2024
@dprotaso
Member

Note: with v1.14 coming out this week, the next release is in July, so there's no need to rush.

@ReToCode
Member

If I understand the discussion in #13812 correctly, I think this is fine. Dave is now on PTO; since he did the recent larger rework in that area, let's also wait for his opinion.

@skonto
Contributor

skonto commented Apr 26, 2024

Does this change respect the scale down delay, or are we already past that period?

@eddy-oum
Contributor Author

eddy-oum commented Apr 27, 2024

Does this change respect the scale down delay, or are we already past that period?

I think it is the latter; scale down delays are considered here, before handleScaleToZero is applied.

Edit: it seems the current code doesn't respect the scale down delay. Thanks.

@eddy-oum
Contributor Author

eddy-oum commented May 1, 2024

I am wondering if I should do the reachability check together with the last pod retention check, instead of returning early, so that unreachable revisions can go through other checks.

Edit: never mind, the above approach doesn't work.

I think another approach (instead of returning early in handleScaleToZero) would be to return 0 from lastPodRetention(pa *autoscalingv1alpha1.PodAutoscaler, cfg *autoscalerconfig.Config) if the pa is unreachable.

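// lastPodRetention returns how long the last pod should be retained.
func lastPodRetention(pa *autoscalingv1alpha1.PodAutoscaler, cfg *autoscalerconfig.Config) time.Duration {
	// Unreachable revisions get no retention period, so only the remaining checks apply.
	if pa.Spec.Reachability == autoscalingv1alpha1.ReachabilityUnreachable {
		return 0
	}
	// Prefer the value set on the PodAutoscaler itself, if any.
	d, ok := pa.ScaleToZeroPodRetention()
	if ok {
		return d
	}
	// Otherwise fall back to the configured default.
	return cfg.ScaleToZeroPodRetentionPeriod
}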

This approach lets the pa still go through the other checks, notably the scale-to-zero grace period.

	}, {
		label:         "scale to zero, if revision is unreachable do not wait for last pod retention",
		startReplicas: 1,
		scaleTo:       0,
		wantReplicas:  0,
		wantScaling:   true,
		paMutation: func(k *autoscalingv1alpha1.PodAutoscaler) {
			paMarkInactive(k, time.Now().Add(-gracePeriod))
			WithReachabilityUnreachable(k)
		},
		configMutator: func(c *config.Config) {
			c.Autoscaler.ScaleToZeroPodRetentionPeriod = 10 * gracePeriod
		},
	}, {
		label:         "revision is unreachable, but before deadline",
		startReplicas: 1,
		scaleTo:       0,
		wantReplicas:  0,
		wantScaling:   false,
		paMutation: func(k *autoscalingv1alpha1.PodAutoscaler) {
			paMarkInactive(k, time.Now().Add(-gracePeriod+time.Second))
			WithReachabilityUnreachable(k)
		},
		configMutator: func(c *config.Config) {
			c.Autoscaler.ScaleToZeroPodRetentionPeriod = 10 * gracePeriod
		},
		wantCBCount: 1,

The current implementation fails the second test, whereas returning 0 from lastPodRetention for unreachable revisions passes both tests.

What are your thoughts @ReToCode @skonto? Thanks!

@eddy-oum
Contributor Author

/retest

@dprotaso
Member

/hold cancel

@knative-prow knative-prow bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 30, 2024
Signed-off-by: eddy-oum <eddy.oum@kakaocorp.com>
@knative-prow knative-prow bot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jun 2, 2024
@eddy-oum
Contributor Author

eddy-oum commented Jun 2, 2024

/retest


knative-prow bot commented Jun 2, 2024

@eddy-oum: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| certmanager-integration-tests_serving_main | b4f6a8a | link | true | /test certmanager-integration-tests |

Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@dprotaso
Member

dprotaso commented Jun 2, 2024

/lgtm
/approve

@knative-prow knative-prow bot added the lgtm Indicates that a PR is ready to be merged. label Jun 2, 2024

knative-prow bot commented Jun 2, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dprotaso, eddy-oum

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@knative-prow knative-prow bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 2, 2024

codecov bot commented Jun 2, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 84.77%. Comparing base (b0dfed2) to head (fe7d6a1).

Additional details and impacted files
@@           Coverage Diff           @@
##             main   #15161   +/-   ##
=======================================
  Coverage   84.76%   84.77%           
=======================================
  Files         218      218           
  Lines       13478    13480    +2     
=======================================
+ Hits        11425    11428    +3     
+ Misses       1686     1685    -1     
  Partials      367      367           


@knative-prow knative-prow bot merged commit 4538823 into knative:main Jun 2, 2024
67 checks passed
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

Non-Active Revisions with ScaleToZeroPodRetention and min-non-active-revisions > 1 is not scaled down
4 participants