Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scheduling v1beta3 #104251

Merged

Conversation

ravisantoshgudimetla
Copy link
Contributor

@ravisantoshgudimetla ravisantoshgudimetla commented Aug 9, 2021

What type of PR is this?

/kind api-change

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?

Introduce v1beta3 api for scheduler. This version 
- increases the weight of user specifiable priorities.
The weights of following priority plugins are increased
  - TaintTolerations to 3 - as leveraging node tainting to group nodes in the cluster is becoming a widely-adopted practice
  - NodeAffinity to 2
  - InterPodAffinity to 2

- Won't have HealthzBindAddress, MetricsBindAddress fields 

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

KEP: https://github.com/kubernetes/enhancements/pull/2850/

@k8s-ci-robot k8s-ci-robot added kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Aug 9, 2021
@k8s-ci-robot
Copy link
Contributor

@ravisantoshgudimetla: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Aug 9, 2021
@ravisantoshgudimetla
Copy link
Contributor Author

/sig scheduling

@k8s-ci-robot k8s-ci-robot added sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Aug 9, 2021
@ravisantoshgudimetla ravisantoshgudimetla changed the title Scheduling v1beta3 [wip] Scheduling v1beta3 Aug 9, 2021
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 9, 2021
{Name: names.PodTopologySpread, Weight: pointer.Int32Ptr(2)},
// Weight is tripled because:
// - This is a score coming from user preference.
// - Usage of node tainting to group nodes in the cluster is increasing becoming a use-case
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note to myself: typo: increasingly becoming

@ravisantoshgudimetla ravisantoshgudimetla changed the title [wip] Scheduling v1beta3 Scheduling v1beta3 Aug 10, 2021
@k8s-ci-robot k8s-ci-robot added area/test sig/testing Categorizes an issue or PR as relevant to SIG Testing. and removed do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. labels Aug 10, 2021
@k8s-triage-robot
Copy link

This PR may require API review.

If so, when the changes are ready, complete the pre-review checklist and request an API review.

Status of requested reviews is tracked in the API Review project.

@ravisantoshgudimetla
Copy link
Contributor Author

/retest

Copy link
Contributor

@damemi damemi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ravisantoshgudimetla this needs a release note, but like we talked about offline it's all pretty straightforward to me. The actual changes (increasing the default plugin weights) are fairly small but this sets up v1beta3 nicely for us to get in final changes before promoting to v1.

/cc @ahg-g @alculquicondor
one question I have is do we want to make this beta3 or just go ahead with starting work on v1, along with Abdullah's suggestion for the simplified config (#102303). what's the anticipated release time for v1, and does adding this new beta version add more work/time for us to eventually deprecate?

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Aug 16, 2021
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 4, 2021
Staging changes for scheduler v1beta3 api.
xref: kubernetes/enhancements#2850
Changes in pkg/apis/scheduler reflecting the changes
from staging. The weights of following priority plugins
were increased:
- TaintTolerations
- NodeAffinity
- InterPodAffinity
as they're user facing priorities and they should be more
influential when compared to default plugins which don't
have any user say.

xref: kubernetes/enhancements#2850
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 5, 2021
@ravisantoshgudimetla
Copy link
Contributor Author

/test pull-kubernetes-unit

@alculquicondor
Copy link
Member

Please don't include the fix to PreferNominatedNode in this PR. Create a separate one.

@ravisantoshgudimetla
Copy link
Contributor Author

Please don't include the fix to PreferNominatedNode in this PR. Create a separate one.

Done - #105509

Instead of changing the node names, I'd prefer to move L1343~1357 into t.Run().

Also nodeConfig seems redundant as node can be easily composed using st.Node().Cap()....Obj()

I don't have a strong preference one way or the other. I included your suggestions here - #105509

PTAL @alculquicondor @Huang-Wei

@ravisantoshgudimetla
Copy link
Contributor Author

This is ready for review again.

badRemovedPlugins1.Profiles[0].Plugins.Score.Enabled = append(badRemovedPlugins1.Profiles[0].Plugins.Score.Enabled, config.Plugin{Name: "ServiceAffinity", Weight: 2})

badRemovedPlugins2 := validConfig.DeepCopy()
badRemovedPlugins2.APIVersion = "kubescheduler.config.k8s.io/v1beta3" // hypothetical, v1beta3 doesn't exist
badRemovedPlugins2.APIVersion = "kubescheduler.config.k8s.io/v1beta3" // hypothetical, v1beta3
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's not hypothetical anymore.

@ravisantoshgudimetla ravisantoshgudimetla force-pushed the scheduling-v1beta3 branch 2 times, most recently from 78f680b to 079a7eb Compare October 6, 2021 21:50
badRemovedPlugins1.Profiles[0].Plugins.Score.Enabled = append(badRemovedPlugins1.Profiles[0].Plugins.Score.Enabled, config.Plugin{Name: "ServiceAffinity", Weight: 2})

badRemovedPlugins2 := validConfig.DeepCopy()
badRemovedPlugins2.APIVersion = "kubescheduler.config.k8s.io/v1beta3" // hypothetical, v1beta3 doesn't exist
badRemovedPlugins2.APIVersion = "kubescheduler.config.k8s.io/v1beta3" // v1beta3
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No point having this test anymore.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we still need it, considering the validConfig points to v1beta2 here

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, but you already have a test suite for v1beta3 below.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just not about having the same test. It is also about having the apiVersion to validation testing mapping. Say if we forget to copy some logic or files specific to validation, we can surface those errors in appropriate versions.

badRemovedPlugins1.Profiles[0].Plugins.Score.Enabled = append(badRemovedPlugins1.Profiles[0].Plugins.Score.Enabled, config.Plugin{Name: "ServiceAffinity", Weight: 2})

badRemovedPlugins2 := validConfig.DeepCopy()
badRemovedPlugins2.APIVersion = "kubescheduler.config.k8s.io/v1beta3" // v1beta3
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also no point for this one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, it may not add much value, considering v1beta3 is the api we're testing now. It was a copy-paste error. Will remove it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Member

@alculquicondor alculquicondor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 7, 2021
@ravisantoshgudimetla
Copy link
Contributor Author

/hold cancel

based on #104251 (review)

/retest

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 7, 2021
@k8s-ci-robot k8s-ci-robot merged commit 9b45983 into kubernetes:master Oct 7, 2021
@k8s-ci-robot k8s-ci-robot added this to the v1.23 milestone Oct 7, 2021
@ravisantoshgudimetla ravisantoshgudimetla deleted the scheduling-v1beta3 branch October 7, 2021 18:04
ravisantoshgudimetla added a commit to ravisantoshgudimetla/machine-config-operator that referenced this pull request Dec 10, 2021
All the upgrade candidate nodes in a mcp would be applied
`UpdateInProgress: PreferNoSchedule` taint.
The taint will be removed MCD once the upgrade is complete.
Since kubernetes/kubernetes#104251 landed,
the nodes not having PreferNoSchedule taint will have higher score.
Before the upgrade starts, MCC will taint all the nodes in the
cluster that are supposed to be upgraded. Once the upgrade is
complete since MCD will remove the taint, none of the nodes will
have `UpdateInProgress: PreferNoSchedule` taint. This ensures
the score of the nodes will be equal again.

Why is this needed?
This reduces the pod churn when the cluster upgrade is in progress.
When the non-upgraded nodes in the cluster have `UpdateInProgress:
PreferNoSchedule` taint, they would get lesser score and the pods
would prefer to land onto untainted(upgraded) nodes there by
reducing the chances of landing onto an unupgraded node which
can cause one more reschedule
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-review Categorizes an issue or PR as actively needing an API review. approved Indicates a PR has been approved by an approver from all required OWNERS files. area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
Status: API review completed, 1.23
Development

Successfully merging this pull request may close these issues.

None yet

8 participants