
🌱 Allow inplace update of fields related to deletion during Machine deletion #10589

Open
wants to merge 1 commit into base: main

Conversation

davidvossel

Fixes #10588
/area machine

What this PR does / why we need it:

Machines default to nodeDrainTimeout: 0s, which blocks indefinitely if a pod can't be evicted. We can't change the nodeDrainTimeout in place from the MachineDeployment or MachineSet after a machine is marked for deletion.

This results in a machine that is wedged forever but can't be updated using the top-level objects that own the machine.

To fix this, this PR allows fields related to machine deletion to be updated in place even when the machine is marked for deletion.
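
A minimal sketch of the idea (the helper name is hypothetical; the field names follow the v1beta1 Machine API, and this is not the actual diff):

```go
package machineset

import (
	clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"
)

// updateDeletionTimeouts copies only the fields that control how deletion
// proceeds. Because these fields affect nothing beyond teardown, applying
// them to a Machine that already has a deletionTimestamp is safe, and it
// lets a stuck drain be unblocked from the owning MachineSet or
// MachineDeployment.
func updateDeletionTimeouts(desired, current *clusterv1.Machine) {
	current.Spec.NodeDrainTimeout = desired.Spec.NodeDrainTimeout
	current.Spec.NodeDeletionTimeout = desired.Spec.NodeDeletionTimeout
	current.Spec.NodeVolumeDetachTimeout = desired.Spec.NodeVolumeDetachTimeout
}
```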

NOTE: I have not added unit tests for this PR yet. I want confirmation that this is an acceptable approach before investing time into testing.

@k8s-ci-robot k8s-ci-robot added area/machine Issues or PRs related to machine lifecycle management cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels May 10, 2024
@enxebre
Member

enxebre commented May 13, 2024

Thanks @davidvossel, the change makes sense to me. Smoother deletion is actually one of the supporting use cases for in-place propagation. Let's include some unit tests.

See related #5880 and #9285

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels May 13, 2024
@davidvossel
Author

> Let's include some unit tests.

@enxebre I extended the existing unit test to cover the case of updating a deleting machine.

@sbueringer sbueringer changed the title Allow inplace update of fields related to deletion during Machine deletion 🌱 Allow inplace update of fields related to deletion during Machine deletion May 14, 2024
@enxebre
Member

enxebre commented May 20, 2024

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 20, 2024
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 44936fae936d0eab3c39b86c432c24c5e199979d

```diff
@@ -362,8 +362,21 @@ func (r *Reconciler) syncMachines(ctx context.Context, machineSet *clusterv1.Mac
 log := ctrl.LoggerFrom(ctx)
```
Member


Just trying to think through various cases where Machines belonging to MachineSets are deleted

  1. MD is deleted

The following happens:

  • MD goes away
  • ownerRef triggers MS deletion
  • MS goes away
  • ownerRef triggers Machine deletion

=> The current PR doesn't help in this scenario, because the MS will already be gone when the deletionTimestamp is set on the Machines. In this case folks would have to modify the timeouts on each Machine individually.

I recently had a discussion with @vincepri about maybe changing our MD deletion flow: basically adding a finalizer on MD & MS, so that MD & MS stick around until all Machines are gone. If we did this, the MS => Machine propagation of the timeouts implemented here would help in this case as well.

  2. MD is scaled down to 0

The following happens:

  • MD scales down MS to 0
  • MS deletes Machine

=> This PR helps in this case because the timeouts are then propagated from MS to Machine

  3. MD rollout

The following happens:

  • Someone updates the MD (e.g. bump the Kubernetes version)
  • MD creates a new MS and scales it up
  • In parallel MD scales down the old MS to 0

=> In this scenario the current PR won't help, because the MD controller does not propagate the timeouts from MD to all MS (only to the new/current one, not to the old ones)

I see how this PR addresses scenario 2. I'm wondering if we want to solve this problem more holistically. (Maybe I also missed some cases, not sure.)
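
To make scenario 3 concrete, here is a rough sketch of the gap, not the actual controller code (the function and flow are hypothetical; the field paths follow the v1beta1 API):

```go
package machinedeployment

import (
	clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"
)

// syncRollout illustrates why a rollout isn't covered: in-place fields such
// as nodeDrainTimeout are propagated only to the new/current MachineSet,
// while the old MachineSets are merely scaled down, so their deleting
// Machines keep whatever timeouts they were created with.
func syncRollout(md *clusterv1.MachineDeployment, newMS *clusterv1.MachineSet, oldMSs []*clusterv1.MachineSet) {
	// In-place propagation reaches the current MachineSet only.
	newMS.Spec.Template.Spec.NodeDrainTimeout = md.Spec.Template.Spec.NodeDrainTimeout

	for _, oldMS := range oldMSs {
		// Old MachineSets are scaled toward zero without re-syncing the
		// template, so the MS => Machine propagation added in this PR
		// never sees an updated timeout to apply.
		zero := int32(0)
		oldMS.Spec.Replicas = &zero
	}
}
```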

Author


Here's what's going on... the use case is subtle, but an easy one to get trapped by.

  • A MS is created with the default node drain timeout of 0s (wait forever).
  • The MS needs to scale down to zero (but not be deleted). The intent is to bring this MS back online at some point.
  • The user discovers that the default node drain timeout is blocking the scale-down to zero. The user likely only encounters this drain block the first time they scale down to zero, because during normal scale-down operations other nodes are typically available, which allows PDBs to be satisfied.

The outcome is that the user is now trapped. They can't gracefully scale the MS down to zero because the default node drain timeout can't be updated on the machines. So the user is either forced to take some manual action to tear down the machines or delete the MS.

By allowing the node drain timeout to be modified while the machines are marked for deletion, we give the user a path to unblock themselves using the top-level API (either MS or MD) rather than mutating individual machines or performing some other manual operation.
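
For illustration, a minimal sketch of that unblocking path using a controller-runtime client (the helper name and the 5-minute timeout are assumptions, not part of this PR):

```go
package example

import (
	"context"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// unblockScaleDown sets a finite nodeDrainTimeout on the MachineSet's
// template. With this PR, the MachineSet controller propagates the new
// timeout in place to Machines that are already marked for deletion,
// letting a stuck drain time out instead of blocking forever.
func unblockScaleDown(ctx context.Context, c client.Client, key client.ObjectKey) error {
	ms := &clusterv1.MachineSet{}
	if err := c.Get(ctx, key, ms); err != nil {
		return err
	}
	patch := client.MergeFrom(ms.DeepCopy())
	ms.Spec.Template.Spec.NodeDrainTimeout = &metav1.Duration{Duration: 5 * time.Minute}
	return c.Patch(ctx, ms, patch)
}
```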

Member


Yup, got it, and it makes sense. I was just saying that there are various cases with MD+MS where a MS is scaled down to zero, and the implementation only covers one of them. But it's fine for me to address the others in separate PRs. It would probably be good to open an issue so we can track that (I can do that).

internal/controllers/machineset/machineset_controller.go (outdated review thread, resolved)
…ine deletion

Signed-off-by: David Vossel <davidvossel@gmail.com>
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 30, 2024
@k8s-ci-robot k8s-ci-robot requested a review from enxebre May 30, 2024 14:45
@enxebre
Member

enxebre commented May 31, 2024

/lgtm
/assign @sbueringer

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 31, 2024
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: a7820897401d291e168e73cfc2ea745d5f2c8d87

@sbueringer
Member

All good from my side. I would open a follow-up issue once this PR is merged to track further work to get this behavior across all MD workflows (e.g. MD rollout, deletion).

/approve

/hold
In case someone else wants to take a look (@fabriziopandini @chrischdi @vincepri)

Otherwise let's merge in a few days

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 3, 2024
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sbueringer

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 3, 2024
Development

Successfully merging this pull request may close these issues.

MachineSet Inplace update does not work during machine deletion