Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sync PVC's Status.Phase #351

Merged
merged 1 commit into from
Aug 18, 2023
Merged

Conversation

yuanchen8911
Copy link
Member

@yuanchen8911 yuanchen8911 commented Aug 16, 2023

What this PR does / why we need it:

The current syncer doesn't sync PVC.Status.Phase in a tenant cluster. We have observed that a super cluster's PV and PVC are not Bound while the the tenant cluster's PV and PVC are Bound, which might be caused by bug/race condition in the underlying controllers. It should never happen.

It can be reproduced by deleting a bound PVC in a tenant cluster, recreating the PVC and bound it again. The resulting PV and PVC in the tenant cluster will show Bound and the PV in the super cluster is Released and the PVC is Pending.

The PR updates a tenant cluster's PVC Status.Phase with the super cluster's Status.Phase if the tenant cluster PVC is Bound and the super cluster PVC is not Bound, e.g., Pending, Lost. The other conditions are still handled and reconciled by the PV controller.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Aug 16, 2023
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Aug 16, 2023
@Fei-Guo
Copy link

Fei-Guo commented Aug 16, 2023

"the PV in the super cluster is Released", what does release mean?

Quoted from k8s doc:
"When a user is done with their volume, they can delete the PVC objects from the API that allows reclamation of the resource. The reclaim policy for a PersistentVolume tells the cluster what to do with the volume after it has been released of its claim. Currently, volumes can either be Retained, Recycled, or Deleted"

@Fei-Guo
Copy link

Fei-Guo commented Aug 16, 2023

I'd rather double check how PV status is changed during your experiment. Note that tenant pv controller handles all pvc/pv state changes. We should be really careful about asking syncer to join the interaction.

@yuanchen8911
Copy link
Member Author

/retest

@yuanchen8911
Copy link
Member Author

yuanchen8911 commented Aug 16, 2023

"the PV in the super cluster is Released", what does release mean?

Quoted from k8s doc: "When a user is done with their volume, they can delete the PVC objects from the API that allows reclamation of the resource. The reclaim policy for a PersistentVolume tells the cluster what to do with the volume after it has been released of its claim. Currently, volumes can either be Retained, Recycled, or Deleted"

Released means a PV that was bound to a PVC is no longer bound to a PVC and the Reclaim policy is Retain. In this case, the PVC's claimRef still has the PVC info. A use case is PersistentVolume on local storage. This is just an example. The point is the super cluster and the tenant clusters PV and PVC's Status can be different for some reason.

In our real case, a super cluster's PVC Status.Phase was Lost, but the tenant cluster's PVC status was still Bound because the tenant cluster's PV claimRef was not empty and the PV controller in the tenant cluster solely relied on the PV's claimRef to reconcile PVC's status, which is not sufficient, in my opinion.

If everything works perfectly, we should never see vPVC is bound while pPVC is not bound. Unfortunately, we did see such cases and was able to reproduce it too. If the case never happens again, the change will be a no-op. So it should be a safe change.

A question I do have is whether we should reconcile a PV's Status.Phase as well.

@christopherhein ?

@yuanchen8911
Copy link
Member Author

I'd rather double check how PV status is changed during your experiment. Note that tenant pv controller handles all pvc/pv state changes. We should be really careful about asking syncer to join the interaction.

Please see my previous comment #351 (comment). The PV controller, which sets PVC Status.Phase and VolumeName based on the PV's claimRef field is insufficient in a virtual cluster setup and can result in inconsistent status between the tenant cluster and super cluster.

@yuanchen8911
Copy link
Member Author

/retest-required

@yuanchen8911
Copy link
Member Author

No idea why pull-cluster-api-provider-nested-test failed with many unrelated errors.

@Fei-Guo
Copy link

Fei-Guo commented Aug 16, 2023

I was thinking the claimRef which contains a UUID should be different from the newly created PVC. I never expected PV phase can change from "Released" to "Bound".

@Fei-Guo
Copy link

Fei-Guo commented Aug 16, 2023

What is the k8s version of your tenant controlplane? btw.

@yuanchen8911
Copy link
Member Author

yuanchen8911 commented Aug 16, 2023

I was thinking the claimRef which contains a UUID should be different from the newly created PVC. I never expected PV phase can change from "Released" to "Bound".

The following is a real case. We have not figured out why the super cluster PVC was marked as Lost while the PV is Bound with the correct claimRef.

Tenant cluster: vPV (with claimRef): Bound,  vPVC (with volumeName): Bound
Super cluster: pPV (with claimRef): Bound,   pPVC (without volumeName): Lost  

The Released case can be reproduced by deleting a bound PVC, recreating the PVC and removing the PV's claimRef and adding volumeName to the new created PVC. The tenant's PV Status.Phase will be Bound->Released -> Available -> Bound. The PVC is Bound->gone->Pending->Bound. The super cluster's PV will remain Released and the PVC is Lost.

Tenant PV

NAME        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS          CLAIM                  STORAGECLASS    
local-pv    128Gi                          RWO            Retain                    Bound             default/pv-test      local-storage   

NAME        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS          CLAIM                  STORAGECLASS    
local-pv    128Gi                          RWO            Retain                     Released       default/pvc-test   local-storage          

NAME        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS          CLAIM                  STORAGECLASS    
local-pv    128Gi                          RWO            Retain                    Available                                         local-storage   
    
NAME        CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS          CLAIM                  STORAGECLASS    
local-pv    128Gi                          RWO            Retain                    Bound             default/pv-test      local-storage   

Tenant PVC

NAME       STATUS   VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS    
pvc-test    Bound                                                                            local-storage

NAME       STATUS   VOLUME   CAPACITY   ACCESS MODES   STORAGECLASS  
pvc-test    Pending                                                                             local-storage

NAME       STATUS   VOLUME              CAPACITY   ACCESS MODES   STORAGECLASS    
pvc-test   Bound    local-pv                   128Gi                     RWO            local-storage  

@Fei-Guo
Copy link

Fei-Guo commented Aug 16, 2023

Thanks for the clarification. This makes much more sense. I didn't think about this workflow while I was validating whether pvc/pv works before. I think "removing the PV's claimRef" has not been populated down to pPV if I recall. We only have uws for pv, no dws for pv. If we have populated removing claimRef down to pPv, do you think the problem will be gone?

@yuanchen8911
Copy link
Member Author

yuanchen8911 commented Aug 17, 2023

Thanks for the clarification. This makes much more sense. I didn't think about this workflow while I was validating whether pvc/pv works before. I think "removing the PV's claimRef" has not been populated down to pPV if I recall. We only have uws for pv, no dws for pv. If we have populated removing claimRef down to pPv, do you think the problem will be gone?

Fei, in both cases above, the PVs in the super cluster had claimRef, it's their volumeName in the PVCs that were missing. So a DWS claimRef may not help. Unless we populate the PVC's volumeName downward, which is something we may want to avoid.

The PR does not intend to automatically resolve the issue, as this would necessitate substantial effort. Instead, the objective is to address situations where statuses are inconsistent, particularly when a tenant cluster shows a normal condition (Bound), while the underlying super cluster does not. In such cases, we will adjust the tenant cluster's status to 'not Bound'. This will allow users/administrators to investigate and fix the issue offline. Under no circumstances should there be a scenario where the tenant cluster's PVC is bound while the super cluster is not. This represents the primary goal of the PR. I hope it makes sense.

it's a great discussion. Thanks!

@Fei-Guo
Copy link

Fei-Guo commented Aug 17, 2023

"The PR does not intend to automatically resolve the issue," I agree this is a subtle problem and you are trying to provide mitigation. In this case, would you please add a feature gate at least for now so this would not cause any side effect for other users?

@yuanchen8911
Copy link
Member Author

yuanchen8911 commented Aug 17, 2023

"The PR does not intend to automatically resolve the issue," I agree this is a subtle problem and you are trying to provide mitigation. In this case, would you please add a feature gate at least for now so this would not cause any side effect for other users?

Good idea! I've added a feature gate SyncTenantPVCStatusPhase .

    // SyncTenantPVCStatusPhase is an experimental feature that enables the syncer to
    // update the Status.Phase of a tenant cluster's PVC if it is Bound,
    // but the corresponding PVC in the super cluster is not Bound, e.g., Lost.
    // Although rare, this situation can arise due to potential bugs and race conditions.
    // This feature allows users to perform separate investigation and resolution.
    SyncTenantPVCStatusPhase = "SyncTenantPVCStatusPhase"

@Fei-Guo
Copy link

Fei-Guo commented Aug 17, 2023

Can you please fix the lint error (may not due to your change though) and the test failure?

@yuanchen8911 yuanchen8911 force-pushed the pcv-sync branch 3 times, most recently from 2b295a4 to 7ddde5a Compare August 17, 2023 18:19
@yuanchen8911
Copy link
Member Author

Can you please fix the lint error (may not due to your change though) and the test failure?

Fixed it.

@Fei-Guo
Copy link

Fei-Guo commented Aug 17, 2023

The latest commit does not seem to be the correct one (no feature gate). Can you double check?

@yuanchen8911
Copy link
Member Author

The latest commit does not seem to be the correct one (no feature gate). Can you double check?

Thanks for catching it. My bad. I checked in a wrong version.

@yuanchen8911
Copy link
Member Author

yuanchen8911 commented Aug 17, 2023

/retest-required

@yuanchen8911
Copy link
Member Author

/retest

@yuanchen8911
Copy link
Member Author

Some strange errors. I'm looking into it.

Fix issues in nameing and format

Fix klog import issue

Update go.mod and go.sum

Revert "Update go.mod and go.sum"

This reverts commit fd7604abcf3cb81e3bf75fbc6ca166e1bc9bc34a.

Fix go.mod and go.sum

Add featuregate

Fix lint error

Fix lint errors
@yuanchen8911
Copy link
Member Author

@Fei-Guo , @christopherhein , all tests passed. PTAL. Thanks!

@Fei-Guo
Copy link

Fei-Guo commented Aug 18, 2023

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 18, 2023
@christopherhein
Copy link
Member

/lgtm
/approve

🎉🎉

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: christopherhein, yuanchen8911

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 18, 2023
@k8s-ci-robot k8s-ci-robot merged commit 7a2b99e into kubernetes-sigs:main Aug 18, 2023
7 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants