
drpolicy controller define #200

Merged
merged 1 commit into RamenDR:main from the drpolicy_controller branch on Aug 20, 2021

Conversation

hatfieldbrian (Collaborator):

- vrg and pv cluster roles manifestworks unify
- drpolicy create/update:
	- finalizer add if absent
	- cluster roles manifestwork create, if absent, for each cluster in drpolicy
- drpolicy delete:
	- cluster roles manifestwork delete for each cluster not in any other drpolicy
	- finalizer remove if present
- ginkgo version increase from 1.16.1 to 1.16.4
- gomega version increase from 1.11.0 to 1.15.0

Signed-off-by: bhatfiel <bhatfiel@redhat.com>

case true:
	log.Info("create/update")

	controllerutil.AddFinalizer(drpolicy, finalizerName)
Member:

It is better to check if the finalizer exists.

if !controllerutil.ContainsFinalizer(drpolicy, finalizerName) {
	controllerutil.AddFinalizer(drpolicy, finalizerName)

	if err := r.Update(ctx, drpolicy); err != nil {
		return ctrl.Result{}, fmt.Errorf("finalizer add update: %w", err)
	}
}

hatfieldbrian (Collaborator Author):

AddFinalizer checks if it exists before adding it

Member:

I understand, but I am talking about the whole block. Your reconciler can be called several times, and it will try to update and createRoles as many times as it is called, for as long as the object is not being deleted.

hatfieldbrian (Collaborator Author):

I think I understand: an optimization to skip the Client.Update() if the finalizer is already present.

hatfieldbrian (Collaborator Author):

Oh, actually more than that, an infinite loop!

hatfieldbrian (Collaborator Author):

An optimization to update only if the finalizer was actually added has been added. Role creation seems optimal already.
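A minimal sketch of the guarded update being described here, assuming the reconciler's client is `r`, the request context is `ctx`, and `log` is the reconcile logger (the error wording mirrors the snippet above):

```go
finalizerCount := len(drpolicy.ObjectMeta.Finalizers)
controllerutil.AddFinalizer(drpolicy, finalizerName)

// Issue the API update only when AddFinalizer actually changed the
// finalizer list, so reconciling an already-finalized DRPolicy does not
// keep triggering further reconciles.
if len(drpolicy.ObjectMeta.Finalizers) != finalizerCount {
	log.Info("finalizer add")

	if err := r.Update(ctx, drpolicy); err != nil {
		return ctrl.Result{}, fmt.Errorf("finalizer add update: %w", err)
	}
}
```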

return ctrl.Result{}, fmt.Errorf("cluster roles delete: %w", err)
}

controllerutil.RemoveFinalizer(drpolicy, finalizerName)
Member:

same here

hatfieldbrian (Collaborator Author):

RemoveFinalizer checks if it exists before removing it

Member:

Same as above

hatfieldbrian (Collaborator Author):

An optimization to update only if the finalizer was actually removed has been added.

var clusterRolesMutex sync.Mutex

func (mwu *MWUtil) ClusterRolesCreate(drpolicy *rmn.DRPolicy) error {
	clusterRolesMutex.Lock()
Member:

Why do you need a lock? ClusterRoles are created from the reconciler, and I believe that is thread safe.

hatfieldbrian (Collaborator Author):

It is my understanding that a hub can have multiple DRPolicies each with at most one reconciler running at a time. The lock serializes concurrent creates, deletes, or creates and deletes.

Member:

Right, but I am not sure what the lock is protecting. I have to think about it. Ignore my comment for now.

hatfieldbrian (Collaborator Author):

Suppose drpolicyA specifies cluster1 and is being deleted, while drpolicyB, also specifying cluster1, is being created. drpolicyA's reconciler reads all drpolicies and finds no other drpolicies include cluster1 because drpolicyB has not yet been created, so it determines cluster1's roles can be deleted. But before it does so, drpolicyB is created and its reconciler determines cluster1's roles need not be created because they already exist; then drpolicyA's reconciler deletes the roles that drpolicyB expects to exist. The mutex makes the read-modify-write operation atomic.
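A sketch of how the lock guards that read-modify-write in the delete path; the `clusterNames` helper, `mwu.deleteManifestWork`, and the DRPolicyList parameter are illustrative stand-ins, not the PR's exact code:

```go
var clusterRolesMutex sync.Mutex

func (mwu *MWUtil) ClusterRolesDelete(drpolicy *rmn.DRPolicy, drpolicies *rmn.DRPolicyList) error {
	clusterRolesMutex.Lock()
	defer clusterRolesMutex.Unlock()

	// Read: collect the clusters still referenced by any other DRPolicy.
	clustersInUse := sets.String{}

	for i := range drpolicies.Items {
		otherDRPolicy := &drpolicies.Items[i]
		if otherDRPolicy.ObjectMeta.Name != drpolicy.ObjectMeta.Name {
			clustersInUse.Insert(clusterNames(otherDRPolicy)...)
		}
	}

	// Modify/write: delete roles only for clusters no longer referenced.
	// Without the lock, another DRPolicy's reconciler could create, or skip
	// creating, a roles manifestwork between the read above and these deletes.
	for _, clusterName := range clusterNames(drpolicy) {
		if !clustersInUse.Has(clusterName) {
			if err := mwu.deleteManifestWork(ClusterRolesManifestWorkName, clusterName); err != nil {
				return err
			}
		}
	}

	return nil
}
```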

for i := range manifestworks.Items {
	manifestwork := &manifestworks.Items[i]
	if manifestwork.ObjectMeta.Name == ClusterRolesManifestWorkName {
		*clusterNames = clusterNames.Insert(manifestwork.ObjectMeta.Namespace)
Member:

What the heck is this doing? I can't wrap my head around it. It is inserting a cluster name (which is the MW namespace) into a map of strings, but the return value of Insert is a String which is assigned to *clusterNames???

hatfieldbrian (Collaborator Author):

It outputs a set of namespace names, i.e. cluster names, that have a manifest work named "ramendr-roles". It uses Kubernetes' sets.String, which is a set of strings implemented as a map with string keys and empty values.
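A small standalone illustration of those semantics, assuming k8s.io/apimachinery's sets package: Insert mutates the receiver and also returns it, which is why assigning the result back through the pointer works.

```go
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/util/sets"
)

func main() {
	clusterNames := &sets.String{}

	// Insert returns the (mutated) set, mirroring the loop above.
	*clusterNames = clusterNames.Insert("cluster1")
	*clusterNames = clusterNames.Insert("cluster2")
	*clusterNames = clusterNames.Insert("cluster1") // duplicate; no effect

	fmt.Println(clusterNames.List())          // [cluster1 cluster2], sorted
	fmt.Println(clusterNames.Has("cluster2")) // true
}
```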

Member:

Okay. So I was looking at this Insert. So you are not using that one then??? I can double-check, I am just being lazy.
https://github.com/kubernetes/apimachinery/blob/v0.22.0/pkg/util/sets/string.go#L49

@BenamarMk (Member), Aug 12, 2021:

Ah, got it. It was the name String in the return value that threw me off. I thought it was a simple string, but it is typedefed to a map.

hatfieldbrian (Collaborator Author):

This routine is currently only used by a unit test to confirm that only the expected cluster roles manifestworks exist.

controllers/util/mw_util.go (outdated; resolved)
@hatfieldbrian force-pushed the drpolicy_controller branch 2 times, most recently from 79498b0 to 208827d on August 13, 2021 07:00
@@ -595,7 +597,7 @@ func verifyVRGManifestWorkCreatedAsPrimary(managedCluster string) {
return err == nil
}, timeout, interval).Should(BeTrue())

-Expect(len(createdVRGRolesManifest.Spec.Workload.Manifests)).To(Equal(2))
+Expect(len(createdVRGRolesManifest.Spec.Workload.Manifests)).To(BeNumerically(">=", 2))
Member:

Why isn't the "ramendr-vrg-roles" count exact? It should be 2 per managed cluster, right?

hatfieldbrian (Collaborator Author):

The vrg and pvc cluster roles and role bindings are now in the same manifestwork (to simplify management since they have the same lifetime) for a total of 4 manifests.

@BenamarMk (Member), Aug 14, 2021:

So, test against 4 then. IOW, if the len of Manifests is 6, is that OK? The expectation would still succeed.

hatfieldbrian (Collaborator Author):

I think more manifests are ok. This particular test applies only the first two manifests and then verifies a vrg can be created.

type DRPolicyReconciler struct {
	client.Client
	Scheme *runtime.Scheme
	// Log logr.Logger
Member:

Remove this

hatfieldbrian (Collaborator Author):

Removed!

finalizerCount := len(drpolicy.ObjectMeta.Finalizers)
controllerutil.AddFinalizer(drpolicy, finalizerName)

if len(drpolicy.ObjectMeta.Finalizers) != finalizerCount {
Member:

(This is kind of a nit.) Can we stick to one pattern when adding and checking finalizers? The following is roughly the pattern that the DRPC and the VRG are using:

if !controllerutil.ContainsFinalizer(drpolicy, finalizerName) {
    controllerutil.AddFinalizer(drpolicy, finalizerName)
    client.Update(ctx, drpolicy)
    ...
}

I am not saying this is the same pattern, but I am suggesting sticking to one.

hatfieldbrian (Collaborator Author):

I agree. I chose this method because len() of a slice seems to be a constant-time operation, whereas AddFinalizer, RemoveFinalizer, and ContainsFinalizer are each linear. It'd be nice if Add and Remove would return a boolean.

Member:

I don't think that performance is an issue here.

Member:

I am going to suggest we stick to the ContainsFinalizer pattern, not because this is incorrect; it is just deviant from the pattern, and any fixes/updates to controllerutil should retain the interface if we follow the pattern. (And, as stated, this is a nit.)

hatfieldbrian (Collaborator Author):

I agree. I chose this method because len() of a slice seems to be a constant-time operation, whereas AddFinalizer, RemoveFinalizer, and ContainsFinalizer are each linear. It'd be nice if Add and Remove would return a boolean.

Pull request to address this: kubernetes-sigs/controller-runtime#1636
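If that change is accepted, the guard could collapse to a sketch like this (hypothetical until the linked controller-runtime PR is merged and vendored here):

```go
// AddFinalizer reporting whether it modified the object would make the
// length-compare bookkeeping above unnecessary.
if controllerutil.AddFinalizer(drpolicy, finalizerName) {
	if err := r.Update(ctx, drpolicy); err != nil {
		return ctrl.Result{}, fmt.Errorf("finalizer add update: %w", err)
	}
}
```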

@@ -74,6 +74,15 @@ type DRPolicy struct {
Status *DRPolicyStatus `json:"status,omitempty"`
}

func (drpolicy *DRPolicy) ClusterNames() []string {
Member:

Move this out of the *types.go file

hatfieldbrian (Collaborator Author):

This seems, to me, like a logical place for it. Is it a matter of not including it in an external interface? Users may also find it useful and it seems trivial to support.

Member:

Correct. Put that function in the implementation file.

hatfieldbrian (Collaborator Author):

It has been moved to ramen's private util package.

@hatfieldbrian force-pushed the drpolicy_controller branch 3 times, most recently from 3408cea to a935e72 on August 14, 2021 22:01

return clusterNames
}

Member:

I added two similar DRPolicy methods, which raised the following best-practice-related questions:

  1. Benamar: Should the *types.go file contain implementation other than the Init function?
  2. Shyam felt that DRPolicy methods are in the deepCopy file and did not want any repercussions due to more methods getting added to the API itself.

Given the above concerns, I changed them from methods to helper functions and placed them in util/drpolicy_util.go.
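A sketch of what such a helper might look like in util/drpolicy_util.go; the spec field name and import paths are illustrative rather than the repo's exact ones:

```go
package util

import (
	"k8s.io/apimachinery/pkg/util/sets"

	rmn "github.com/ramendr/ramen/api/v1alpha1"
)

// DRPolicyClusterNames returns the sorted, de-duplicated cluster names
// referenced by a DRPolicy.
func DRPolicyClusterNames(drpolicy *rmn.DRPolicy) []string {
	clusterNames := sets.String{}

	for _, managedCluster := range drpolicy.Spec.DRClusterSet {
		clusterNames.Insert(managedCluster.Name)
	}

	return clusterNames.List()
}
```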

@@ -595,7 +597,7 @@ func verifyVRGManifestWorkCreatedAsPrimary(managedCluster string) {
return err == nil
}, timeout, interval).Should(BeTrue())

-Expect(len(createdVRGRolesManifest.Spec.Workload.Manifests)).To(Equal(2))
+Expect(len(createdVRGRolesManifest.Spec.Workload.Manifests)).To(BeNumerically(">=", 2))
Member:

It is not clear why the ">=" is required instead of "==". It may help to comment the expectation.

hatfieldbrian (Collaborator Author):

Benamar raised this issue too: #200 (comment). The code immediately following this only references the first two manifests. Perhaps the pv roles manifests ought to be tested like the vrg ones are.

@@ -760,13 +762,13 @@ var _ = Describe("DRPlacementControl Reconciler", func() {
verifyUserPlacementRuleDecision(userPlacementRule.Name, userPlacementRule.Namespace, WestManagedCluster)
verifyDRPCStatusPreferredClusterExpectation(rmn.FailedOver)
verifyVRGManifestWorkCreatedAsPrimary(WestManagedCluster)
-Expect(getManifestWorkCount(WestManagedCluster)).Should(Equal(4)) // MWs for VRG+ROLES+PVs
+Expect(getManifestWorkCount(WestManagedCluster)).Should(Equal(3)) // MWs for VRG+ROLES+PVs
Member:

(nit) This comment is not related to this PR and not addressed to @hatfieldbrian. When we get a chance in the future, I suggest that instead of using hard-coded values, we use a variable that tracks how many manifest works are expected. The variable could be incremented when a new MW is created and decremented when an MW is deleted.
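A sketch of that suggestion, layered over the suite's existing getManifestWorkCount; the helper names are illustrative:

```go
// Per-cluster expected manifest work counts, adjusted by the helpers below
// instead of hard-coding totals in each expectation.
var expectedManifestWorkCount = map[string]int{}

func expectManifestWorkCreated(managedCluster string) {
	expectedManifestWorkCount[managedCluster]++
	Expect(getManifestWorkCount(managedCluster)).Should(Equal(expectedManifestWorkCount[managedCluster]))
}

func expectManifestWorkDeleted(managedCluster string) {
	expectedManifestWorkCount[managedCluster]--
	Expect(getManifestWorkCount(managedCluster)).Should(Equal(expectedManifestWorkCount[managedCluster]))
}
```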

It("should create a cluster roles manifest work for each cluster added", func() {
})
It("should delete a cluster roles manifest work for each cluster removed", func() {
})
Member:

Are these two Its to be implemented later? If yes, a TODO comment may help.

hatfieldbrian (Collaborator Author):

TODO added

Expect((&ramencontrollers.DRPolicyReconciler{
	Client: k8sManager.GetClient(),
	Scheme: k8sManager.GetScheme(),
	// Log: ctrl.Log.WithName("controllers").WithName("DRPolicy"),
Member:

Why comment out the log "DRPolicy" qualifier? Won't the qualifier be useful when the log is output during test failures?

hatfieldbrian (Collaborator Author):

These log qualifiers are added in DRPolicyReconciler.Reconcile() instead, which is common to the main.go and _test.go instances.
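A sketch of attaching the qualifiers inside Reconcile itself, so the manager in main.go and the envtest suite share the same logger naming (the exact names and values are illustrative):

```go
func (r *DRPolicyReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	log := ctrl.Log.WithName("controllers").WithName("DRPolicy").WithValues("name", req.NamespacedName.Name)
	log.Info("reconcile enter")

	defer log.Info("reconcile exit")

	// ... fetch the DRPolicy, add or remove the finalizer, and manage the
	// cluster roles manifestworks as described in this PR ...

	return ctrl.Result{}, nil
}
```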

@hatfieldbrian force-pushed the drpolicy_controller branch 2 times, most recently from e4b57a2 to f5c91be on August 17, 2021 08:41
@hatfieldbrian (Collaborator Author):

A bug in the drpolicy delete unit test was fixed, and it now passes.


if err := manifestWorkUtil.ClusterRolesCreate(drpolicy); err != nil {
	return ctrl.Result{}, fmt.Errorf("cluster roles create: %w", err)
}
Contributor:

As mentioned here (https://github.com/RamenDR/ramen/blob/main/controllers/drplacementcontrol_controller.go#L332-L335), it may be better to incorporate the task of validating the schedule in DRPolicy instead of DRPC.

hatfieldbrian (Collaborator Author):

Thanks Raghu. #201 tracks this issue.

@ShyamsundarR (Member) left a comment:

@hatfieldbrian rebase to main, and any comments that you can tackle are good to get this merged.

config/samples/kustomization.yaml (outdated; resolved)
config/rbac/role.yaml (resolved)
finalizerCount := len(drpolicy.ObjectMeta.Finalizers)
controllerutil.AddFinalizer(drpolicy, finalizerName)

if len(drpolicy.ObjectMeta.Finalizers) != finalizerCount {
Member:

I am going to suggest we stick to the ContainsFinalizer pattern, not because this is incorrect; it is just deviant from the pattern, and any fixes/updates to controllerutil should retain the interface if we follow the pattern. (And, as stated, this is a nit.)

controllers/util/mw_util.go (outdated; resolved)
@hatfieldbrian force-pushed the drpolicy_controller branch 4 times, most recently from 46b229f to 8267b48 on August 20, 2021 10:30
	- drpolicy create/update:
		- finalizer add if absent
		- cluster roles manifestwork create, if absent, for each cluster in drpolicy
	- drpolicy delete:
		- cluster roles manifestwork delete for each cluster not in any other drpolicy
		- finalizer remove if present
	- gomega version increase from 1.14.0 to 1.15.0

Signed-off-by: bhatfiel <bhatfiel@redhat.com>
@ShyamsundarR (Member) left a comment:

@hatfieldbrian looks like CI is failing due to go.sum updates

@BenamarMk or @vdeenadh, if you can take a look and add an approval, we can merge this.

@BenamarMk (Member) left a comment:

@ShyamsundarR already approved. We'll merge once the build is complete.

@BenamarMk merged commit 6867e44 into RamenDR:main on Aug 20, 2021
@hatfieldbrian deleted the drpolicy_controller branch on August 20, 2021 11:13