
feat: crd provisioner use cache #176

Closed · wants to merge 4 commits

Conversation

sunpa93 (Collaborator) commented Sep 18, 2021

What type of PR is this?

/kind feature

What this PR does / why we need it:

  • Add the cached client managed by the controller manager to the CRD provisioner so that CRD provisioner calls can serve CRI queries from the cache (a sketch follows below).
  • This helps reduce load on the API server.
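
A minimal sketch of the intended read path, assuming SharedState exposes the manager's cached client through the GetCachedClient accessor added in this PR; the getAzVolume helper, the exact field names, and the usual imports (context, k8s.io/apimachinery/pkg/types, the v1alpha1 API package) are illustrative only:

// Sketch: a cached lookup from a CRD provisioner call. Reads are served by
// the controller manager's shared informer cache rather than going to the
// API server directly.
func (c *CrdProvisioner) getAzVolume(ctx context.Context, volumeName string) (*v1alpha1.AzVolume, error) {
    var azVolume v1alpha1.AzVolume
    cachedClient := c.controllerSharedState.GetCachedClient()
    if err := cachedClient.Get(ctx, types.NamespacedName{Namespace: c.namespace, Name: volumeName}, &azVolume); err != nil {
        return nil, err
    }
    return &azVolume, nil
}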

Which issue(s) this PR fixes:

Fixes #

Requirements:

Special notes for your reviewer:

Release note:

none

@@ -297,6 +298,7 @@ func (d *DriverV2) StartControllersAndDieOnExit(ctx context.Context) {
}

sharedState := controller.NewSharedState()
sharedState.RegisterCachedClient(mgr.GetClient())

Collaborator:

Consider passing the mgr.GetClient() to the NewSharedState() function instead of through a new interface method.
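
For illustration, a one-line sketch of that alternative; it assumes NewSharedState is changed to accept the client as a constructor argument:

// Hypothetical constructor-injection variant: the cached client becomes a
// required argument, so a SharedState can never exist without one.
sharedState := controller.NewSharedState(mgr.GetClient())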

sunpa93 (Collaborator, Author):

I actually meant to insert this line once the controller manager acquires the leader lock, so that the cached client is not registered for managers that failed to acquire the lock.

Wouldn't it make sense to initialize sharedState first and register the cached client afterwards in that case?

sunpa93 (Collaborator, Author):

Never mind. As long as we register sharedState with crdProvisioner only when the leader lock is acquired, that should do.

Collaborator:

Ideally, sharedState should be fully initialized (including setting the cached client) before being passed to its dependents through their constructor functions. Post-initialization injection of objects makes it difficult to track down dependencies and makes nil-reference bugs possible, since it may not be clear at the point of use whether the field was initialized. Consumers of this field then have to deal with the possibility that it may be nil, increasing complexity.

I am equally concerned about the RegisterSharedState method on the CrdProvisioner, but I don't see a good way of handling this unless we drastically change how the driver and controllers are initialized.

if c.controllerSharedState == nil {
    return nil
}
return c.getCachedClient()

sunpa93 (Collaborator, Author), Sep 20, 2021:

return c.controllerSharedState.GetCachedClient()

namespace string
azDiskClient azDiskClientSet.Interface
namespace string
controllerSharedState *controller.SharedState

Collaborator:

Consider renaming this field to simply sharedState.

@@ -55,7 +54,7 @@ func (r *ReconcilePod) Reconcile(ctx context.Context, request reconcile.Request)
var pod corev1.Pod
klog.V(5).Infof("Reconcile pod %s.", request.Name)
podKey := getQualifiedName(request.Namespace, request.Name)
if err := r.client.Get(ctx, request.NamespacedName, &pod); err != nil {
if err := r.controllerSharedState.cachedClient.Get(ctx, request.NamespacedName, &pod); err != nil {

Collaborator:

Consider renaming controllerSharedState to simply sharedState.

edreed (Collaborator) commented Sep 20, 2021

I have overall concerns with this approach. While it is convenient and desirable to share resources (i.e. the object cache, to reduce direct API server calls), the CrdProvisioner is logically part of the driver, whereas the controllers are separate services that just happen to be hosted in the driver executable/process for convenience.

The post-Mandalay plan intends to partition the work done by the controllers (specifically for volume creation and attachment management) across multiple nodes. We may repackage the controllers into a separate image at that point. It is also possible, regardless of packaging, that the driver and controllers will no longer coexist in the same process, reducing the opportunity for shared state.

It would also be beneficial to replace our current polling in the CrdProvisioner with a watch-based approach to waiting for CRI status updates. The controller-runtime support for using watches was added only recently, requires a separate client with a separate cache, and is not really intended for use with controllers. (See kubernetes-sigs/controller-runtime#1460.) Each watch also appears to make an explicit call to the API server to set up the watch.

I am of the opinion that the CrdProvisioner should instead use our auto-generated informers to provide a cached client, much like what is done in the scheduler extender. We could then use the informers' events to build an event-driven wait mechanism for CRI status changes, in addition to leveraging the informers' caching. Informer events are based on a multiplexed watch setup, reducing overall calls to the API server. That is, we only have a single watch per object type and namespace driving the events, rather than a new watch for each CRI monitored for status updates.
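
As a rough sketch of that direction (the generated informer factory package path, the Disk()/V1alpha1() group accessors, and the azDiskClient/defaultResync/stopCh variables are assumptions here, not the actual generated code):

// Build a shared informer factory from the generated AzDisk clientset,
// scoped to the driver namespace. A single watch per object type feeds both
// the cache and any event handlers registered on it.
informerFactory := azdiskinformers.NewSharedInformerFactoryWithOptions(
    azDiskClient,
    defaultResync,
    azdiskinformers.WithNamespace("azure-disk-csi"),
)
azVolumeAttachmentInformer := informerFactory.Disk().V1alpha1().AzVolumeAttachments()

// Start the informers and wait for their caches to sync before serving requests.
informerFactory.Start(stopCh)
informerFactory.WaitForCacheSync(stopCh)

// Cached reads then go through the generated lister.
attachment, err := azVolumeAttachmentInformer.Lister().AzVolumeAttachments("azure-disk-csi").Get(attachmentName)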

@abhisheksinghbaghel @landreasyan Do you have any thoughts about this?

sunpa93 (Collaborator, Author) commented Sep 20, 2021

I am for this opinion of adding an informer and revising the current polling method to use watch events. I think we should add caching to crdProvisioner at some point to avoid excessive API server calls. And per the post-Mandalay plan you just brought up, it wouldn't make sense for the controllers and crdProvisioner to share resources.

And maybe we can set up go channels for informers' event handlers to use to signal the controllerServer calls that volume attachment / detachment / creation / deletion / update has been completed?

edreed (Collaborator) commented Sep 20, 2021

> I am for this opinion of adding an informer and revising the current polling method to use watch events. I think we should add caching to crdProvisioner at some point to avoid excessive API server calls. And per the post-Mandalay plan you just brought up, it wouldn't make sense for the controllers and crdProvisioner to share resources.

The informers provide caching and use watches to drive the cache to eventual consistency. The cached objects are accessible through lister interfaces. e.g. AzVolumeAttachmentNamespaceLister:

obj, err := azVolumeAttachmentInformer.Lister().AzVolumeAttachments("azure-disk-csi").Get(attachmentName)

> And maybe we can set up go channels for informers' event handlers to use to signal the controllerServer calls that volume attachment / detachment / creation / deletion / update has been completed?

Yes. We could use the informers' event handlers to implement an interface like:

// waitResult would wrap (runtime.Object, error), since a Go channel carries a single value.
func (w *conditionWatcher) WatchForObjectCondition(ctx context.Context, obj runtime.Object, conditionFunc func(obj runtime.Object) bool) <-chan waitResult

WatchForObjectCondition watches for the condition of the specified object to be satisfied. Through the returned channel, it delivers the updated state of the object, or an error if the wait fails due to timeout or another error condition. The caller is expected to call this function before starting the operation that will change the object's condition.

Internally, conditionWatcher would maintain a map from object type to a map from the object's namespaced name to the pending waiters' condition functions and result channels. On receiving an event for an object, conditionWatcher looks up the pending waiters, executes their condition functions, and sends the updated object to each waiter whose condition returns true. It should also respect the deadline specified in the context.Context to handle timeouts and send a timeout error on expiry.

sunpa93 (Collaborator, Author) commented Sep 20, 2021

> The informers provide caching and use watches to drive the cache to eventual consistency. The cached objects are accessible through lister interfaces. e.g. AzVolumeAttachmentNamespaceLister:

Oh, I didn't mean to say that we should add an additional caching mechanism other than the one provided by the informer. I meant that 1) the necessity of caching and 2) the inability of crdProvisioner to use the controllers' shared resources give us the logical justification to add informers to the crdProvisioner.

edreed (Collaborator) commented Sep 20, 2021

> And maybe we can set up go channels for informers' event handlers to use to signal the controllerServer calls that volume attachment / detachment / creation / deletion / update has been completed?

One additional point here is that the watch-based wait should be transparent to the controllerServer. I'm not sure it makes sense to expose the watcher result channel to it, if that's what you were getting at.

sunpa93 (Collaborator, Author) commented Sep 20, 2021

> And maybe we can set up go channels for informers' event handlers to use to signal the controllerServer calls that volume attachment / detachment / creation / deletion / update has been completed?
>
> One additional point here is that the watch-based wait should be transparent to the controllerServer. I'm not sure it makes sense to expose the watcher result channel to it, if that's what you were getting at.

I actually meant to say crdProvisioner here, sorry. I guess what I was envisioning is not too different from what we have right now. Without going too much into the details, this is along the lines of what I envisioned.

type CrdProvisioner struct {
    // ... existing fields
    conditionWatcher *conditionWatcher
}

// waitResult pairs the updated object with any error, since a Go channel can
// only carry a single value.
type waitResult struct {
    obj runtime.Object
    err error
}

type waitEntry struct {
    conditionFunc func(obj runtime.Object) (bool, error)
    waitChan      chan waitResult
}

type conditionWatcher struct {
    informerMap map[reflect.Type]cache.SharedIndexInformer // maps object type to its informer
    waitMap     sync.Map                                    // maps namespaced name to *waitEntry
}

func (w *conditionWatcher) WatchForObjectCondition(ctx context.Context, objectName string, conditionFunc func(obj runtime.Object) (bool, error)) (runtime.Object, error) {
    waitChan := make(chan waitResult, 1)
    entry := &waitEntry{
        conditionFunc: conditionFunc,
        waitChan:      waitChan,
    }
    w.waitMap.Store(objectName, entry)
    defer w.waitMap.Delete(objectName)

    select {
    case <-ctx.Done():
        errMsg := fmt.Sprintf("context deadline exceeded while waiting for %s", objectName)
        klog.Error(errMsg)
        return nil, status.Errorf(codes.DeadlineExceeded, errMsg)
    case result := <-waitChan:
        return result.obj, result.err
    }
}

And as you said, at event handling, we would check the map, evaluate the conditionFunc, and send the result on the stored channel.

And every crdProvisioner PublishVolume, UnpublishVolume, CreateVolume, DeleteVolume, and UpdateVolume call would invoke WatchForObjectCondition to wait for the desired object condition.
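
For reference, a minimal sketch of the event-handling side under the waitEntry/waitResult types above; the azVolumeInformer variable, the assumption that w is the conditionWatcher, and the handler wiring are illustrative only:

// Hypothetical update handler registered on the AzVolume informer. On each
// update it looks up a pending waiter for the object, evaluates the waiter's
// conditionFunc, and, if satisfied (or failed), delivers the result on the
// stored channel and removes the entry.
azVolumeInformer.Informer().AddEventHandler(cache.ResourceEventHandlerFuncs{
    UpdateFunc: func(oldObj, newObj interface{}) {
        obj, ok := newObj.(runtime.Object)
        if !ok {
            return
        }
        accessor, err := meta.Accessor(newObj)
        if err != nil {
            return
        }
        key := getQualifiedName(accessor.GetNamespace(), accessor.GetName())
        value, ok := w.waitMap.Load(key)
        if !ok {
            return
        }
        entry := value.(*waitEntry)
        if satisfied, cerr := entry.conditionFunc(obj); cerr != nil || satisfied {
            entry.waitChan <- waitResult{obj: obj, err: cerr}
            w.waitMap.Delete(key)
        }
    },
})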

edreed (Collaborator) commented Sep 20, 2021

> I actually meant to say crdProvisioner here, sorry. I guess what I was envisioning is not too different from what we have right now. Without going too much into the details, this is along the lines of what I envisioned.
> ...

This looks pretty good, but the fact that you're doing the wait in WatchForObjectCondition requires that it be called after the operation that will result in the change. In this case, WatchForObjectCondition would need to Get the object after adding the condition function to the map, since the object may have been updated in the interim. This is fine, but we would need to handle the case where the object doesn't yet exist and ignore a "NotFound" error. This would be reasonably straightforward to implement. If you choose this route, I would recommend renaming the method WaitForObjectCondition instead.
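
Something along those lines, as a sketch: after registering the waiter, do one read through the lister and tolerate NotFound. The azVolumeLister variable, the surrounding names, and the apierrors alias for k8s.io/apimachinery/pkg/api/errors are illustrative:

// Read the object once after adding the entry to the map so that an update
// delivered before registration is not missed. NotFound is ignored because
// the object may not have been created yet.
if current, err := azVolumeLister.AzVolumes(namespace).Get(name); err == nil {
    if satisfied, cerr := entry.conditionFunc(current); cerr != nil || satisfied {
        entry.waitChan <- waitResult{obj: current, err: cerr}
    }
} else if !apierrors.IsNotFound(err) {
    return nil, err
}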

I had originally thought to have the caller wait on the result channel directly, but this means it also has to deal with context cancelation/timeout separately. I like how your solution handles both - it makes it easier to use. One enhancement I can think of to my original proposal would be to have WatchForObjectCondition instead return a conditionWaiter which could wrap waiting on both the result channel and context, e.g.

type conditionWaiter struct {
    objectType string
    objectName string
    watcher    *conditionWatcher
    resultChan chan waitResult
}

func (w *conditionWaiter) Wait(ctx context.Context) (runtime.Object, error) {
    select {
    case <-ctx.Done():
        errMsg := fmt.Sprintf("context deadline exceeded for %s (%s)", w.objectType, w.objectName)
        klog.Error(errMsg)
        return nil, status.Errorf(codes.DeadlineExceeded, errMsg)
    case result := <-w.resultChan:
        return result.obj, result.err
    }
}

func (w *conditionWaiter) Close() {
    w.watcher.waitMap.Delete(w.objectName)
}

func (w *conditionWatcher) WatchForObjectCondition(obj runtime.Object, conditionFunc func(obj runtime.Object) bool) (*conditionWaiter, error)

Here's an example of usage:

func (p *CrdProvisioner) CreateVolume(ctx context.Context, ...) (*v1alpha1.AzVolumeStatusParams, error) {
    newAzVolume := &v1alpha1.AzVolume{ObjectMeta: metav1.ObjectMeta{Name: newVolumeName, Namespace: ns}, ...}

    waiter, err := p.conditionWatcher.WatchForObjectCondition(newAzVolume, func(obj runtime.Object) bool {
        azVol := obj.(*v1alpha1.AzVolume)
        return azVol.Status.Detail != nil
    })
    if err != nil {
        return nil, err
    }
    defer waiter.Close()

    // Create AzVolume CRI here
    ...

    obj, err := waiter.Wait(ctx)
    if err != nil {
        return nil, err
    }
    newAzVolume = obj.(*v1alpha1.AzVolume)

    // Handle AzError returns here
    ...

    return newAzVolume.Status.Detail.ResponseObject, nil
}
I am fine with either approach, though.

edreed mentioned this pull request on Sep 23, 2021.
edreed added this to In progress in Mandalay on Sep 23, 2021.
sunpa93 (Collaborator, Author) commented Sep 29, 2021

replaced with #199

sunpa93 closed this on Sep 29, 2021.
Mandalay automation moved this from In progress to Done Sep 29, 2021