Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider extending the provider protocol with preview Delete calls #16004

Open
t0yv0 opened this issue Apr 19, 2024 · 2 comments
Open

Consider extending the provider protocol with preview Delete calls #16004

t0yv0 opened this issue Apr 19, 2024 · 2 comments
Labels
kind/enhancement Improvements or new features

Comments

@t0yv0
Copy link
Member

t0yv0 commented Apr 19, 2024

Hello!

  • Vote on this issue by adding a 馃憤 reaction
  • If you want to implement this feature, comment to let us know (we'll work with you on design, scheduling, etc.)

Issue details

Consider adding preview flag to DeleteRequest and emitting Delete calls at preview if the provider indicates it is ready to accept them.

message DeleteRequest {
       bool preview = 4;  // true if this is a preview and the provider should not actually delete.
}

message ConfigureResponse {
    bool supportsDeletePreview = 4;   // when true, the engine should preview delete calls
}

This protocol extension allows providers to participate in checking plans at preview time and emitting warnings to the user when the plans look suspect. This is already possible for Diff, Create and Update calls that are issued at preview. Adding Delete calls to the mix enables on important additional scenario of making suggestions to add an alias where a resource is about to be accidentally deleted because of a lack of alias.

Consider a scenario such as: pulumi/pulumi-aws#1923

Here the user is changing a program to reparent to move a resource (SQS QueuePolicy) and without specifying the alias, the engine sees this change as a removal of one resource and a reintroduction of another similar resource elsewhere. To maximize infra liveness and reduce downtime, the engine will compute a plan that would Create the new QueuePolicy and then Delete the old one. If executed, this plan will actually idempotent-succeed on Create and then Delete the policy leaving the user stranded with pulumi reporting success but the desired plan not being reached.

If the provider was in a position to preview the Delete calls, it would be able to detect this situation where the plan asks for creating and then deleting the same physical SQS queue. The provider is the correct owner for an in-depth understanding of physical resource identity and making inferences of this sort. If it detected this situation it could emit a warning as follows:

Your program defines multiple resources for the sqs.QueuePolicy "examplequeue":
- urn1
- urn2
..

Since these appear to be distinct, Pulumi will attempt to create urn1 and delete urn2 which will lead to deletion of
"examplequeue", which may not be what you want.

To let Pulumi identify these resources as the same resource, add an alias to "urn1" like this:

The interaction would look like this:

sequenceDiagram
    participant Engine
    participant Provider
    Engine->>Provider: Create(preview=True)
    Engine->>Provider: Delete(preview=True)
    Provider->>Engine: Log(LogSeverity=WARNING)

To validate cross-resource invariants the provider would need to maintain internal state across the calls.

Affected area/feature

Alternatives

One downside of this arrangement is that the resource provider sees only a part of the plan, and when multiple provider instances are used the in-process nature of the analysis doesn't let them coordinate. So using multiple explicit providers might defeat the warnings.

If there was a direct way to do cross-resource plan analysis that would be a more natural fit for this. There are competing considerations which system should be responsible:

  • the provider knows best how logical Pulumi resources map to physical infrastructure and therefore which plans are dangerous for the infrastructure
  • the engine knows best what the plan is and has a natural coordinating role and cross-resource and cross-provider view

Perhaps a better alternative here would be an analyzer-style interface to expose cross resource plans for provider-driven analysis; or else there is a way for providers to surface constraints and the engine still being responsible for enforcing them. #15982 discussed using provider-allocated ID as a constraint but there are some technical difficulties making it unreliable; perhaps a way for provider to surface sets of opaque physical ID resources could be used to drive similar analysis on the engine side.

@t0yv0 t0yv0 added kind/enhancement Improvements or new features needs-triage Needs attention from the triage team labels Apr 19, 2024
@EronWright
Copy link
Contributor

Along these lines, maybe the retainOnDelete option should be passed to the provider, to be able to do a logical cleanup within the provider.

@t0yv0
Copy link
Member Author

t0yv0 commented Apr 22, 2024

Perhaps separate ticket for that? Getting good pushback on this issue as it's rightly being pointed out that protocol changes should be well motivated. I think we need some end-user issues lined up and a prototype demonstrating how the proposed extension solves them (thankfully this is easy to prototype).

@justinvp justinvp removed the needs-triage Needs attention from the triage team label Apr 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/enhancement Improvements or new features
Projects
None yet
Development

No branches or pull requests

3 participants