Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't delete a physical id if it was just created #15982

Open
blampe opened this issue Apr 18, 2024 · 6 comments
Open

Don't delete a physical id if it was just created #15982

blampe opened this issue Apr 18, 2024 · 6 comments
Labels
area/engine Pulumi engine kind/enhancement Improvements or new features

Comments

@blampe
Copy link
Contributor

blampe commented Apr 18, 2024

There are situations where Pulumi can plan a Create/Delete that result in a resource being created and then immediately deleted in the same operation. I'm proposing that we detect these situations and no-op the delete.

For example in pulumi/pulumi-kubernetes#2948 a new resource is created and an old one is deleted, but their inputs are such that the operation is effectively an update. This isn't something the provider can detect at diff time, but it is something the engine can detect because the physical ID returned by the provider is unchanged.

Our current behavior seems to assume Create will always return a new physical id distinct from anything we plan on deleting. If the provider's Create returns an id that already exists in our state, we should recognize that a new physical resource was not created so there is nothing "old" to delete.

I can't think of a situation where the current behavior is valid or expected, but if we're concerned about preserving backwards-compatibility we could allow providers to opt-in to this new behavior via Configure.

Edge cases to think through:

  • deleteBeforeReplace - I think this behavior would be unchanged.
  • pending deletes - if we're interrupted between the Create/Delete I don't think we have enough information to determine if the pending delete is "normal" or if it corresponds to one of these problematic cases. We could either: prune pending deletions immediately after a Create completes, or store some additional information like the IDs we had created before we were interrupted.

Related #918
Related #6078
Internal https://pulumi.slack.com/archives/CVC1YFDNX/p1713468828983919?thread_ts=1713303598.172839&cid=CVC1YFDNX

@blampe blampe added kind/enhancement Improvements or new features needs-triage Needs attention from the triage team labels Apr 18, 2024
@EronWright
Copy link
Contributor

EronWright commented Apr 18, 2024

Is this essentially reference counting for ids? We'd persist a map of type+id to ref count (per provider?). Seems to solve the "interruption" case.

@t0yv0
Copy link
Member

t0yv0 commented Apr 18, 2024

We had a lively discussion and I'm going to post some notes here that helped me frame this problem and solution space.
Suppose there are two logical resources A and B that map to the same physical resource R.

There are four possible logical desired states: AB, A, B, and {}.

There are two physical desired states: R and {}.

Desired correspondence:

AB, A, B ==> R
{}       ==> {}

Assuming here that the ID returned by provider from Create call is exactly what can be used to reason about "sameness"
of physical resources.

message CreateResponse {
  string id = 1;                         // the ID of the created resource.
}

Scenario

A -> B transition

Current behavior - pulumi does not identify A and B. Nor does provider. Desired state not achieved - get to {}.

provider.Check(B)
provider.Create(B) # re-creates R but sometimes does not fail idempotent-succeeds
provider.Delete(A) # deletes R

Option 1 (this ticket)

Change the engine to not actually issue provider.Delete() for a URN disappearing from state if one of the
provider.Creates() in deployment has issued that same provider ID as this URN.

Option 2

In the provider state, at the point of provider.Create(B) remember B.id just got created.
At the point of provider.Delete(A), notice that A.id == B.id and decide not to do it because it is fishy.

Option 3

Edit Check() protocol to allow to allocate ID to inputs, if feasible.
Change the engine to act as if an alias was provided for B to point to A's URN, so it changes the plan entirely.
The plan is now:

provider.Check(B)
provider.Update(B)

Eron was pointing out that this is somewhat interesting but not quite suitable since Check requires old inputs to be sent, and we won't know which ones those should be until we decide identity, so this does not quite map nicely.

@EronWright
Copy link
Contributor

EronWright commented Apr 18, 2024

Here's a valid use-case where multiple logical resources refer to the same physical resource. A Pulumi YAML program using "patch" resources, with two ConfigMapPatch resources pointing to a ConfigMap named example:

name: myprogram
runtime: yaml
resources:
  cm1:
    type: kubernetes:core/v1:ConfigMapPatch
    properties:
      metadata:
        name: example
      data:
        foo: bar
  cm2:
    type: kubernetes:core/v1:ConfigMapPatch
    properties:
      metadata:
        name: example
      data:
        biz: bam

Note that the id would be <namespace>/example for both resources.

If cm2 were to be removed, we expect the Delete RPC call to clear the biz field and to not delete the object.

@Frassle
Copy link
Member

Frassle commented Apr 18, 2024

Some related issues:
#918
#9925

Importantly 9925 points out that ID tracking isn't enough if the same resource gets touched by multiple stacks. ID tracking probably is enough for intrastack operations, but I'm not 100% sure that it's a safe change for the engine to start assuming that ID strictly means a unique physical resource. There are resources like command & random that don't really have "physical" resources and could possibly create two distinct resources with the same ID. Even for real physical resources we don't currently mandate that ID alone must be uniquely defining, and a provider may use ID + some of the resources inputs to find the actual instance to operate on.

@justinvp justinvp added area/engine Pulumi engine and removed needs-triage Needs attention from the triage team labels Apr 19, 2024
@t0yv0
Copy link
Member

t0yv0 commented Apr 19, 2024

Eron had some good arguments, and as a result he convinced me and I now think we should not pursue this change as written.

  1. Do not remove the Delete call. Resources are logical entities, and Delete call orchestrates resource removal (which
    may or may not correspond to changes to cloud state - this last bit is always managed by the provider). So depriving
    the provider of a Delete call is not a good idea, providers should be able to react to the entire resource lifecycle.

  2. As of today duplicative or renamed resources are a logical problem at the program level, typically resolved by
    aliases. Bringing providers into this layer cleanly needs some more design thinking.

Here is an alternative proposal for allowing providers to warn (or fail in strict mode) at preview time with the
following message:

Your program defines multiple resources for the sqs.QueuePolicy "examplequeue":
- urn1
- urn2
..

Since these appear to be distinct, Pulumi will attempt to create urn1 and delete urn2 which will lead to deletion of
"examplequeue", which may not be what you want.

To let Pulumi identify these resources as the same resource, add an alias to "urn1" like this:

<example>

To suppress this warning, run with --force.

In order for the provider to have a chance to emit this warning, we need to extend the protocol to allow preview of
Delete calls, the same way that Create and Update calls are now sent with preview bool.

@t0yv0
Copy link
Member

t0yv0 commented Apr 19, 2024

I lifted this to #16004 for separate discussion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/engine Pulumi engine kind/enhancement Improvements or new features
Projects
None yet
Development

No branches or pull requests

5 participants