Assist users in recovering resources that may have been created despite errors #15958

Open
thomas11 opened this issue Apr 16, 2024 · 3 comments
Labels
area/cli UX of using the CLI (args, output, logs) area/engine Pulumi engine kind/enhancement Improvements or new features

Comments

@thomas11
Contributor

thomas11 commented Apr 16, 2024

Hello!

  • Vote on this issue by adding a 👍 reaction
  • If you want to implement this feature, comment to let us know (we'll work with you on design, scheduling, etc.)

Issue details

There are cases where Create appears to fail but the resource is actually created, at least partially. This can be due to network issues or internal issues of the cloud provider. An example is pulumi/pulumi-azure-native#3200, where the Azure response was "504 Gateway timeout". If such resources are explicitly named or are singletons, subsequent runs of pulumi up will fail. If they are not, orphaned resources are left behind.

Currently, such resources are absent from Pulumi state. For cases like the one linked above, it would be useful if they were recorded as "probably failed to create but might exist". Then, on the next up, the engine could tell the user that they might want to import them, or to run a refresh if refresh were extended to cover such resources.
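
To make the idea concrete, here is a minimal sketch of what recording such a resource could look like. Everything in it is hypothetical: the MAYBE_CREATED status, the Checkpoint class and the recovery_hints method are illustrative names only and do not correspond to existing Pulumi engine types or state values.

```python
from dataclasses import dataclass, field
from enum import Enum


class CreateStatus(Enum):
    CREATED = "created"
    # Hypothetical status for "probably failed to create but might exist".
    MAYBE_CREATED = "maybe-created"


@dataclass
class ResourceRecord:
    urn: str
    status: CreateStatus
    error: str = ""


@dataclass
class Checkpoint:
    """Toy stand-in for the state file; not the real checkpoint format."""
    resources: dict = field(default_factory=dict)

    def record_create_result(self, urn, error=None):
        # Instead of dropping a failed create, keep a record that it may exist.
        if error is None:
            self.resources[urn] = ResourceRecord(urn, CreateStatus.CREATED)
        else:
            self.resources[urn] = ResourceRecord(
                urn, CreateStatus.MAYBE_CREATED, error
            )

    def recovery_hints(self):
        # Messages the engine could show on the next up.
        return [
            f"{r.urn} may exist despite error {r.error!r}; "
            "consider importing or refreshing it"
            for r in self.resources.values()
            if r.status is CreateStatus.MAYBE_CREATED
        ]
```

With this shape, a create that ended in "504 Gateway timeout" would still leave a record behind, and the next up could surface a hint rather than failing on a name conflict or silently orphaning the resource.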

@thomas11 thomas11 added kind/enhancement Improvements or new features needs-triage Needs attention from the triage team labels Apr 16, 2024
@justinvp justinvp added area/engine Pulumi engine area/cli UX of using the CLI (args, output, logs) and removed needs-triage Needs attention from the triage team labels Apr 16, 2024
@danielrbradley
Member

We discussed this as part of the linked issue above. We already have the partial state mechanism; however, two factors were at play that could cause it not to work:

  1. There was an error while trying to construct the partial state which resulted in a normal error being returned instead of a partial state error.
  2. The provider didn't have cancellation support so would not shut down gracefully and return a partial error.

I think we could re-scope this to returning the partial state immediately after the create has started to avoid situations where there's interruptions causing the partial error not to be returned. This would likely be implemented as an additional request that the provider can make to the engine with the initial state from a creation. This would allow the engine to write the partial state as a placeholder in the checkpoint until the final state is received - making the partial state creation more reliable and less likely to lose data.
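
A rough sketch of that flow, purely for illustration: the Engine class, report_initial_state, finalize and provider_create are made-up names, not part of the existing provider protocol, and the checkpoint here is just a dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class Engine:
    """Toy engine holding a checkpoint keyed by URN (hypothetical shape)."""
    checkpoint: dict = field(default_factory=dict)

    def report_initial_state(self, urn, resource_id, props):
        # Hypothetical back-channel request: write a placeholder right away.
        self.checkpoint[urn] = {"id": resource_id, "props": props, "partial": True}

    def finalize(self, urn, resource_id, props):
        # Replace the placeholder once the final state is received.
        self.checkpoint[urn] = {"id": resource_id, "props": props, "partial": False}


def provider_create(urn, props, report, wait_for_completion):
    # Stands in for the id returned by the cloud's initial create request.
    resource_id = f"id-for-{urn}"
    report(urn, resource_id, props)  # early report, before the long wait
    wait_for_completion()            # may raise on timeout or interruption
    return resource_id


def run_create(engine, urn, props, wait_for_completion):
    try:
        resource_id = provider_create(
            urn, props, engine.report_initial_state, wait_for_completion
        )
    except Exception:
        # The create was interrupted, but the placeholder is still recorded.
        return False
    engine.finalize(urn, resource_id, props)
    return True
```

If wait_for_completion raises, for example on a gateway timeout, the checkpoint still holds the placeholder entry with partial set to True, which is the property the proposal is after.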

@Frassle
Member

Frassle commented Apr 19, 2024

I think we could re-scope this to returning the partial state immediately after the create has started to avoid situations where there's interruptions causing the partial error not to be returned.

#5210?

@danielrbradley
Member

@Frassle yes, I think that's the same core issue, but it looks like it stalled on the variance between clouds, where we can't always know the id up-front. Hence, an optional early back-channel would be a nicer approach, if possible.

4 participants