
Stack corrupted by "parent coming after child..." error #10950

Closed

phillipedwards opened this issue Oct 6, 2022 · 4 comments · Fixed by #11272
Labels

  • area/backends: State storage (filestate/httpstate/etc.)
  • awaiting-feedback: Blocked on input from the author
  • kind/bug: Some behavior is incorrect or out of spec
  • p1: Bugs severe enough to be the next item assigned to an engineer
  • resolution/fixed: This issue was fixed

Comments

@phillipedwards
Member

What happened?

During an update of a stack using the S3 backend, an error occurred and the stack's state became corrupted. The error is:

pulumi:pulumi:Stack aws-s3-stack running 
pulumi:providers:aws aws-provider-source  
pulumi:providers:aws aws-provider-destination  
pulumi:providers:pulumi default  error: post-step event returned an error: failed to save snapshot: .pulumi/stacks/aws-s3-stack.json: snapshot integrity failure; it was already written, but is invalid (backup available at .pulumi/stacks/aws-s3-stack.json.bak): child resource urn:pulumi:aws-s3-stack::aws-to-primary::pulumi:providers:pulumi::default's parent urn:pulumi:aws-s3-stack::aws-s3-stack::pulumi:pulumi:Stack::aws-s3-stack comes after it
pulumi:pulumi:Stack aws-s3-stack running error: update failed
pulumi:providers:pulumi default **failed** 1 error
pulumi:pulumi:Stack aws-s3-stack **failed** 1 error

Possibly related to: #5577

Steps to reproduce

Unknown...TBD

Expected Behavior

pulumi up succeeds

Actual Behavior

Stack state gets corrupted.

Output of pulumi about

No response

Additional context

No response

Contributing

Vote on this issue by adding a 👍 reaction.
To contribute a fix for this issue, leave a comment (and link to your pull request, if you've opened one already).

@phillipedwards phillipedwards added kind/bug Some behavior is incorrect or out of spec needs-triage Needs attention from the triage team labels Oct 6, 2022
@Frassle Frassle added area/backends State storage (filestate/httpstate/etc.) and removed needs-triage Needs attention from the triage team labels Oct 7, 2022
@Frassle
Member

Frassle commented Oct 26, 2022

I suspect this isn't actually a problem with the S3 backend itself, but an issue elsewhere.

It looks like the pulumi default provider was somehow recorded to state before its parent stack resource. The stack must have been registered first, because the step generator asserts that the parent of each resource can be found.

It would help to know which language SDK this was using, and whether anything unusual is being done when creating the stack resource.
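
(For context: the integrity check behind the error message enforces that every resource's parent appears earlier in the snapshot's ordered resource list. Below is a minimal, illustrative sketch of that invariant in Go; it is not the actual engine code, and the type and function names are made up.)

```go
package main

import "fmt"

// resource is a simplified stand-in for a Pulumi state resource entry.
type resource struct {
	URN    string
	Parent string // empty for the root stack resource
}

// verifyParentOrder checks the invariant from the error message: every
// resource's parent must appear before it in the ordered resource list.
func verifyParentOrder(resources []resource) error {
	seen := make(map[string]bool)
	for _, r := range resources {
		if r.Parent != "" && !seen[r.Parent] {
			return fmt.Errorf("child resource %s's parent %s comes after it", r.URN, r.Parent)
		}
		seen[r.URN] = true
	}
	return nil
}

func main() {
	// The corrupted state in this issue has the default provider recorded
	// before its parent stack resource, which trips the check.
	corrupted := []resource{
		{
			URN:    "urn:pulumi:aws-s3-stack::aws-to-primary::pulumi:providers:pulumi::default",
			Parent: "urn:pulumi:aws-s3-stack::aws-s3-stack::pulumi:pulumi:Stack::aws-s3-stack",
		},
		{URN: "urn:pulumi:aws-s3-stack::aws-s3-stack::pulumi:pulumi:Stack::aws-s3-stack"},
	}
	fmt.Println(verifyParentOrder(corrupted)) // reproduces the "comes after it" error
}
```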

@mikhailshilkov mikhailshilkov added the awaiting-feedback Blocked on input from the author label Nov 1, 2022
@lukehoban lukehoban added the p1 Bugs severe enough to be the next item assigned to an engineer label Nov 4, 2022
@lukehoban
Member

Additional notes:

  • A user is hitting this "pretty consistently"
  • They are using Go
  • They are using an httpstate backend, so as @Frassle notes above, this does not appear to be limited to the S3 backend

@bors bors bot closed this as completed in daad9b6 Nov 8, 2022
@pulumi-bot pulumi-bot added the resolution/fixed This issue was fixed label Nov 8, 2022
@lukehoban lukehoban added this to the 0.80 milestone Nov 13, 2022
@rh4ll

rh4ll commented Dec 12, 2022

I have experienced a similar thing:

  • in a feature branch, change the parent of a resource with children (adding an alias)
  • run pulumi up on the stack
  • on the main branch, run pulumi up on the same stack
  • get the error

Looking at the OP's original URNs, this looks to be the case here too.
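
(For reference, the reparenting-with-alias pattern described above looks roughly like the following in the Go SDK. This is a hypothetical sketch: the component and bucket names are made up, and the key point is the alias recording the old parent.)

```go
package main

import (
	"github.com/pulumi/pulumi-aws/sdk/v5/go/aws/s3"
	"github.com/pulumi/pulumi/sdk/v3/go/pulumi"
)

// storageGroup is a hypothetical component that becomes the new parent.
type storageGroup struct {
	pulumi.ResourceState
}

func main() {
	pulumi.Run(func(ctx *pulumi.Context) error {
		comp := &storageGroup{}
		if err := ctx.RegisterComponentResource("example:index:StorageGroup", "storage", comp); err != nil {
			return err
		}

		// The bucket previously had no explicit parent (i.e. it was parented
		// to the stack). When moving it under the new component, an alias
		// records the old parent so the engine treats it as the same
		// resource instead of creating a replacement.
		_, err := s3.NewBucket(ctx, "my-bucket", nil,
			pulumi.Parent(comp),
			pulumi.Aliases([]pulumi.Alias{{NoParent: pulumi.Bool(true)}}),
		)
		return err
	})
}
```

The failure mode in this thread seems to arise when a branch without the alias (e.g. main) runs pulumi up against the same stack afterwards.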

@dougludlow

dougludlow commented Feb 13, 2024

Just ran into the same issue, with the same steps @rh4ll mentioned. Pulumi version: 3.64.0. I'm pretty sure the issue had to do with the aliases that were added.

In a couple of cases, I was able to simply revert to a previous version of the state (pulumi stack export --file out.json --version x, pulumi stack import --file out.json) after removing the new resources manually in AWS. In another case, I had to export the state and fix the JSON manually.
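
(Spelled out, the recovery sequence above is the following, where x is whatever last-known-good version of the state you pick:)

```
pulumi stack export --file out.json --version x
pulumi stack import --file out.json
```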

In all of the above cases, I had renamed some resources and added aliases so that they would not be replaced. I deployed my branch to a shared nonprod environment to test the changes; shortly after, the main branch was deployed to the same environment and somehow the resources were duplicated (most likely because the main branch didn't know about the aliases?), but the new resources took on the original URNs. At that point the state became corrupt, and any attempt to run pulumi up from the main branch resulted in error: resource complete event returned an error: failed to verify snapshot: resource <x>'s dependency <y> comes after it. When I exported the state, the dependency did indeed come after the resource in the JSON.
