Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cli: allow node operator to rollback last state #7033

Merged
merged 16 commits into from Oct 8, 2021
Merged

Conversation

cmwaters
Copy link
Contributor

@cmwaters cmwaters commented Sep 30, 2021

Closes: #3845

Adds a simple tendermint rollback command which takes a tendermint state at height n and rebuilds the tendermint state at height n-1, overriding the latter state. In order to recover from diverging app hashes, applications need to also be able to rollback their last state as well. This does not remove the block at height n. Upon restart Tendermint will reexecute the transactions at height n against the application.

I thought about adding a --height flag which would allow operators to rollback tendermint state to a selected height with which to replay all the transactions again but I think hard coding it to the previous height is sufficient

cc @ethanfrey

@cmwaters cmwaters added the S:backport-to-v0.34.x Tell mergify to backport the PR to v0.34.x label Sep 30, 2021
Copy link
Contributor

@alexanderbez alexanderbez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order for this to be used effectively and in conjunction with the SDK, the APIs need to be publicly exposed. I imagine the SDK will actually implement it's own command in which it will call these APIs.

@ethanfrey
Copy link
Contributor

Thank you for this PR. And I think only rolling back 1 block is fine for 90% of use cases, so let's just do that.
Hope this can be back ported to 0.34.x series after a merge.

@ethanfrey
Copy link
Contributor

In order for this to be used effectively and in conjunction with the SDK, the APIs need to be publicly exposed. I imagine the SDK will actually implement it's own command in which it will call these APIs.

This is a very good point.

I will look at SDK rollback, but since the SDK embeds tendermint, it will need to expose a command to revert it's own app state, then call into Tendermint to revert it's last block.

Copy link
Contributor

@ethanfrey ethanfrey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work, just a bit of polish and it should work for most cases


// rollbackState takes a state at height n and overwrites it with the state
// at height n - 1. Note state here refers to tendermint state not application state.
func rollbackState(bs state.BlockStore, ss state.Store) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you make this function public?
And maybe place it in a different package?

In any case, an sdk command should be able to call this and then do it's own rollback

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to making this be somewhere else and maybe testing writing some more unittests.

"github.com/tendermint/tendermint/version"
)

func TestRollback(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a non-happy path test that we could write?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure I'll try think of a few. I think the main non-happy paths are a matter of failure to get data i.e. blocks, state or validator sets. I don't think there's too much of a problem with the atomicity of operations (the write only happens once at the end).


// rollbackState takes a state at height n and overwrites it with the state
// at height n - 1. Note state here refers to tendermint state not application state.
func rollbackState(bs state.BlockStore, ss state.Store) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to making this be somewhere else and maybe testing writing some more unittests.

@cmwaters
Copy link
Contributor Author

In order for this to be used effectively and in conjunction with the SDK, the APIs need to be publicly exposed. I imagine the SDK will actually implement it's own command in which it will call these APIs.

Makes sense. I will make a public Rollback function.

@alexanderbez
Copy link
Contributor

In order for this to be used effectively and in conjunction with the SDK, the APIs need to be publicly exposed. I imagine the SDK will actually implement it's own command in which it will call these APIs.

Makes sense. I will make a public Rollback function.

Thank you! Otherwise, it looks great 👍

cmd/tendermint/commands/rollback.go Outdated Show resolved Hide resolved
cmd/tendermint/commands/rollback.go Show resolved Hide resolved
internal/state/rollback.go Outdated Show resolved Hide resolved
internal/state/rollback.go Outdated Show resolved Hide resolved
internal/state/rollback.go Outdated Show resolved Hide resolved
internal/state/rollback.go Show resolved Hide resolved
internal/state/state.go Show resolved Hide resolved
@@ -0,0 +1,34 @@
package cli
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the libs/cli package is really just helpers for managing/using cobra, and so this is sort of a weird case.

I think this should end up in either:

  • node
  • cmd/tendermint/commands

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think putting it in node makes sense because that should be node-only stuff. I initially had it in cmd/tendermint/commands but this feels also strange because it's not a command, rather some tooling function. Maybe we just have to face that there's probably no idea home for it and leave it in cmd/tendermint/commands.

@cmwaters cmwaters merged commit 4ca130d into master Oct 8, 2021
@cmwaters cmwaters deleted the callum/rollback branch October 8, 2021 07:15
mergify bot pushed a commit that referenced this pull request Oct 8, 2021
(cherry picked from commit 4ca130d)

# Conflicts:
#	CHANGELOG_PENDING.md
#	cmd/tendermint/commands/reindex_event.go
#	cmd/tendermint/main.go
#	state/state.go
#	types/time/rollback.go
#	types/time/rollback_test.go
mergify bot pushed a commit that referenced this pull request Oct 8, 2021
(cherry picked from commit 4ca130d)

# Conflicts:
#	CHANGELOG_PENDING.md
@tac0turtle tac0turtle mentioned this pull request Jul 27, 2022
39 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S:backport-to-v0.34.x Tell mergify to backport the PR to v0.34.x
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add command to roll-back a single block
7 participants