This repository has been archived by the owner on Apr 18, 2024. It is now read-only.

replay: add debug option to skip corrupted messages in WAL replay
DebugUnsafeReplayRecoverCorruptedWAL can be used to configure WAL replay
to automatically try to recover from a corrupted WAL. Replay will drop
the remaining messages if a data corruption error is encountered.

The behaviour should be similar to what Tendermint instructs users to do
manually in case of a corrupted WAL:
https://github.com/tendermint/tendermint/blob/48f073d796ca4d1063b5f02c5d1b35c2c7e77afc/consensus/state.go#L312-L325
ptrus authored and kostko committed Jan 28, 2021
1 parent db982fa commit fc4f854
Showing 2 changed files with 9 additions and 0 deletions.
4 changes: 4 additions & 0 deletions config/config.go
@@ -860,6 +860,10 @@ type ConsensusConfig struct {
	PeerQueryMaj23SleepDuration time.Duration `mapstructure:"peer_query_maj23_sleep_duration"`

	DoubleSignCheckHeight int64 `mapstructure:"double_sign_check_height"`

	// If set, replay will try to recover from a corrupted WAL error by stopping
	// WAL replay after encountering a corrupted message.
	DebugUnsafeReplayRecoverCorruptedWAL bool `mapstructure:"debug_unsafe_replay_recover_corrupted_wal"`
}

// DefaultConsensusConfig returns a default configuration for the consensus service
5 changes: 5 additions & 0 deletions consensus/replay.go
@@ -149,6 +149,11 @@ LOOP:
			break LOOP
		case IsDataCorruptionError(err):
			cs.Logger.Error("data has been corrupted in last height of consensus WAL", "err", err, "height", csHeight)
			if cs.config.DebugUnsafeReplayRecoverCorruptedWAL {
				cs.Logger.Debug("skipping corrupted WAL")
				// Ignore data corruption error.
				break LOOP
			}
			return err
		case err != nil:
			return err
