Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pcli sync not terminating at current block height #1427

Closed
redshiftzero opened this issue Sep 14, 2022 · 5 comments · Fixed by #1431
Closed

pcli sync not terminating at current block height #1427

redshiftzero opened this issue Sep 14, 2022 · 5 comments · Fixed by #1431
Assignees
Labels
C-bug Category: a bug
Milestone

Comments

@redshiftzero
Copy link
Member

Describe the bug
There's something going on with the sync. @zbuc noticed that the syncing seemed to slow down around 20k of the 56k to sync to, which I also see. The issue is that the reported height of the chain (via pcli query chain info) is only 20k, but the sync is continuing to run.

To Reproduce

  1. pcli view sync

Expected behavior
Sync to current block height, and exit

Screenshots
Syncs quickly then slowly advances to a sync height that does not exist

@redshiftzero redshiftzero added the C-bug Category: a bug label Sep 14, 2022
@redshiftzero
Copy link
Member Author

This is something we are currently only seeing on testnet.penumbra.zone (i.e. not on testnet preview)

@zbuc zbuc added this to the Testnet #30 milestone Sep 14, 2022
@zbuc zbuc self-assigned this Sep 14, 2022
@zbuc
Copy link
Member

zbuc commented Sep 14, 2022

This seems to be a Tendermint consensus issue, with some other validator getting to a much further block height than the primary validator.

{
  "backfill_blocks_total": "0",
  "backfilled_blocks": "0",
  "catching_up": false,
  "chunk_process_avg_time": "0",
  "earliest_app_hash": "8A8AB388A755978DA01FF054B21D29F7319C2A57D1178E0B6F52F906575817BA",
  "earliest_block_hash": "844FB90A7508B21083282BC949F0B716BE7613FC58F7C97FC3DD46F55CAE8E9F",
  "earliest_block_height": "1",
  "earliest_block_time": "2022-09-13T14:40:39Z",
  "latest_app_hash": "41B3906AA7AB4332A590D7D5C83FC5BB43C291AB76E9640729687DF409ACE591",
  "latest_block_hash": "7F5E5218CD5C0DCE6A3AE3448F45F908BBAC333E0E7BE9DBBE666B724F6B3660",
  "latest_block_height": "21212",
  "latest_block_time": "2022-09-14T20:35:23.221212838Z",
  "max_peer_block_height": "56371",
  "remaining_time": "0",
  "snapshot_chunks_count": "0",
  "snapshot_chunks_total": "0",
  "snapshot_height": "0",
  "total_snapshots": "0",
  "total_synced_time": "0"
}

(note the max_peer_block_height is 56371 whereas query chain info reports Current Block Height 21218)

What's weird is that our validator maintains 94.13% voting share so I'm not sure what's going on. Maybe a bug that will be fixed by going back to 0.34? /cc @hdevalence

@zbuc
Copy link
Member

zbuc commented Sep 14, 2022

or this calculation just isn't the right thing to do:

let latest_known_block_height = std::cmp::max(latest_block_height, max_peer_block_height);

https://github.com/penumbra-zone/penumbra/blob/main/view/src/service.rs#L185

@zbuc
Copy link
Member

zbuc commented Sep 14, 2022

Does it really make sense to try to sync to the max peer block height, from a node reporting a lesser block height? My inclination is "no" and we should change the syncing logic to only try to sync to the last block height reported by the node you're syncing against.

@redshiftzero
Copy link
Member Author

nice find! I agree we should sync to latest_block_height instead. Also it looks like this max_peer_block_height field was added to the status RPC endpoint in the 0.35 tendermint release: tendermint/tendermint#6610 and will be going away as it's part of the abandoned tendermint changes (ref: #1271), so another reason to switch away from using it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: a bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants