Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revert "abci: change client to use multi-reader mutexes (#6306)" #7106

Merged
merged 1 commit into from Oct 12, 2021

Conversation

tychoish
Copy link
Contributor

@tychoish tychoish commented Oct 12, 2021

This reverts commit 1c4dbe3.

After looking through the (few) recent commits to v0.34x (last two releases) it seems that this is the only possible cause of the observed increased P2P workload and connection churn observed by some chains following upgrade. Everything else that's changed in the v0.34 branch is either related to statesync, or the light client, or logging related, so this is really the only thing that could cause a problem.

My working theory is that by relaxing the mutex and allowing parallelism of some previously serialized operations, some applications may speed up in ways that causes more data to get gossiped, which can impact the network. I haven't found a smoking gun, but I have no other suspects.

@cmwaters
Copy link
Contributor

What's the reasoning behind the reversion? Do you mind adding a description

@tychoish
Copy link
Contributor Author

What's the reasoning behind the reversion? Do you mind adding a description

Done!

Copy link
Contributor

@tac0turtle tac0turtle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you! Looking through abci I am leaning towards query being the culprit due to peer filtering.

@williambanfield
Copy link
Contributor

I'm definitely fine reverting this but am a little sad we don't have better visibility into the problem. I think better metrics and improved testnets will hopefully help us be a bit more sure of problems like this in the future.

Do we have criteria to tell if this worked?

@tychoish
Copy link
Contributor Author

I'm definitely fine reverting this but am a little sad we don't have better visibility into the problem. I think better metrics and improved testnets will hopefully help us be a bit more sure of problems like this in the future.

Agreed. And I think we can (maybe?) delete a lot of the locking of the clients in the future around ABCI++ and work with the RPC/client protocls.

Do we have criteria to tell if this worked?

Not that I know of. I think it may be possible to reproduce the bug in a testnet.

@tychoish tychoish merged commit 34a3fcd into tendermint:master Oct 12, 2021
mergify bot pushed a commit that referenced this pull request Oct 12, 2021
This reverts commit 1c4dbe3.

(cherry picked from commit 34a3fcd)

# Conflicts:
#	abci/client/creators.go
#	abci/client/local_client.go
#	consensus/common_test.go
#	internal/consensus/byzantine_test.go
#	internal/consensus/reactor_test.go
mergify bot pushed a commit that referenced this pull request Oct 12, 2021
tychoish pushed a commit that referenced this pull request Oct 12, 2021
…kport #7106) (#7110)

* Revert "abci: change client to use multi-reader mutexes (#6306)" (#7106)

This reverts commit 1c4dbe3.

(cherry picked from commit 34a3fcd)
tychoish added a commit to tychoish/tendermint that referenced this pull request Oct 13, 2021
yun-yeo added a commit to terra-money/tendermint that referenced this pull request Dec 23, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S:backport-to-v0.34.x Tell mergify to backport the PR to v0.34.x
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants