
Speedup RedundantRelayDecorator for Check and RecheckTx #6232

Closed
ValarDragon opened this issue Apr 29, 2024 · 5 comments · Fixed by #6280
Labels
type: performance Performance or storage optimizations

Comments

@ValarDragon
Contributor

Summary

The RedundantRelayDecorator is a very significant slowdown to CheckTx and RecheckTx, with RecheckTx being the more important case due to how consensus-blocking that operation is.

This is a serious slowdown vector for all Cosmos chains.

Problem Definition

This is a serious slowdown to Consensus due to the Recheck behavior of Comet, but is also a notable extra strain on mempool ingest.

Proposal

Remember that RedundantRelay exists purely to benefit altruistic relayers. So we do not need to ensure every packet being relayed will pass; we only need to skip relays that contribute nothing of help.

Here is the logic: https://github.com/cosmos/ibc-go/blob/main/modules/core/ante/ante.go#L24

We need direct logic to ensure that:

  • We only check ClientUpdate on CheckTx, not RecheckTx.
  • Every other operation should just do a state read to see whether the respective PacketId has already been handled, and if it has not, do a write. It should not run the full handler logic, as that has notable expense.

This is important for reducing mempool overhead, and the critical consensus delay that IBC txs in the mempool induce.
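The two rules above can be sketched as a tiny, self-contained Go program. Everything here (`execMode`, `anteState`, `anteHandle`, the packet-ID string) is a hypothetical stand-in for the real `sdk.Context` and ibc-go types, meant only to show the control flow: expensive verification runs only in CheckTx, while RecheckTx does nothing beyond a redundancy read and a write.

```go
package main

import "fmt"

// execMode is a hypothetical stand-in for the SDK's CheckTx/RecheckTx modes.
type execMode int

const (
	modeCheckTx execMode = iota
	modeRecheckTx
)

// anteState stands in for the chain state that records handled packets.
type anteState struct {
	seenPackets map[string]bool
}

// relayIsRedundant is the cheap path: a single state read asking whether
// this packet has already been handled.
func (s *anteState) relayIsRedundant(packetID string) bool {
	return s.seenPackets[packetID]
}

// anteHandle sketches the proposed flow: full verification (client update,
// proofs) only in CheckTx; RecheckTx does only the redundancy read + write.
func anteHandle(s *anteState, mode execMode, packetID string) (accepted bool, reason string) {
	if s.relayIsRedundant(packetID) {
		return false, "redundant relay"
	}
	if mode == modeCheckTx {
		// Expensive work (ClientUpdate verification, proof checks) lives here.
		reason = "checked proofs"
	} else {
		reason = "skipped proofs"
	}
	s.seenPackets[packetID] = true
	return true, reason
}

func main() {
	s := &anteState{seenPackets: map[string]bool{}}
	ok, why := anteHandle(s, modeCheckTx, "channel-0/seq-1")
	fmt.Println(ok, why) // first relay passes with full checks
	ok, why = anteHandle(s, modeRecheckTx, "channel-0/seq-1")
	fmt.Println(ok, why) // duplicate is rejected by a cheap state read
}
```

The design point is that the redundancy read dominates both paths: a duplicate relay is rejected before any expensive work, in either mode.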


For Admin Use

  • Not duplicate issue
  • Appropriate labels applied
  • Appropriate contributors tagged/assigned
@ValarDragon
Contributor Author

Here is the pprof for Osmosis mainnet, restricted to calls that block consensus due to the mempool behavior right now:
[pprof screenshot]

@damiannolan
Member

Do all validator nodes run this ante handler? It should be unnecessary for them, right?

I suppose all node operators take whatever binary is distributed as is.

@ValarDragon
Contributor Author

ValarDragon commented May 7, 2024

All validators do run this ante handler. There isn't a way to express not running it on a validator at the moment.

But the redundant relay guarantee is something you'd want on the block proposer.

@colin-axner
Contributor

colin-axner commented May 8, 2024

Thanks @ValarDragon! I agree with the proposed solution

@damiannolan and I discussed the issue and came up with some extra details and notes

#3636 needs to be fixed first; otherwise the currently proposed PR would block submission of misbehaviour.

We will split up ante handler logic for check tx and recheck tx to minimize necessary logic.

CheckTx should:

  • check proofs
  • check redundancy
  • update state

RecheckTx should:

  • check redundancy
  • update state

That means we can exclude (from CheckTx and RecheckTx):

  • pruning
  • misbehaviour
  • app callbacks

In RecheckTx we can also exclude proof checks. We will do this by only executing the necessary logic, whereas CheckTx will make explicit exclusions in the code. MsgUpdateClient should only call UpdateState, and MsgRecvPacket should only execute the following code
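A minimal sketch of what such a recv-packet redundancy check looks like (the store layout and key format here are hypothetical stand-ins, not the real ibc-go receipt keyspace): the ante handler needs only a single read against the packet-receipt store, with no proof verification and no application callbacks.

```go
package main

import "fmt"

// receiptStore is a hypothetical stand-in for the chain's KV store region
// where packet receipts are recorded for unordered channels.
type receiptStore map[string]struct{}

// receiptKey builds an illustrative key from (portID, channelID, sequence).
func receiptKey(portID, channelID string, sequence uint64) string {
	return fmt.Sprintf("receipts/%s/%s/%d", portID, channelID, sequence)
}

// recvPacketRedundant is the cheap CheckTx/RecheckTx question: has a receipt
// for this packet already been written? One state read, nothing else.
func recvPacketRedundant(store receiptStore, portID, channelID string, seq uint64) bool {
	_, ok := store[receiptKey(portID, channelID, seq)]
	return ok
}

func main() {
	store := receiptStore{}
	fmt.Println(recvPacketRedundant(store, "transfer", "channel-0", 7)) // false: not yet relayed
	store[receiptKey("transfer", "channel-0", 7)] = struct{}{}
	fmt.Println(recvPacketRedundant(store, "transfer", "channel-0", 7)) // true: redundant relay
}
```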

Action Items:

note: we can do a non breaking adjustment to the patch releases for the update client fix if necessary

We will leave the other msg types as is for now, to focus on getting update client and recv packet dealt with first. We can update the others in the future as necessary.

@colin-axner
Contributor

@ValarDragon is there a specific patch release you would want these changes in?

@crodriguezvega crodriguezvega added the type: performance Performance or storage optimizations label May 13, 2024