
Reached max retries querying for block #1402

Closed
gaia opened this issue Feb 20, 2024 · 10 comments

gaia commented Feb 20, 2024

On v2.5.1 (and at least also v2.5.0), I have synced private Osmosis and Celestia RPCs being queried, but I'm seeing Reached max retries querying for block even for very recent blocks (and the nodes retain at least 2 weeks of history before pruning).

gaia (Author) commented Feb 29, 2024

The issue is the same on https://github.com/cosmos/relayer/releases/tag/v2.5.2

jtieri (Member) commented Feb 29, 2024

Hey, thanks for opening the issue!

We aren't seeing the same behavior in our infra. Is there anything unique about your setup, perhaps a load balancer or the need to use port forwarding, etc.? Could you also try configuring some of the publicly available endpoints from the chain registry, to rule out this being an issue with your nodes?
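(For reference, one way to pull chain configs, including public RPC endpoints, from the cosmos/chain-registry is rly chains add; a minimal sketch, assuming rly v2.x and that the registry names below match the chains you want.)

```sh
# Fetch chain configs (including public RPC endpoints) from the chain registry.
# The chain names are assumptions; check the registry for the exact identifiers.
rly chains add celestia osmosis cosmoshub

# Inspect what ended up in the local config.
rly chains list
```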

gaia (Author) commented Feb 29, 2024

No load balancers. I'm running:

[screenshot of the setup omitted]

I tried using all external RPCs, and I still get the same error, but as before only on celestia<>osmosis (while cosmoshub<>osmosis works fine, using the same Osmosis RPC).

Reached max retries querying for block, skipping {"chain_name": "celestia", "chain_id": "celestia", "height": 893372}
warn Reached max retries querying for block, skipping {"chain_name": "osmosis", "chain_id": "osmosis-1", "height": 14058148}

Note the recent block heights. The error is intermittent: it is not shown sequentially for every single block. After a while, it starts to happen only on Celestia (local or 3rd-party RPC).

Does rly establish a connection that requires an inbound port, or websockets? It's behind NAT at the router and NAT at the hypervisor (LXC/LXD).

PS: I can establish a websocket connection to a 3rd party just fine using websocat.

Would you mind giving me the exact query it is trying to do so that I can try it manually?

jtieri self-assigned this Apr 23, 2024
jtieri (Member) commented Apr 23, 2024

The log you shared has chain_name as celestia, so it would seem the Celestia RPC is the problematic one here. When you start the relayer, are you using the debug flag -d? Mostly asking to see if there are details related to the error that are going unseen. I do remember an issue someone reported where the relayer was unable to sync with Celestia and it turned out to be some configuration on the node, see #1383.
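(A minimal sketch of starting the relayer with debug output via the -d flag mentioned above; the path name is a hypothetical placeholder from your local config.)

```sh
# Start the relayer on a specific path with debug output enabled.
# "celestia-osmosis" is a hypothetical path name; substitute your own.
rly start celestia-osmosis -d
```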

The relayer does not use websockets; it just makes RPC calls to the configured node.

If I'm not mistaken, the logs you are seeing are related to the block_results RPC endpoint.
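(To try the query by hand: block_results is a standard CometBFT RPC endpoint, so a sketch like the following should work, assuming your node's RPC listens on localhost:26657 and using the Celestia height from the log above as a placeholder.)

```sh
# Manually query the block_results endpoint the relayer depends on.
# localhost:26657 and the height are placeholders; point this at your own node.
curl -s "http://localhost:26657/block_results?height=893372" | head -c 500
```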

gaia (Author) commented Apr 24, 2024

Thanks, I will look into it again.

PS: port forwarding IS in use. There is NAT at the router to the public IP and also at the LAN IP of the host (since the relayer runs in an LXC container).

jtieri (Member) commented Apr 30, 2024

Let me know what you turn up!

I'm thinking this is possibly related to some silent error on the Celestia node that is only being logged at the debug level, which could stem from some configuration value specific to Celestia. From what you described, I don't think there is necessarily anything wrong with your relayer/node setup. Perhaps @agouin can take a peek at this and confirm that the system configuration you are using is fine?

gaia (Author) commented Apr 30, 2024

I will run rly again in the near future and report back; for now I am running Hermes.
You can, however, use our RPC node for testing. I can send you some TIA.

jtieri (Member) commented May 6, 2024

> I will run rly again in the near future and report back; for now I am running Hermes. You can, however, use our RPC node for testing. I can send you some TIA.

Appreciate it! Yeah, if you want to share your node I would be happy to try debugging this a bit when I have some extra cycles.

gaia (Author) commented May 7, 2024

> > I will run rly again in the near future and report back; for now I am running Hermes. You can, however, use our RPC node for testing. I can send you some TIA.
>
> Appreciate it! Yeah, if you want to share your node I would be happy to try debugging this a bit when I have some extra cycles.

Happy to share. Send me a DM on Twitter (@wholesum); you are @Ethereal0ne, right?

jtieri (Member) commented May 21, 2024

The team did some debugging with your Celestia node on the Celestia<>Osmosis path, and it turns out the node is currently configured to discard ABCI responses, which the relayer needs in order to work properly.

The same issue was described in #1383; the solution is to go into the node's config and set discard_abci_responses = false. After that, rly should have no problem connecting to the node and successfully relaying IBC packets.
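(For reference, a sketch of where the setting lives: in recent CometBFT releases it is under the [storage] section of the node's config.toml. The home directory below is a hypothetical path; adjust it to your node.)

```sh
# Verify the setting in the node's CometBFT config.toml.
# ~/.celestia-app is a hypothetical node home; adjust to your setup.
grep "discard_abci_responses" ~/.celestia-app/config/config.toml
# Should print: discard_abci_responses = false
# Restart the node after changing the value.
```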

jtieri closed this as completed May 21, 2024