The metric `sync_single_block_lookups` on our nodes shows 100k to 150k active lookups. The metric is properly implemented, so this is a leak. Each lookup is quite small, some hundreds of bytes, so the leak is slow and small overall.
A possible explanation is:
- A new lookup is created for block A
- Block A is already in the `da_checker`
- The lookup skips sending a block request because the block is already in the `da_checker`
- No event for the lookup is ever received, so it is never removed
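The sequence above can be sketched in a few lines of Rust. This is a minimal model of the suspected leak, not the actual sync code: `SyncSketch`, `Lookup`, `create_lookup`, `on_event` and `simulate_leak` are hypothetical names, block roots are shortened to `u64`, and the `da_checker` is reduced to a set of roots.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical stand-ins: real block roots are 32-byte hashes.
type BlockRoot = u64;
type LookupId = u32;

/// Simplified lookup state (assumption, not the real struct).
struct Lookup {
    block_root: BlockRoot,
    /// False when the request was skipped because the block was already cached.
    request_sent: bool,
}

#[derive(Default)]
struct SyncSketch {
    lookups: HashMap<LookupId, Lookup>,
    /// Reduced model of the da_checker: the set of roots it already holds.
    da_checker: HashSet<BlockRoot>,
    next_id: LookupId,
}

impl SyncSketch {
    /// Creates a lookup and skips the block request if the da_checker has the block.
    fn create_lookup(&mut self, block_root: BlockRoot) -> LookupId {
        let id = self.next_id;
        self.next_id += 1;
        let request_sent = !self.da_checker.contains(&block_root);
        self.lookups.insert(id, Lookup { block_root, request_sent });
        id
    }

    /// In this model a lookup is removed only when an event arrives for it.
    fn on_event(&mut self, id: LookupId) {
        self.lookups.remove(&id);
    }
}

/// Returns the number of lookups left over and the roots they are stuck on.
fn simulate_leak() -> (usize, Vec<BlockRoot>) {
    let mut sync = SyncSketch::default();
    sync.da_checker.insert(0xA);
    sync.create_lookup(0xA); // block A is cached, so no request is sent
    sync.create_lookup(0xB); // block B is unknown, so a request is sent

    // Only lookups with a request in flight can ever receive an event.
    let answered: Vec<LookupId> = sync
        .lookups
        .iter()
        .filter(|(_, l)| l.request_sent)
        .map(|(id, _)| *id)
        .collect();
    for id in answered {
        sync.on_event(id);
    }

    let stuck = sync.lookups.values().map(|l| l.block_root).collect();
    (sync.lookups.len(), stuck)
}
```

Calling `simulate_leak()` leaves one lookup in the map, the one for block A, which is the state the metric would keep counting.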
It seems the leak is happening at least partly due to #5680 (comment)
`RPCError::Disconnect` not propagating up to sync could lead to `awaiting_parent.is_some()` lookups never getting resolved, which means that they never get removed from the lookups map.
I did some testing with propagating the disconnects to sync. Doing this seems to result in lookups getting removed and the `sync_single_block_lookups` metric going back to 0 once the node is synced.
Not propagating the disconnects (as currently happens in cut-5.2.0) consistently increases the size of the lookups map in local testing.
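A rough sketch of why propagation matters, under loud assumptions: `Lookups`, `Lookup`, `on_peer_disconnected` and `simulate` are hypothetical names, each lookup is tied to a single peer, and a failed lookup is simply dropped together with every lookup awaiting it. The real sync code may retry with other peers instead; this only models the removal path.

```rust
use std::collections::HashMap;

// Hypothetical stand-ins for the real identifier types.
type BlockRoot = u64;
type PeerId = u32;

/// Simplified lookup (assumption): one peer, optional parent dependency.
struct Lookup {
    awaiting_parent: Option<BlockRoot>,
    peer: PeerId,
}

#[derive(Default)]
struct Lookups {
    map: HashMap<BlockRoot, Lookup>,
}

impl Lookups {
    /// Runs only if the disconnect is propagated up to sync.
    fn on_peer_disconnected(&mut self, peer: PeerId) {
        // Lookups with a request in flight to that peer fail.
        let mut failed: Vec<BlockRoot> = self
            .map
            .iter()
            .filter(|(_, l)| l.peer == peer && l.awaiting_parent.is_none())
            .map(|(root, _)| *root)
            .collect();

        // Drop each failed lookup and, transitively, the lookups awaiting it.
        while let Some(root) = failed.pop() {
            self.map.remove(&root);
            let children: Vec<BlockRoot> = self
                .map
                .iter()
                .filter(|(_, l)| l.awaiting_parent == Some(root))
                .map(|(child, _)| *child)
                .collect();
            failed.extend(children);
        }
    }
}

/// Returns how many lookups remain after peer 7 disconnects.
fn simulate(propagate_disconnect: bool) -> usize {
    let mut lookups = Lookups::default();
    // Chain 1 <- 2 <- 3: only root 1 has a request in flight.
    lookups.map.insert(1, Lookup { awaiting_parent: None, peer: 7 });
    lookups.map.insert(2, Lookup { awaiting_parent: Some(1), peer: 7 });
    lookups.map.insert(3, Lookup { awaiting_parent: Some(2), peer: 7 });

    if propagate_disconnect {
        lookups.on_peer_disconnected(7);
    }
    // Without propagation nothing ever resolves root 1, so the chain stays.
    lookups.map.len()
}
```

In this model `simulate(false)` leaves all three lookups in the map, while `simulate(true)` empties it, which matches the behaviour observed in local testing.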
Version
stable
Steps to resolve
Fixed with