
Profile memory usage requirements of RPC and archival nodes in stateless validation #11230

Open
Tracked by #11157
tayfunelmas opened this issue May 3, 2024 · 2 comments


tayfunelmas commented May 3, 2024

Assuming that we will launch stateless validation with RPC nodes tracking all shards using memtrie, analyze the memory requirements of RPC nodes in this setup.

As a rule of thumb, after stateless validation is launched, any node that tracks all shards and needs to keep up with the network will have to enable memtries: the network will start operating faster because validators use memtries and track fewer shards (that is the premise of stateless validation), so a node that tracks all shards without memtries will fall behind.
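
As a rough illustration (not an official procedure), here is a minimal sketch of flipping the memtrie toggle in a node's config, assuming it is exposed in `config.json` as `store.load_mem_tries_for_tracked_shards`; the exact key should be checked against the nearcore release being profiled:

```python
# Hypothetical helper: enable the assumed memtrie flag in an RPC node's config.json.
# The key name `load_mem_tries_for_tracked_shards` is an assumption; verify it against
# the StoreConfig of the nearcore release being profiled.
import json
from pathlib import Path

CONFIG_PATH = Path.home() / ".near" / "config.json"  # assumed default node home

def enable_memtries(config_path: Path = CONFIG_PATH) -> None:
    config = json.loads(config_path.read_text())
    store = config.setdefault("store", {})
    store["load_mem_tries_for_tracked_shards"] = True  # assumed flag name
    config_path.write_text(json.dumps(config, indent=2))

if __name__ == "__main__":
    enable_memtries()
```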

Thus, we need to understand how much memory RPC and archival nodes will need when they start using memtries while tracking all shards. This could be done by spinning up an RPC node with a fresh neard and tracking the delta between running with and without memtries (see the measurement sketch after the list):

  1. RAM usage (should increase, but by how much?)
  2. Disk usage (should not change)
  3. Something indicating chunk processing, such as apply-chunk latency (by how much does it change?)
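
A minimal sketch of how the with/without-memtrie delta could be sampled, assuming the node runs locally as a `neard` process and keeps its data under `~/.near/data` (process name, paths, and interval are assumptions, not official tooling; apply-chunk latency would come from the node's Prometheus `/metrics` endpoint, whose metric names vary by release and are left out here):

```python
# Periodically sample the RSS of a running neard process and the size of its data
# directory, so runs with and without memtries can be compared side by side.
import time
from pathlib import Path

import psutil  # third-party: pip install psutil

DATA_DIR = Path.home() / ".near" / "data"   # assumed data directory
PROCESS_NAME = "neard"                      # assumed process name
INTERVAL_SECONDS = 60

def neard_rss_bytes() -> int:
    """Return the resident set size of the first neard process found, or 0."""
    for proc in psutil.process_iter(["name"]):
        if proc.info["name"] == PROCESS_NAME:
            return proc.memory_info().rss
    return 0

def dir_size_bytes(path: Path) -> int:
    """Total on-disk size of all files under `path`."""
    return sum(f.stat().st_size for f in path.rglob("*") if f.is_file())

if __name__ == "__main__":
    while True:
        rss_gb = neard_rss_bytes() / 1e9
        disk_gb = dir_size_bytes(DATA_DIR) / 1e9
        print(f"{time.strftime('%Y-%m-%d %H:%M:%S')} rss={rss_gb:.2f} GB disk={disk_gb:.2f} GB")
        time.sleep(INTERVAL_SECONDS)
```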
@tayfunelmas tayfunelmas added the A-stateless-validation Area: stateless validation label May 3, 2024
@telezhnaya telezhnaya self-assigned this May 9, 2024
@tayfunelmas tayfunelmas changed the title Profile memory usage requirements of RPC nodes in stateless validation Profile memory usage requirements of RPC and archival nodes in stateless validation May 9, 2024

telezhnaya commented May 13, 2024

@tayfunelmas

I tried to design a way to estimate the requirements, but there are too many unknowns.
Let's have a look at the existing requirements:
https://near-nodes.io/rpc/hardware-rpc
Are they for mainnet? For testnet? What about localnet? What if my localnet is 10 times more congested than mainnet?
What if I want to make millions of queries per second? Will that change the requirements?

The instructions say that the recommended configuration is 8 cores and 20 GB RAM, and the minimum is 8 cores and 12 GB RAM.
TBH, I find these numbers useless.
For a tiny localnet, 1 core and 4 GB RAM would be more than enough.

Let's assume it's for mainnet, and let's look at the use cases.
In reality, Pagoda runs each regular mainnet node on 32 vCPUs and 128 GB of memory.
I know that some of our partners run even beefier machines.
Should we update the recommended configuration with these numbers? Why don't we use the recommended configuration?

I can suggest two options:

  1. Redesign this doc from scratch, suggesting different configurations for different use cases
  2. Leave everything as it is and bump the numbers if users complain

I personally vote for the second option, because otherwise we would have to update the doc every time there is yet another network congestion.

If we decide to redesign the doc from scratch, we need to start by defining the audience for it. It's currently unclear to me who we're trying to help.

@walnut-the-cat

cc. @tayfunelmas
