Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

High Latency when doing batch calls #442

Open
harshsingh-cs opened this issue May 7, 2024 · 8 comments
Open

High Latency when doing batch calls #442

harshsingh-cs opened this issue May 7, 2024 · 8 comments

Comments

@harshsingh-cs
Copy link

harshsingh-cs commented May 7, 2024

I am currently using ethereum-etl tool to do some backfilling but I have observed some high latency when doing so. Mainly all the requests are batch requests.
There is no visible bottleneck on the Disk, CPU or RAM but still the response time from the node is significantly high
image

Here is my startup command

       - ronin
        - --config=/root/geth-config/config.toml
        - --rpc.evmtimeout=5s
        - --cache=9830
        - --networkid=2020
        - --port=30303
        - --syncmode=full
        - --maxpeers=100
        - --txlookuplimit=2350000
        - --txpool.pricebump=10
        - --txpool.pricelimit=20000000000
        - --txpool.nolocals
        - --http
        - --http.addr=0.0.0.0
        - --http.port=8545
        - --http.vhosts=*
        - --http.corsdomain=null
        - --http.api=net,eth,web3
        - --ws
        - --ws.addr=0.0.0.0
        - --ws.port=8546
        - --ws.api=net,eth,web3
        - --ws.origins=*
        - --rpc.txfeecap=100
        - --rpc.gascap=500000000
        - --metrics
        - --metrics.addr=0.0.0.0
        - --metrics.port=6060
        - --bootnodes=..
        image: ghcr.io/axieinfinity/ronin:v2.7.1-c64ddd3

config.toml

[Eth]
TrieTimeout = 3600000000000

[Node.HTTPTimeouts]
ReadTimeout = 30000000000
WriteTimeout = 30000000000
IdleTimeout = 120000000000

Can you please suggest what is causing this and what can I improve

@minh-bq
Copy link
Contributor

minh-bq commented May 7, 2024

What is the command that you use to run export_blocks_and_transactions.py so that we can try to reproduce on our side?

@harshsingh-cs
Copy link
Author

@minh-bq you can use this

ethereumetl export_blocks_and_transactions --start-block 34010000 --end-block 34090000 --provider-uri <node url> --blocks-output blocks-ronin.json --transactions-output txns-ronin.csv

@minh-bq
Copy link
Contributor

minh-bq commented May 7, 2024

I've tried in our node and it worked just fine, the latency was quite low
image

@harshsingh-cs
Copy link
Author

@minh-bq can you share the output of ethereum-etl specifically how much time it took to complete the blocks
For example

root@ethereum-etl# ethereumetl export_blocks_and_transactions --start-block 34010000 --end-block 34090000 --provider-uri https://<node url> --blocks-output ..//blocks-ronin.json --transactions-output ../txns-ronin.csv
2024-05-13 11:16:54,476 - ProgressLogger [INFO] - Started work. Items to process: 80001.
2024-05-13 11:17:28,834 - ProgressLogger [INFO] - 8100 items processed. Progress is 10%.
2024-05-13 11:17:53,816 - ProgressLogger [INFO] - 16100 items processed. Progress is 20%.
2024-05-13 11:19:04,945 - ProgressLogger [INFO] - 24100 items processed. Progress is 30%.
2024-05-13 11:20:16,429 - ProgressLogger [INFO] - 32100 items processed. Progress is 40%.
2024-05-13 11:20:48,081 - ProgressLogger [INFO] - 40100 items processed. Progress is 50%.
2024-05-13 11:21:10,837 - ProgressLogger [INFO] - 48100 items processed. Progress is 60%.
2024-05-13 11:22:42,877 - ProgressLogger [INFO] - 56100 items processed. Progress is 70%.
2024-05-13 11:23:20,189 - ProgressLogger [INFO] - 64100 items processed. Progress is 80%.
2024-05-13 11:23:44,012 - ProgressLogger [INFO] - 72100 items processed. Progress is 90%.
2024-05-13 11:24:31,089 - ProgressLogger [INFO] - 80001 items processed. Progress is 100%.
2024-05-13 11:24:31,090 - ProgressLogger [INFO] - Finished work. Total items processed: 80001. Took 0:07:36.613566.
2024-05-13 11:24:31,090 - CompositeItemExporter [INFO] - block items exported: 80001
2024-05-13 11:24:31,090 - CompositeItemExporter [INFO] - transaction items exported: 4862373

You can see in the above output that it prints the total time taken
Finished work. Total items processed: 80001. Took 0:07:36.613566.

@minh-bq
Copy link
Contributor

minh-bq commented May 15, 2024

Okay, I understand the problem now. I see very high CPU usage (> 80%) on the node that I connect when running the above command. This is a thing we can look into for improvement but I think it is not unusual. With the above command, ethereumetl will run 5 workers and batch 100 rpc requests per HTTP request. The command calls eth_getBlockByNumber and includes full transaction information, this consumes a lot of CPU to recover the sender of transactions (at the block range, I see there are a lot of transactions).

I think you try to increase the number of CPUs on the node, use -b to configure the batch size, -w to configure the number of workers when running ethereumetl. More workers means more parallel HTTP requests and AFAIK HTTP requests are handled parallel (however, you need to make sure that the node has enough number of CPUs). The RPC requests in the same batch HTTP request are executed sequentially.

@harshsingh-cs
Copy link
Author

@minh-bq we also see the increase in CPU but not so much so that it would cause the reqs to slow down. Can you share the total time it took to process all the blocks

@minh-bq
Copy link
Contributor

minh-bq commented May 21, 2024

2024-05-21 07:02:25,756 - ProgressLogger [INFO] - Started work. Items to process: 80001.
2024-05-21 07:03:10,625 - ProgressLogger [INFO] - 8100 items processed. Progress is 10%.
2024-05-21 07:03:41,731 - ProgressLogger [INFO] - 16100 items processed. Progress is 20%.
2024-05-21 07:05:20,431 - ProgressLogger [INFO] - 24100 items processed. Progress is 30%.
2024-05-21 07:06:36,485 - ProgressLogger [INFO] - 32100 items processed. Progress is 40%.
2024-05-21 07:07:11,415 - ProgressLogger [INFO] - 40100 items processed. Progress is 50%.
2024-05-21 07:07:35,074 - ProgressLogger [INFO] - 48100 items processed. Progress is 60%.
2024-05-21 07:09:48,521 - ProgressLogger [INFO] - 56100 items processed. Progress is 70%.
2024-05-21 07:10:34,355 - ProgressLogger [INFO] - 64100 items processed. Progress is 80%.
2024-05-21 07:11:02,096 - ProgressLogger [INFO] - 72100 items processed. Progress is 90%.
2024-05-21 07:12:00,831 - ProgressLogger [INFO] - 80001 items processed. Progress is 100%.
2024-05-21 07:12:00,838 - ProgressLogger [INFO] - Finished work. Total items processed: 80001. Took 0:09:35.082243.
2024-05-21 07:12:00,839 - CompositeItemExporter [INFO] - block items exported: 80001
2024-05-21 07:12:01,180 - CompositeItemExporter [INFO] - transaction items exported: 4862373

I get the same amount of time. I've just tried on a more CPUs machine and seems like CPU is underutilized.

@minh-bq
Copy link
Contributor

minh-bq commented May 21, 2024

diff --git a/internal/ethapi/api.go b/internal/ethapi/api.go
index 6738b627a..79d93e955 100644
--- a/internal/ethapi/api.go
+++ b/internal/ethapi/api.go
@@ -21,6 +21,8 @@ import (
 	"errors"
 	"fmt"
 	"math/big"
+	"runtime"
+	"sync"
 	"time"
 
 	"github.com/davecgh/go-spew/spew"
@@ -52,9 +54,16 @@ import (
 
 var ErrMethodNotSupport = errors.New("method is not supported")
 
-// estimateGasErrorRatio is the amount of overestimation eth_estimateGas is
-// allowed to produce in order to speed up calculations.
-const estimateGasErrorRatio = 0.015
+const (
+	// estimateGasErrorRatio is the amount of overestimation eth_estimateGas is
+	// allowed to produce in order to speed up calculations.
+	estimateGasErrorRatio = 0.015
+
+	// minBlockTransactionThreshold is the minimum number of transactions in block
+	// that we start more go routines to marshal transaction in GetBlockByNumber
+	// when full transaction is requested
+	minBlockTransactionThreshold = 20
+)
 
 // PublicEthereumAPI provides an API to access Ethereum related information.
 // It offers only methods that operate on public data that is freely available to anyone.
@@ -1257,10 +1266,31 @@ func RPCMarshalBlock(block *types.Block, inclTx bool, fullTx bool, config *param
 		}
 		txs := block.Transactions()
 		transactions := make([]interface{}, len(txs))
-		var err error
-		for i, tx := range txs {
-			if transactions[i], err = formatTx(tx); err != nil {
-				return nil, err
+
+		if fullTx && len(txs) > minBlockTransactionThreshold {
+			numRoutines := runtime.NumCPU()
+			numTxPerRoutine := (len(txs) + numRoutines - 1) / numRoutines
+
+			var wg sync.WaitGroup
+			wg.Add(numRoutines)
+			for i := 0; i < numRoutines; i++ {
+				go func(start int) {
+					defer wg.Done()
+
+					for j := start; j < start+numTxPerRoutine && j < len(txs); j++ {
+						transactions[j], _ = formatTx(txs[j])
+					}
+
+				}(i * numTxPerRoutine)
+			}
+
+			wg.Wait()
+		} else {
+			var err error
+			for i, tx := range txs {
+				if transactions[i], err = formatTx(tx); err != nil {
+					return nil, err
+				}
 			}
 		}
 		fields["transactions"] = transactions

Can you try this patch to see if there is improvement (it is not well tested though, so please cross check the data if you want to use it)? In my opinion, ethereumetl might do better to improve the overall time. During testing, I cannot make the RPS exceed 300. From our chart, I saw that RPS is not stable and there are drops in the middle of exporting. This leads to the node's CPU is not fully utilized.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants