question: How to optimize the transmission efficiency of pubsub? #527

Open
JacksonRGB opened this issue Mar 14, 2023 · 3 comments

@JacksonRGB
JacksonRGB commented Mar 14, 2023

I am currently using pubsub with a gRPC proxy. The maximum throughput I measured within a single AWS availability zone is 800 TPS. When continuously sending 20 KB messages, the transmission bandwidth is about 16 MB/s; with 40 KB messages, it is about 32 MB/s. CPU usage always stays below 200% on a 4-core CPU.

The configuration I am using is:

pubsub.WithMessageSignaturePolicy(pubsub.StrictNoSign)
pubsub.WithNoAuthor()
pubsub.WithMessageIdFn(msgID)

I found the cause: the custom msgID function was taking too much time.

This is my function:

func msgID(pmsg *pubsubpb.Message) string {
	h := sha256.Sum256(pmsg.Data)
	return fmt.Sprintf("%x", h[:20])
}

If I use the following random-ID function instead, the test result is 72 MB/s (3600 TPS):

func msgID(pmsg *pubsubpb.Message) string {
	return fmt.Sprintf("%x", rand.Int63())
}

I have tried several hash functions, including xxHash. In a benchmark, xxHash was about 30 times faster than SHA-256. In the actual test, however, throughput remained at 800 TPS.

BenchmarkMsgID/msgIDSha256-8            22498             53323 ns/op
BenchmarkMsgID/msgIDRandom-8          9200742               128.6 ns/op
BenchmarkMsgID/msgIDxxHASH-8           727844              1650 ns/op
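
A comparison like the one above can be approximated with a stand-alone harness. This is only a sketch under stated assumptions: the functions take a raw `[]byte` instead of `*pubsubpb.Message`, the 20 KB zero-filled payload and all function names are hypothetical, and xxHash is left out because it is a third-party package. It also includes a variant that hex-encodes with `encoding/hex` instead of `fmt.Sprintf`, which yields the same ID string.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"math/rand"
	"testing"
)

// idSha256Sprintf mirrors the SHA-256 function from the issue:
// hash the payload, then hex-format the first 20 bytes with fmt.Sprintf.
func idSha256Sprintf(data []byte) string {
	h := sha256.Sum256(data)
	return fmt.Sprintf("%x", h[:20])
}

// idSha256Hex builds the same string with encoding/hex.
func idSha256Hex(data []byte) string {
	h := sha256.Sum256(data)
	return hex.EncodeToString(h[:20])
}

// idRandom mirrors the random-ID variant; it never reads the payload.
func idRandom(_ []byte) string {
	return fmt.Sprintf("%x", rand.Int63())
}

func main() {
	// Assumption: a 20 KB payload, matching the message size in the issue.
	payload := make([]byte, 20*1024)

	cases := []struct {
		name string
		fn   func([]byte) string
	}{
		{"sha256+Sprintf", idSha256Sprintf},
		{"sha256+hex", idSha256Hex},
		{"random", idRandom},
	}
	for _, c := range cases {
		fn := c.fn
		res := testing.Benchmark(func(b *testing.B) {
			b.SetBytes(int64(len(payload)))
			for i := 0; i < b.N; i++ {
				_ = fn(payload)
			}
		})
		fmt.Printf("%-16s %s\n", c.name, res)
	}
}
```

A micro-benchmark like this measures only the ID function in isolation, so it cannot by itself explain why end-to-end throughput stays at 800 TPS.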

Update:

As soon as the msgID function reads the pubsub.Message, it becomes very slow (800 TPS), even without hashing:

func msgIDxxHash(pmsg *pubsubpb.Message) string {
	// h := xxhash.Sum64(pmsg.Data[:100])
	return fmt.Sprintf("%x", pmsg.Data[:100])
}
@lthibault
Contributor

@minchenzz Can you post pprof/trace data?

My immediate question for you is whether you are using Ed25519 for signature verification. In my experience, that's a quick and easy win for performance.

@JacksonRGB
Author

@lthibault Thanks for your help! I'm not using Ed25519. I only want to deduplicate messages by using msgID.


sha256ID

pprof/trace: trace-sha256.png
pprof/profile: profile_sha256.png

xxhashID

pprof/trace: xxhash-trace.png
pprof/profile: profile_xxhash.png

randomID

pprof/trace: trace-random.png
pprof/profile: profile_random.png

@lthibault
Contributor

I don't see anything obvious there. You might also consider running a CPU and memory allocation profile to see if there's anything there.
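
One way to capture the suggested CPU and allocation profiles is with the standard `runtime/pprof` package. The sketch below is an illustration only: `captureProfiles`, the output file names, and the stand-in workload (hashing a 20 KB buffer in a loop) are all hypothetical, and in the real test the `work` function would be the publish loop.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"os"
	"runtime"
	"runtime/pprof"
)

// captureProfiles records a CPU profile while work runs, then writes an
// allocation profile. Both paths are chosen by the caller.
func captureProfiles(cpuPath, allocPath string, work func()) error {
	cpuFile, err := os.Create(cpuPath)
	if err != nil {
		return err
	}
	defer cpuFile.Close()

	if err := pprof.StartCPUProfile(cpuFile); err != nil {
		return err
	}
	work()
	pprof.StopCPUProfile()

	allocFile, err := os.Create(allocPath)
	if err != nil {
		return err
	}
	defer allocFile.Close()

	// Run a GC first so recent allocations are reflected in the profile.
	runtime.GC()
	return pprof.Lookup("allocs").WriteTo(allocFile, 0)
}

func main() {
	// Stand-in workload (assumption): replace with the real publish loop.
	payload := make([]byte, 20*1024)
	work := func() {
		for i := 0; i < 20000; i++ {
			h := sha256.Sum256(payload)
			_ = fmt.Sprintf("%x", h[:20])
		}
	}
	if err := captureProfiles("cpu.pprof", "allocs.pprof", work); err != nil {
		fmt.Fprintln(os.Stderr, "profiling failed:", err)
		os.Exit(1)
	}
	fmt.Println("wrote cpu.pprof and allocs.pprof")
}
```

The resulting files can be inspected with `go tool pprof cpu.pprof` and `go tool pprof allocs.pprof`.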
