Collect the gRPC Poll CPU utilization #563

Open
JmPotato opened this issue Mar 12, 2022 · 3 comments

@JmPotato
Member

Is your feature request related to a problem? Please describe.

As mentioned in tikv/tikv#12139, gRPC poll accounts for a major share of the CPU consumption on the hot path, so collecting its CPU utilization would be very useful.

Describe the solution you'd like

Each CompletionQueue has its own thread, and each gRPC request creates and resolves all of its events inside a single CompletionQueue to reduce context switches. In each of these threads, poll_queue accounts for essentially all of the CPU consumption.

grpc-rs/src/env.rs

Lines 13 to 35 in ccd0fde

// event loop
fn poll_queue(tx: mpsc::Sender<CompletionQueue>) {
    let cq = Arc::new(CompletionQueueHandle::new());
    let worker_info = Arc::new(WorkQueue::new());
    let cq = CompletionQueue::new(cq, worker_info);
    tx.send(cq.clone()).expect("send back completion queue");
    loop {
        let e = cq.next();
        match e.type_ {
            EventType::GRPC_QUEUE_SHUTDOWN => break,
            // timeout should not happen in theory.
            EventType::GRPC_QUEUE_TIMEOUT => continue,
            EventType::GRPC_OP_COMPLETE => {}
        }
        let tag: Box<CallTag> = unsafe { Box::from_raw(e.tag as _) };
        tag.resolve(&cq, e.success != 0);
        while let Some(work) = unsafe { cq.worker.pop_work() } {
            work.finish();
        }
    }
}

So basically, we can collect each thread's CPU time and map the usage to individual gRPC requests fairly easily. However, since this crate is meant to be a generic library, a feature like this should be well-defined and easy for crate users to use and extend, so the proper way to implement it should be discussed first.
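To make the discussion concrete, here is a minimal sketch of what "collect the thread CPU time and map it to a request" could look like. Everything in it is an assumption for illustration and not grpc-rs API: the names `thread_cpu_time`, `CpuObserver`, `PerLabelTotals`, and `run_measured` are hypothetical, and so is the idea that a label (for example the method name) is available for each tag. The clock reads `/proc/thread-self/schedstat`, which only works on Linux kernels that expose scheduler statistics; a real implementation would more likely call `clock_gettime` with `CLOCK_THREAD_CPUTIME_ID`.

```rust
use std::collections::HashMap;
use std::time::Duration;

/// Cumulative CPU time of the calling thread, or `None` if unavailable.
/// Assumption: Linux only. The first field of this file is the time the
/// thread has spent on the CPU, in nanoseconds.
fn thread_cpu_time() -> Option<Duration> {
    let stat = std::fs::read_to_string("/proc/thread-self/schedstat").ok()?;
    let nanos: u64 = stat.split_whitespace().next()?.parse().ok()?;
    Some(Duration::from_nanos(nanos))
}

/// Hypothetical hook that a crate user would implement to receive
/// measurements; this is the extension point, not an existing trait.
trait CpuObserver {
    fn observe(&mut self, label: &str, cpu: Duration);
}

/// Example observer that sums the CPU time per label.
#[derive(Default)]
struct PerLabelTotals {
    totals: HashMap<String, Duration>,
}

impl CpuObserver for PerLabelTotals {
    fn observe(&mut self, label: &str, cpu: Duration) {
        *self.totals.entry(label.to_owned()).or_default() += cpu;
    }
}

/// Runs `work` and reports the thread CPU time it consumed under `label`.
/// The clock is a parameter so that it can be replaced (or faked in tests).
/// If the clock is unavailable, the work still runs and nothing is reported.
fn run_measured<O, C, F, R>(observer: &mut O, label: &str, mut clock: C, work: F) -> R
where
    O: CpuObserver,
    C: FnMut() -> Option<Duration>,
    F: FnOnce() -> R,
{
    let before = clock();
    let out = work();
    if let (Some(before), Some(after)) = (before, clock()) {
        observer.observe(label, after.saturating_sub(before));
    }
    out
}
```

In `poll_queue`, the closure passed to `run_measured` would be the `tag.resolve(&cq, e.success != 0)` call plus the `pop_work` loop, with `thread_cpu_time` as the clock. Two clock reads per event are not free, which is part of the performance question.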

@BusyJay
Member

BusyJay commented Mar 14, 2022

Interesting idea! How to map the usage to requests? And what's the performance impact?

@JmPotato
Member Author

> Interesting idea! How to map the usage to requests? And what's the performance impact?

We can start an independent background thread that samples at a fixed frequency. Each time a thread's CPU time is sampled, we attribute it to the gRPC handler method name or gRPC context that is currently active in the corresponding thread. After tag.resolve() has finished, we can retrieve the corresponding information and save it.

This idea is pretty similar to pprof or resource_metering in TiKV.
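A rough sketch of that sampling scheme, under stated assumptions: `ThreadSlot`, `Sampler`, and `run_sampler` are hypothetical names, the labels are whatever string identifies the request, and the way the poll thread's cumulative CPU time is read is left as a caller-supplied closure because it is platform specific. Each sample attributes the whole CPU delta since the previous sample to the label that is active at sampling time, which is the usual approximation made by sampling profilers.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::time::Duration;

/// State shared between one poll thread and the sampler (hypothetical type).
#[derive(Default)]
struct ThreadSlot {
    /// Label of the request currently being resolved, if any.
    current: Mutex<Option<String>>,
}

impl ThreadSlot {
    /// Called by the poll thread right before `tag.resolve()`.
    fn enter(&self, label: &str) {
        *self.current.lock().unwrap() = Some(label.to_owned());
    }

    /// Called by the poll thread right after `tag.resolve()` returns.
    fn leave(&self) {
        *self.current.lock().unwrap() = None;
    }
}

/// Samples one poll thread and accumulates CPU time per label.
struct Sampler<C> {
    slot: Arc<ThreadSlot>,
    /// Returns the poll thread's cumulative CPU time (supplied by the caller).
    read_cpu: C,
    last: Duration,
    totals: HashMap<String, Duration>,
}

impl<C: FnMut() -> Duration> Sampler<C> {
    fn new(slot: Arc<ThreadSlot>, mut read_cpu: C) -> Self {
        let last = read_cpu();
        Sampler { slot, read_cpu, last, totals: HashMap::new() }
    }

    /// Takes one sample. CPU time consumed while no label is set is dropped.
    fn sample_once(&mut self) {
        let now = (self.read_cpu)();
        let delta = now.saturating_sub(self.last);
        self.last = now;
        let label = self.slot.current.lock().unwrap().clone();
        if let Some(label) = label {
            *self.totals.entry(label).or_default() += delta;
        }
    }
}

/// Samples `rounds` times at a fixed `interval` and returns the totals.
/// A real sampler would run on its own thread until shutdown instead.
fn run_sampler<C: FnMut() -> Duration>(
    mut sampler: Sampler<C>,
    interval: Duration,
    rounds: usize,
) -> HashMap<String, Duration> {
    for _ in 0..rounds {
        std::thread::sleep(interval);
        sampler.sample_once();
    }
    sampler.totals
}
```

The only work added to the poll thread is the `enter`/`leave` pair around `tag.resolve()`; the clock reads and the aggregation happen on the sampler side. The mutex could probably be replaced by something cheaper, but that is exactly the kind of overhead that would need to be measured.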

As for performance, making the above steps as asynchronous as possible should minimize the impact, but some overhead is still expected in theory, and the exact cost can only be known by implementing and measuring it.

@BusyJay
Member

BusyJay commented Mar 16, 2022

Can the collection be implemented as a standalone crate, so it can be used directly in libraries instead of reinventing the wheel? Does gRPC census help in this case?
