The question is how much CPU is then burned by the additional GC this causes. In the past, we were certainly very paranoid about any kind of allocation. With a better GC, that paranoia might be unfounded by now. But I guess we need a real-world end-to-end benchmark rather than a micro-benchmark…
Thanks for the reminder. We will set up a real-world E2E test to see whether it helps and, if so, by how much.
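For what it's worth, Go already exposes a number that is handy in such an end-to-end test: `runtime.MemStats.GCCPUFraction`, the fraction of the program's available CPU time consumed by the GC since startup. A minimal sketch (the allocation loop is just a stand-in for the real workload):

```go
package main

import (
	"fmt"
	"runtime"
)

func main() {
	// Stand-in workload: allocate a lot so the GC has something to do.
	sink := make([]*[128]byte, 0, 1<<20)
	for i := 0; i < 1<<20; i++ {
		sink = append(sink, new([128]byte))
	}
	_ = sink

	var stats runtime.MemStats
	runtime.ReadMemStats(&stats)
	// Fraction of this program's available CPU time used by the GC
	// since the program started.
	fmt.Printf("GC CPU fraction: %.4f\n", stats.GCCPUFraction)
}
```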
During benchmarking, we found that the underlying summary implementation, i.e. bquant, wastes much CPU in `runtime.memmove`. The benchmark here, https://github.com/beorn7/perks/blob/master/quantile/bench_test.go#L7-L23, cannot show the real performance, since the recorded elements are monotonically increasing, so the lines containing `copy` are never reached.

After switching to `container/list`, we get better performance with a series of random float64 values inserted, which is much closer to the real scenario. Though a doubly-linked list introduces object allocations, it delivers double the QPS under the micro-benchmark (sketches of both approaches follow the results below):
[Benchmark results: Linked List vs. Slice]
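For readers following along, here is a minimal sketch (not the actual patch) of the two insertion strategies being compared. The slice variant pays for `copy` (which shows up as `runtime.memmove` in profiles) on every mid-slice insert; the list variant never moves elements but allocates a node per insert, which is the GC concern raised above:

```go
package main

import (
	"container/list"
	"fmt"
	"sort"
)

// insertSlice keeps xs sorted. Inserting anywhere but the end shifts the
// tail with copy, which is where the runtime.memmove time goes when the
// inputs arrive in random order.
func insertSlice(xs []float64, v float64) []float64 {
	i := sort.SearchFloat64s(xs, v)
	xs = append(xs, 0)
	copy(xs[i+1:], xs[i:])
	xs[i] = v
	return xs
}

// insertList walks to the insertion point and splices in a node. No
// elements are moved, but every insert allocates, adding GC pressure.
func insertList(l *list.List, v float64) {
	for e := l.Front(); e != nil; e = e.Next() {
		if e.Value.(float64) >= v {
			l.InsertBefore(v, e)
			return
		}
	}
	l.PushBack(v)
}

func main() {
	xs := []float64{}
	l := list.New()
	for _, v := range []float64{0.3, 0.1, 0.2} {
		xs = insertSlice(xs, v)
		insertList(l, v)
	}
	fmt.Println(xs) // [0.1 0.2 0.3]
}
```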
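And a sketch of the kind of benchmark we mean: the same shape as the upstream bench_test.go, but fed uniformly random values so that inserts land mid-stream and the copy path is actually exercised (the quantile targets here are illustrative, not the ones used upstream):

```go
package quantile_test

import (
	"math/rand"
	"testing"

	"github.com/beorn7/perks/quantile"
)

// BenchmarkInsertTargetedRandom mirrors the upstream insert benchmark but
// uses random values instead of a monotonically increasing sequence, so
// the slice-shifting code path is no longer skipped.
func BenchmarkInsertTargetedRandom(b *testing.B) {
	s := quantile.NewTargeted(map[float64]float64{
		0.50: 0.05,
		0.90: 0.01,
		0.99: 0.001,
	})
	rng := rand.New(rand.NewSource(42)) // fixed seed for reproducibility
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		s.Insert(rng.Float64())
	}
}
```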