
Added TransactionalGatherer with Cached implementation (+benchmark) #929

Closed
wants to merge 12 commits

Conversation

@bwplotka (Member) commented Oct 24, 2021

Lots of extra structs here, so feedback welcome.

Things still to do here:

  • Consider adding the missing validation.

@beorn7 (Member) commented Oct 27, 2021

I guess the "done" callback could work in principle. I have no better idea so far. (And that we now have to add yet another part (the transactional gathering) to the whole register/collect/gather contraption provides even more evidence that a rewrite/redesign will be beneficial… But yeah, this might be a way to introduce it into the current state of the package without a breaking change.)
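
For concreteness, a transactional gathering interface with such a "done" callback could look like the following (a minimal sketch of the idea only; the exact shape was still being worked out in this PR):

import dto "github.com/prometheus/client_model/go"

// TransactionalGatherer is a sketch of a Gatherer whose results stay
// valid only until the caller signals it is done with them.
type TransactionalGatherer interface {
	// Gather returns metric families plus a done callback. The returned
	// families must not be used after done has been called, which lets
	// the implementation safely reuse or mutate them afterwards.
	Gather() (_ []*dto.MetricFamily, done func(), err error)
}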

Having said that, the code doesn't really seem to make use of the feature. The mutex of the BlockingRegistry is only locked and unlocked in the (transactional) Gather method, but nowhere else, so the raw collectors would still be unprotected.

Talking about the rawCollector: Isn't that just a Gatherer? Look at the rawCollector interface:

type rawCollector interface {
	Collect() []*dto.MetricFamily
}

And here is the Gatherer interface:

type Gatherer interface {
	Gather() ([]*dto.MetricFamily, error)
}
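
Indeed, modulo the error return value the two are interchangeable; a trivial adapter (a sketch, not code from this PR) makes any rawCollector usable wherever a Gatherer is expected:

// rawGatherer adapts a rawCollector to the Gatherer interface.
// The error is always nil, since rawCollector has no failure mode.
type rawGatherer struct {
	c rawCollector
}

func (g rawGatherer) Gather() ([]*dto.MetricFamily, error) {
	return g.c.Collect(), nil
}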

So perhaps some things can be simplified. Look at the existing Gatherers type. That seems to already have most of the parts needed. Perhaps we shouldn't have a CachedCollector (which anyway essentially implements the Gatherer interface) and a separate BlockingRegistry, but simply a TransactionalGatherer implementation that accepts batches of updates to the metrics it gathers…

What's making me a bit dizzy is that it is hashing again without collision detection (there is a bunch of that in the existing Registry, and it's keeping me up at night…). But that's just a side thought.

I think the "single flight" problem can be solved with an RWMutex.
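
As a minimal sketch of that (assuming the cached state is guarded by a sync.RWMutex and replaced wholesale rather than mutated in place), concurrent Gather calls can proceed in parallel under the read lock:

import (
	"sync"

	dto "github.com/prometheus/client_model/go"
)

type cachedGatherer struct {
	mtx      sync.RWMutex
	families []*dto.MetricFamily
}

// Gather serves the cached state under the read lock, so any number
// of concurrent scrapes can proceed in parallel.
func (c *cachedGatherer) Gather() ([]*dto.MetricFamily, error) {
	c.mtx.RLock()
	defer c.mtx.RUnlock()
	return c.families, nil
}

// update swaps in a freshly built state under the write lock. Because
// the slice is replaced, not mutated, slices handed out by earlier
// Gather calls remain valid.
func (c *cachedGatherer) update(families []*dto.MetricFamily) {
	c.mtx.Lock()
	c.families = families
	c.mtx.Unlock()
}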

I'm very tempted to help out with more detailed suggestions, but, once more, lacking enough time… So my apologies for just uttering a few unsorted thoughts.

@bwplotka (Member, Author):

Thanks!

Having said that, the code doesn't really seem to make use of the feature. The mutex of the BlockingRegistry is only locked and unlocked in the (transactional) Gather method, but nowhere else, so the raw collectors would still be unprotected.

Yeah, I might have missed the extra mutex lock in CachedCollector.

Thanks for the pointers. I am back on this this week, so I will ping you when I update it. The simplifications you mentioned look especially promising. Thinking about it after the implementation, I am also wondering whether the cache logic should be part of client_golang or of the pieces that need it (e.g. cadvisor). Plus, it looks like in the wild we have two kinds of cache updates:

  • Session based, where we recreate the full state and want to reuse old metric families if possible.
  • Update based, where the cache is updated with add/set/delete of particular items. (A sketch covering both styles follows.)
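
A minimal sketch covering both styles behind a single reset flag; Insert and the map-based cache here are illustrative, not this PR's final API:

// Insert is a hypothetical batched update: one series identity and
// its new value.
type Insert struct {
	Key   string
	Value float64
}

type cache struct {
	series map[string]float64
}

// Update applies a batch. With reset == true the previous state is
// dropped first (session based); otherwise the batch is applied on
// top of the existing state (update based).
func (c *cache) Update(reset bool, batch []Insert) {
	if reset || c.series == nil {
		c.series = make(map[string]float64, len(batch))
	}
	for _, ins := range batch {
		c.series[ins.Key] = ins.Value
	}
}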

@bwplotka (Member, Author):

On the rawCollector: I think I wanted to ensure that we don't put a collector/gatherer that expects a blocking registry into a non-blocking one, since we would then have a race on the returned metric families.

@bwplotka (Member, Author):

Simplified API, added tests

@bwplotka marked this pull request as ready for review January 21, 2022 12:30
@bwplotka changed the title from "Added Blocking registry, with rawCollector and Transactional handler." to "Added TransactionalGatherer with Cached implementation." Jan 21, 2022
@bwplotka (Member, Author):

Ready for review @beorn7 @kakkoyun

@bwplotka marked this pull request as draft January 21, 2022 18:06
@bwplotka (Member, Author):

Found a race; changing it a bit.

@bwplotka marked this pull request as ready for review January 21, 2022 19:11
@bwplotka (Member, Author):

Simplified the abstraction, yet added the possibility to use this cache for both the event-based and the session-based approach. cc @dgrisonnet @beorn7 @kakkoyun

PTAL (:

@bwplotka changed the title from "Added TransactionalGatherer with Cached implementation." to "Added TransactionalGatherer with Cached implementation (+benchmark)" Jan 21, 2022
@kakkoyun (Member) left a comment

I have just skimmed. I'll continue to review it.

Review comments on prometheus/cache.go and prometheus/cache/cache.go (outdated, resolved).
@kakkoyun (Member):

Let's hold off on this until we merge #974.

@beorn7 (Member) commented Jan 25, 2022

I hope I can review this in the not-too-distant future. My queue is just long, and it often gets preempted by urgent tasks.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

update.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

Attempt 2

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

Added blocking registry, with raw collector and transactional handler.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

Added fast path to the normal (empty) registry to save 8 allocs and ~3.5kB per Gather.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

Simplified API, added tests.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

Fix.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

Simplified implementation.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

Added benchmark.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>

Optimized.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
This reverts commit 2fcaf51.

Optimization was not worth it:

 benchstat v1.txt v2.txt
name                                                           old time/op    new time/op    delta
CachedTGatherer_Update/Update_of_one_element_without_reset-12    2.64µs ± 0%    4.05µs ± 0%   ~     (p=1.000 n=1+1)
CachedTGatherer_Update/Update_of_all_elements_with_reset-12       701ms ± 0%     358ms ± 0%   ~     (p=1.000 n=1+1)
CachedTGatherer_Update/Gather-12                                  535µs ± 0%  703934µs ± 0%   ~     (p=1.000 n=1+1)

name                                                           old alloc/op   new alloc/op   delta
CachedTGatherer_Update/Update_of_one_element_without_reset-12      208B ± 0%      208B ± 0%   ~     (all equal)
CachedTGatherer_Update/Update_of_all_elements_with_reset-12      40.2MB ± 0%    41.1MB ± 0%   ~     (p=1.000 n=1+1)
CachedTGatherer_Update/Gather-12                                 48.6kB ± 0%    84.3kB ± 0%   ~     (p=1.000 n=1+1)

name                                                           old allocs/op  new allocs/op  delta
CachedTGatherer_Update/Update_of_one_element_without_reset-12      3.00 ± 0%      3.00 ± 0%   ~     (all equal)
CachedTGatherer_Update/Update_of_all_elements_with_reset-12        6.00 ± 0%   4003.00 ± 0%   ~     (p=1.000 n=1+1)
CachedTGatherer_Update/Gather-12                                  1.00k ± 0%     2.01k ± 0%   ~     (p=1.000 n=1+1)
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
@bwplotka (Member, Author) commented Jan 26, 2022

OK, I think I have finished optimizing this. I preferred memory efficiency over a slight delay in updating and gathering. There is some room to improve it further, but that would add more complexity to the code.

Current results (million metrics over 1k families):

 benchstat -delta-test=none v6.txt
name                                                           time/op
CachedTGatherer_Update/Update_of_one_element_without_reset-12  13.1ms ± 0%
CachedTGatherer_Update/Update_of_all_elements_with_reset-12     309ms ± 0%
CachedTGatherer_Update/Gather-12                                422ms ± 0%

name                                                           alloc/op
CachedTGatherer_Update/Update_of_one_element_without_reset-12    208B ± 0%
CachedTGatherer_Update/Update_of_all_elements_with_reset-12    2.47kB ± 0%
CachedTGatherer_Update/Gather-12                               52.8kB ± 0%

name                                                           allocs/op
CachedTGatherer_Update/Update_of_one_element_without_reset-12    3.00 ± 0%
CachedTGatherer_Update/Update_of_all_elements_with_reset-12      0.00
CachedTGatherer_Update/Gather-12                                1.00k ± 0%

I think it's more than good enough for cadvisor/kubelet usage, and potentially for anyone else using this cache.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
@bwplotka (Member, Author):

Managed to optimize further; pretty good results:

benchstat -delta-test=none v6.txt v9.txt
name                                                           old time/op    new time/op    delta
CachedTGatherer_Update/Update_of_one_element_without_reset-12    13.1ms ± 0%     0.0ms ± 0%  -99.81%
CachedTGatherer_Update/Update_of_all_elements_with_reset-12       309ms ± 0%     282ms ± 0%   -8.77%
CachedTGatherer_Update/Gather-12                                  422ms ± 0%       0ms ± 0%  -99.95%

name                                                           old alloc/op   new alloc/op   delta
CachedTGatherer_Update/Update_of_one_element_without_reset-12      208B ± 0%      208B ± 0%    0.00%
CachedTGatherer_Update/Update_of_all_elements_with_reset-12      2.47kB ± 0%    1.67kB ± 0%  -32.56%
CachedTGatherer_Update/Gather-12                                 52.8kB ± 0%    24.6kB ± 0%  -53.34%

name                                                           old allocs/op  new allocs/op  delta
CachedTGatherer_Update/Update_of_one_element_without_reset-12      3.00 ± 0%      3.00 ± 0%    0.00%
CachedTGatherer_Update/Update_of_all_elements_with_reset-12        0.00           0.00         0.00%
CachedTGatherer_Update/Gather-12                                  1.00k ± 0%     0.00k ± 0%  -99.60%

@bwplotka (Member, Author) commented Jan 27, 2022

While those results are great, I found some pain points in cadvisor's usage of it.

So apparently, different kinds of metrics can be returned depending on the HTTP parameters. That makes things difficult for any caching logic that has only a "memory" of the last state.

[screenshot omitted]

One solution would be to use two different caches (and live with allocating both), since we are lucky that only two options exist in cadvisor AFAIK. That is unfortunately not very sustainable, plus we keep memory for two caches. Probably better than it was with const metrics on every scrape, but still not the best.

There is one thought on how we could improve this situation:

  1. We could research a simple polling mechanism instead of a cache. In the end, even in this cache we literally overwrite all fields except labels. In that case, cadvisor might be able to just rebuild the state on every needed interval while using pooling on e.g. *Metric; if we keep pooled items alive for longer, different requests can be switched to different items. Something to think about (see the sketch below).
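
A minimal sketch of that pooling idea, reusing dto.Metric values across rebuilds (the helper names are illustrative):

import (
	"sync"

	dto "github.com/prometheus/client_model/go"
)

var metricPool = sync.Pool{
	New: func() interface{} { return &dto.Metric{} },
}

// borrowMetric hands out a cleared metric from the pool; dto.Metric
// is a protobuf message, so Reset clears all its fields.
func borrowMetric() *dto.Metric {
	m := metricPool.Get().(*dto.Metric)
	m.Reset()
	return m
}

// releaseMetrics returns metrics to the pool once no in-flight
// request can still be reading them.
func releaseMetrics(ms []*dto.Metric) {
	for _, m := range ms {
		metricPool.Put(m)
	}
}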

All of this makes me think that such a cache, for now, belongs to the user, not to the client_golang library. We should probably add TransactionalGatherer and other related improvements, but there are just too many unknowns about the API and behavior we need for this cache. I would propose moving the cache to the cadvisor repo for now and having this PR only add the side HandlerForTransactional and TransactionalGatherer helpers. WDYT?

Usage can be seen here: google/cadvisor#2974

cc @dgrisonnet @beorn7 @kakkoyun

NOTE: I am off sick for two weeks from today, so I might need to park this. Anyone else is welcome to continue working on it!

@dgrisonnet commented Feb 7, 2022

I think that the issue of having to maintain 2 caches is a problem for cadvisor to handle as it really depends on their use case. If there is really a lot of difference between the two versions of the metrics, then it makes sense to have two separate caches. But, as far as I can tell, this docker option allows filtering the metrics that are returned. So this could perhaps be improved by hooking into the Gather() method and filtering the results. Although that approach might come with a slight increase in CPU usage, it should be reasonable.
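
As a sketch of such a hook, a wrapper could drop the families a given request did not ask for; the keep predicate here is an assumption, standing in for whatever the docker option selects:

import (
	"github.com/prometheus/client_golang/prometheus"

	dto "github.com/prometheus/client_model/go"
)

// filteredGatherer keeps only the metric families accepted by keep.
type filteredGatherer struct {
	inner prometheus.Gatherer
	keep  func(*dto.MetricFamily) bool
}

func (f filteredGatherer) Gather() ([]*dto.MetricFamily, error) {
	mfs, err := f.inner.Gather()
	if err != nil {
		return nil, err
	}
	// Copy into a fresh slice instead of filtering in place, so a
	// cached slice returned by the inner Gatherer is left untouched.
	out := make([]*dto.MetricFamily, 0, len(mfs))
	for _, mf := range mfs {
		if f.keep(mf) {
			out = append(out, mf)
		}
	}
	return out, nil
}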

All of this makes me think that such a cache, for now, belongs to the user

I personally think otherwise. For most use cases the cache will never change: it will always have the same format, and the way to access it won't change. There are only a few cases where users might want a more optimized cache (e.g. kube-state-metrics). That said, I think it is worth making the cache pluggable, by defining an interface based on how it is accessed by the Gatherers, so that any project could bring its own cache if it wants to, and it would still be beneficial to the majority of users to have a cache implementation in client_golang that they could easily reuse.

We could research a simple polling mechanism instead of a cache.

I think that would be a great addition for registries that are strictly working with const metrics since their update may be event-based, but it will not work well with collectors. Taking that into account, an approach could be to use a queue to update the cache from the events:

From the consumer perspective:

  1. Receive an event (e.g container updated)
  2. Produce new const metrics
  3. Push metrics to a queue

In a goroutine started beforehand:

  1. Pop metrics from the queue
  2. Update the cache

That approach would require a new type of registry whose sole role would be to gather data from the cache, a queue to which users could push const metrics, and a goroutine responsible for popping metrics from the queue and syncing them into the cache.
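
A sketch of that pipeline with a channel as the queue (all names here are illustrative):

import dto "github.com/prometheus/client_model/go"

// updates is the queue between event consumers and the cache syncer.
var updates = make(chan []*dto.MetricFamily, 64)

// onEvent runs on the consumer side: produce const metrics for the
// changed object and enqueue them.
func onEvent(mfs []*dto.MetricFamily) {
	updates <- mfs
}

// syncCache runs in a goroutine started beforehand: pop batches from
// the queue and apply them to the cache the registry gathers from.
func syncCache(apply func([]*dto.MetricFamily)) {
	for batch := range updates {
		apply(batch)
	}
}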

@bwplotka (Member, Author):

I think that the issue of having to maintain 2 caches is a problem for cadvisor to handle as it really depends on their use case. If there is really a lot of difference between the two versions of the metrics, then it makes sense to have two separate caches. But, as far as I can tell, this docker option allows filtering the metrics that are returned. So this could perhaps be improved by hooking into the Gather() method and filtering the results. Although that approach might come with a slight increase in CPU usage, it should be reasonable.

Makes sense 👍🏽

That said, I think it is worth making the cache pluggable, by defining an interface based on how it is accessed by the Gatherers,

That's my point. This PR makes it pluggable in the sense that anyone can implement TransactionalGatherer with their own caching logic. My cache is just one implementation:

func NewCachedTGatherer() *CachedTGatherer

Given it was done and tested only for cadvisor, IMO we need more use cases to prove it deserves to be in this library.

Taking that into account, an approach could be to use a queue to update the cache from the events

Sorry, I don't get this queue idea; I'm not sure what its benefits are. I created an API that supports both the collector-based and the event-based approach (based on the reset flag). Would that work? What is missing for the event-based approach?

@dgrisonnet:

Given it was done and tested only for cadvisor, IMO we need more use cases to prove it deserves to be in this library.

From a user perspective, it sounds complex to interact with this new API if there is no default cache implementation.

I created an API that supports both the collector-based and the event-based approach

I missed that; I thought the implementation wasn't event-based yet. I'll take another look at your PR.

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
@bwplotka (Member, Author):

Decided to close this for now. I will add another PR with just the transactional interface.

The cache was efficient, but the API causes extra work (those Insert structs) and errors like reusing the same arrays. Let's try to come up with something better in cadvisor first, and only then port it here.
