Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add Histogram aggregation #1306

Merged
merged 14 commits into from Mar 18, 2022
Merged

add Histogram aggregation #1306

merged 14 commits into from Mar 18, 2022

Conversation

PSeitz
Copy link
Contributor

@PSeitz PSeitz commented Mar 9, 2022

No description provided.

@codecov-commenter
Copy link

codecov-commenter commented Mar 9, 2022

Codecov Report

Merging #1306 (6e357c8) into main (4b62f79) will increase coverage by 0.15%.
The diff coverage is 98.86%.

❗ Current head 6e357c8 differs from pull request most recent head f619658. Consider uploading reports for the commit f619658 to get more accurate results

@@            Coverage Diff             @@
##             main    #1306      +/-   ##
==========================================
+ Coverage   94.08%   94.23%   +0.15%     
==========================================
  Files         230      231       +1     
  Lines       39191    40758    +1567     
==========================================
+ Hits        36872    38409    +1537     
- Misses       2319     2349      +30     
Impacted Files Coverage Δ
src/aggregation/bucket/range.rs 98.37% <ø> (ø)
src/aggregation/agg_req.rs 96.22% <82.14%> (-3.02%) ⬇️
src/aggregation/intermediate_agg_result.rs 97.91% <96.91%> (-0.97%) ⬇️
src/aggregation/bucket/histogram/histogram.rs 99.29% <99.29%> (ø)
src/aggregation/agg_req_with_accessor.rs 97.82% <100.00%> (+0.02%) ⬆️
src/aggregation/agg_result.rs 100.00% <100.00%> (ø)
src/aggregation/collector.rs 100.00% <100.00%> (+4.16%) ⬆️
src/aggregation/metric/average.rs 86.20% <100.00%> (+1.72%) ⬆️
src/aggregation/metric/stats.rs 98.31% <100.00%> (+0.75%) ⬆️
src/aggregation/mod.rs 98.53% <100.00%> (+0.55%) ⬆️
... and 28 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4b62f79...f619658. Read the comment docs.

@PSeitz PSeitz force-pushed the histogram branch 5 times, most recently from 4cb8932 to a88e260 Compare March 11, 2022 13:14
.count();
let num_buckets = self.buckets.len();

let buckets = self
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the saturating_sub tries to address the edge case, but that's a cryptic way to write code.

Buckets is a Vec... There is no need for magic tricks.
For instance, you could compute skip_end first, working on the slice [skip_start..].

You could also, compute skip_end first, and thank skip heading element and collect on the same iterator.

Copy link
Contributor Author

@PSeitz PSeitz Mar 14, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since you solved the riddle I can remove it ;)
I just realized we can simply filter all empty buckets.

Edit: I started with the slice, but here we want to get the elements by value to avoid clones

src/aggregation/mod.rs Outdated Show resolved Hide resolved
src/aggregation/mod.rs Outdated Show resolved Hide resolved
@@ -239,6 +248,10 @@ pub enum Key {
F64(f64),
}

trait MergeFruits {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this trait useful? I'm not sure I understand its point.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah it is used to share code in the merge_maps thingy?

Why is it not declared in intermediate_agg_result then, close to that function then?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, so multiple buckets can share the merge code on collections

I moved it to intermediate_agg_result

Copy link
Collaborator

@fulmicoton fulmicoton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see inline comments.

pub(crate) struct SegmentHistogramBucketEntry {
pub key: f64,
pub doc_count: u64,
pub sub_aggregation: Option<SegmentAggregationResultsCollector>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should also keep sub_aggregation as an external Vec<Option<SegmentAggregationResultsCollector>>.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, that's something I wanted to test as well, but I would probably go for Option<Vec<SegmentAggregationResultsCollector>>

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, it's much faster like that, 2x in some cases

src/aggregation/mod.rs Outdated Show resolved Hide resolved
Copy link
Collaborator

@fulmicoton fulmicoton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comment inline.

Copy link
Collaborator

@fulmicoton fulmicoton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same file same comment. the code is needlessly complicated.

@PSeitz PSeitz force-pushed the histogram branch 5 times, most recently from 1a2dd86 to 521302a Compare March 17, 2022 08:23
.collect(),
)
impl AggregationResults {
/// Convert and intermediate result and its aggregation request to the final result
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very helpful

PSeitz and others added 2 commits March 18, 2022 04:55
Co-authored-by: Paul Masurel <paul@quickwit.io>
Co-authored-by: Paul Masurel <paul@quickwit.io>
@PSeitz PSeitz merged commit 141b9aa into quickwit-oss:main Mar 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants