Some aggregations and functions produce incorrect results for native histograms #13934
Comments
Related: #11973
My thoughts: I think the (float, histogram) args for arithmetic binary operators need to be broken up. For example, For I think the current behavior of
The basic idea is this: Where operations and aggregations make sense (between histogram and histogram, or between histogram and float), they happen. Where they don't make sense, the histogram is treated as if the vector element with the histogram did not exist. This is not an error, but we attach a "warn"-level annotation (or maybe "info"-level only?) if any vector element is label-matched with an ignored histogram. That implies no annotation is added if two ignored histograms are matched or if an ignored histogram is matched with a scalar (which is not label-matching).

The rationale here is that label matching might be complex, and if a label match leads to an impossible combination, it might be accidental and the user should know about it. In contrast to that,

(This is all just my stream of thoughts. Feel free to express other viewpoints.)
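To make the proposal above concrete, here is a minimal Python sketch of label-matched binary operators under that rule. It is not Prometheus code: `Hist`, `SUPPORTED`, `combine`, and `binary_op` are hypothetical names, label sets are reduced to plain string keys, and the set of combinations that "make sense" is an assumption based on the cases discussed in this issue.

```python
import operator
from dataclasses import dataclass


@dataclass(frozen=True)
class Hist:
    """Hypothetical stand-in for a native histogram sample (buckets omitted)."""
    count: float
    sum: float


FLOAT_OPS = {"+": operator.add, "-": operator.sub,
             "*": operator.mul, "/": operator.truediv}

# Assumption: the (operator, left kind, right kind) combinations that make sense.
SUPPORTED = {
    ("+", "float", "float"), ("-", "float", "float"),
    ("*", "float", "float"), ("/", "float", "float"),
    ("+", "hist", "hist"), ("-", "hist", "hist"),
    ("*", "hist", "float"), ("*", "float", "hist"),
    ("/", "hist", "float"),
}


def kind(sample):
    return "hist" if isinstance(sample, Hist) else "float"


def combine(op, lhs, rhs):
    """Return the combined sample, or None if the combination makes no sense."""
    kl, kr = kind(lhs), kind(rhs)
    if (op, kl, kr) not in SUPPORTED:
        return None
    f = FLOAT_OPS[op]
    if kl == kr == "float":
        return f(lhs, rhs)
    if kl == kr == "hist":
        return Hist(f(lhs.count, rhs.count), f(lhs.sum, rhs.sum))
    # Mixed case: scale the histogram by the float.
    h, x = (lhs, rhs) if kl == "hist" else (rhs, lhs)
    return Hist(f(h.count, x), f(h.sum, x))


def binary_op(op, left, right, level="warn"):
    """left/right map a label key to a sample; returns (result, annotations)."""
    result, annotations = {}, []
    for labels in sorted(left.keys() & right.keys()):
        lhs, rhs = left[labels], right[labels]
        out = combine(op, lhs, rhs)
        if out is not None:
            result[labels] = out
        elif kind(lhs) != kind(rhs):
            # A float was label-matched with an ignored histogram: annotate.
            annotations.append((level, f"ignored histogram for {labels!r} in {op!r}"))
        # Two ignored histograms matched with each other: dropped silently.
    return result, annotations
```

With `left = {"a": 1.0, "c": Hist(4.0, 10.0)}` and `right = {"a": 2.0, "c": 5.0}`, the `+` operator keeps only `"a"` and annotates `"c"`, while `*` returns a scaled histogram for `"c"` without any annotation.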
Yes, sorry, I missed this case. I'll break the operators further with different arguments and update the table.
So the gist here is that after grouping the labels if the vector contains both
Yeah, my bad, I somehow got confused between I'll make all these changes to the table.
In my opinion, if the operation makes sense and something is returned from that operation, but we have ignored a histogram in the vector ( like
Yes, that's the idea (but the case of how to deal with histogram-only could still be debated, see your later comment).
That's also an option. If I understand you correctly, you would add an info annotation to the

Another thought in this regard: A mix of floats and histograms in the same vector should be a very rare case. So maybe we don't need to think too much about treating that case specially or with concerns about spammy annotations. It shouldn't really happen in practice, and if it does, it's probably a faulty setup or a transition period and will be fixed soon.

So maybe let's say for now, whenever we ignore a histogram in an operation because it doesn't make sense, we add an info-level annotation saying that an operation wasn't performed, even if the expression still yields a result from other vector elements. We can do this first, it's easy, and then iterate on it once we see how it plays out in practice.
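The simplified rule from the last paragraph can be sketched for `min`/`max` over a mixed vector as follows. This is an illustrative Python sketch, not the Prometheus implementation: `Hist` and `aggregate_floats` are hypothetical names, and returning no result when only histograms are present is an assumption, since that case is still open in this discussion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hist:
    """Hypothetical stand-in for a native histogram sample."""
    count: float
    sum: float


def aggregate_floats(fn, samples):
    """Apply fn (e.g. min or max) to the float samples only.

    Histograms are ignored, and one info-level annotation is added whenever
    at least one histogram was ignored, even if a result is still produced.
    """
    floats = [s for s in samples if not isinstance(s, Hist)]
    ignored = len(samples) - len(floats)
    annotations = []
    if ignored:
        annotations.append(
            ("info", f"ignored {ignored} histogram sample(s) in {fn.__name__}")
        )
    # Assumption: with no float samples left, the group yields no result.
    result = fn(floats) if floats else None
    return result, annotations
```

For example, `aggregate_floats(min, [3.0, Hist(5.0, 12.0), 1.5])` yields `1.5` together with a single info annotation, instead of treating the histogram as 0.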
I have updated the table above according to my last paragraph above.
Thanks for updating the table, Beorn. I was busy and couldn't update the table fully.
Currently, some operators and aggregations produce incorrect or inconsistent results for native histograms in certain cases:
| Operator or aggregation | Arguments | Current behavior | Expected behavior |
|---|---|---|---|
| `+` and `-` | `float`, `histogram` in any order | `float` | |
| `*` | `histogram`, `histogram` | `float` | |
| `/` | `float`, `histogram` (in this order! `histogram`, `float` is fine) | `float` ignoring histogram | |
| `/` | `histogram`, `0` | `histogram` with every value divided by zero | `histogram` with `sum` and `count` as `NaN` or `Inf` and all buckets removed |
| `min` and `max` | `histogram` and `float` | `scalar` by taking `histogram` as 0 | `min`/`max` of all the floats and add info annotation |
| `min` and `max` | `histogram` | `scalar` by taking `histogram` as 0 | |

Here I've added the cases I've found so far. I'll update the list as I find more, and I will also fix these issues. If you have encountered any such cases, please add them here, either by editing this description or by commenting.
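The `histogram` divided by `0` case listed above can be sketched like this, reading "`sum` and `count` as `NaN` or `Inf` and all buckets removed" as the intended result. This is a Python illustration under stated assumptions, not Prometheus code: `Hist`, `ieee_div`, and `div_hist` are hypothetical names, and buckets are simplified to a plain mapping from upper bound to count.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Hist:
    """Hypothetical stand-in for a native histogram sample."""
    count: float
    sum: float
    buckets: dict = field(default_factory=dict)  # upper bound -> bucket count


def ieee_div(x, y):
    """Float division following IEEE 754 for a zero divisor.

    Python raises ZeroDivisionError for x / 0.0, so the zero case is
    handled explicitly: 0/0 and NaN/0 give NaN, anything else gives +/-Inf.
    """
    if y != 0:
        return x / y
    if x == 0 or math.isnan(x):
        return math.nan
    return math.copysign(math.inf, x) * math.copysign(1.0, y)


def div_hist(h, divisor):
    """Divide a histogram by a float."""
    if divisor == 0:
        # Assumed intended behavior: keep only sum and count (NaN or Inf)
        # and drop every bucket instead of dividing each bucket by zero.
        return Hist(ieee_div(h.count, divisor), ieee_div(h.sum, divisor), {})
    return Hist(
        h.count / divisor,
        h.sum / divisor,
        {bound: c / divisor for bound, c in h.buckets.items()},
    )
```

Dividing by a non-zero float scales `count`, `sum`, and every bucket; only the zero divisor takes the special path.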