Improving "Sparse postings" intersection #13971

Open
wants to merge 1 commit into
base: main


@alanprot alanprot commented Apr 23, 2024

Previously, intersectPostings kept advancing through the first posting lists until they agreed on a value, and only then consulted the remaining lists. It ignored the possibility that a later list could rule out many candidates much earlier. Consider the following example:

P1: [2, 5, 9, 18, 21]
P2: [3, 7, 14, 19, 21]
P3: [1, 21]

The algorithm would only advance through P1 and P2 until it found a value common to both, and only then check P3. In essence, the traversal order was: 2, 3, 5, 7, 9, 14, 18, 19, 21 (intersection found).

With the proposed change, P3 is also examined even if P1 and P2 have not yet agreed on a value. Because P3 jumps straight to 21, the other lists can seek directly to 21 and skip the candidates in between.

Post-change, the traversal order becomes: 2, 3, 21 (3 iterations instead of 9).
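
The two traversal orders above can be reproduced with a small sketch. This is not the code from this PR: `list`, `firstMatch`, and `example` are hypothetical names, the lists are plain in-memory slices, and the `restart` flag is only an assumed way of modelling the old and new behaviour side by side.

```go
package main

import "fmt"

// list is a hypothetical, simplified stand-in for a sorted postings iterator.
type list struct {
	vals []uint64
	pos  int
}

// seek advances to the first value >= target and reports whether one exists.
func (l *list) seek(target uint64) bool {
	for l.pos < len(l.vals) && l.vals[l.pos] < target {
		l.pos++
	}
	return l.pos < len(l.vals)
}

// at returns the current value; only valid after seek returned true.
func (l *list) at() uint64 { return l.vals[l.pos] }

// firstMatch finds the first value present in every list and records each
// candidate value the search moved to. With restart=true the scan goes back
// to the first list whenever the candidate grows (modelling the old order);
// with restart=false it carries the larger candidate on to the next list.
func firstMatch(lists []*list, restart bool) (match uint64, visited []uint64, ok bool) {
	var cur uint64
	for {
		advanced := false
		for _, l := range lists {
			if !l.seek(cur) {
				return 0, visited, false // one list is exhausted: no intersection
			}
			if v := l.at(); v > cur {
				cur = v
				visited = append(visited, cur)
				advanced = true
				if restart {
					break
				}
			}
		}
		if !advanced {
			return cur, visited, true // a full pass agreed on cur
		}
	}
}

// example builds fresh copies of P1, P2 and P3 from the description.
func example() []*list {
	return []*list{
		{vals: []uint64{2, 5, 9, 18, 21}},
		{vals: []uint64{3, 7, 14, 19, 21}},
		{vals: []uint64{1, 21}},
	}
}

func main() {
	_, before, _ := firstMatch(example(), true)
	_, after, _ := firstMatch(example(), false)
	fmt.Println(before) // [2 3 5 7 9 14 18 19 21]
	fmt.Println(after)  // [2 3 21]
}
```

Both variants end on 21. The only difference is how many intermediate candidates are visited before P3 gets a chance to push the candidate forward.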

To validate the improvement, benchmarks were adjusted to simulate this scenario. Additionally, calling next on the resulting postings demonstrates the benefits. In this extreme case, a significant 97% reduction in time is observed.


goos: linux
goarch: amd64
pkg: github.com/prometheus/prometheus/tsdb
cpu: Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
                                                          │   /tmp/old    │              /tmp/new              │
                                                          │    sec/op     │   sec/op     vs base               │
Querier/Block/PostingsForMatchers/n="0_1",j="foo,a="1"-32   15815.1µ ± 7%   431.4µ ± 3%  -97.27% (p=0.002 n=6)

                                                          │  /tmp/old  │           /tmp/new            │
                                                          │    B/op    │    B/op     vs base           │
Querier/Block/PostingsForMatchers/n="0_1",j="foo,a="1"-32   336.0 ± 0%   336.0 ± 0%  ~ (p=1.000 n=6) ¹
¹ all samples are equal

                                                          │  /tmp/old  │           /tmp/new            │
                                                          │ allocs/op  │ allocs/op   vs base           │
Querier/Block/PostingsForMatchers/n="0_1",j="foo,a="1"-32   13.00 ± 0%   13.00 ± 0%  ~ (p=1.000 n=6) ¹
¹ all samples are equal

Ideally we could sort the arr []Postings by size before iterating but, unfortunately, the Postings interface does not allow us to retrieve the underlying size.
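
For illustration only, here is a rough sketch of what sorting by size could look like if a length were available. Everything in it is an assumption: `postings` is a trimmed-down stand-in for the real interface, and `sized`, `slicePostings`, and `sortBySize` are hypothetical names that do not exist in the code base.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// postings is a hypothetical, trimmed-down iterator interface.
type postings interface {
	Next() bool
	At() uint64
}

// sized is a hypothetical optional extension exposing the number of
// remaining entries. The real interface offers nothing like it.
type sized interface {
	Len() int
}

// slicePostings is an in-memory list that happens to know its size.
type slicePostings struct {
	vals []uint64
	cur  uint64
}

func (s *slicePostings) Next() bool {
	if len(s.vals) == 0 {
		return false
	}
	s.cur, s.vals = s.vals[0], s.vals[1:]
	return true
}

func (s *slicePostings) At() uint64 { return s.cur }
func (s *slicePostings) Len() int   { return len(s.vals) }

// sortBySize puts the shortest lists first so they drive the intersection.
// Lists that cannot report a size are treated as largest and sort last.
func sortBySize(arr []postings) {
	size := func(p postings) int {
		if s, ok := p.(sized); ok {
			return s.Len()
		}
		return math.MaxInt
	}
	sort.SliceStable(arr, func(i, j int) bool { return size(arr[i]) < size(arr[j]) })
}

func main() {
	arr := []postings{
		&slicePostings{vals: []uint64{2, 5, 9, 18, 21}},
		&slicePostings{vals: []uint64{3, 7, 14, 19, 21}},
		&slicePostings{vals: []uint64{1, 21}},
	}
	sortBySize(arr)
	for _, p := range arr {
		fmt.Println(p.(sized).Len())
	}
}
```

Using a stable sort keeps the caller's original order among lists of equal or unknown size, so behaviour would only change where a size is actually known.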

Signed-off-by: alanprot <alanprot@gmail.com>
@GiedriusS

I tried adding a length method here #13093, maybe worth revisiting?


@GiedriusS GiedriusS left a comment


What does the whole BenchmarkQuerier look like compared to main?

@alanprot

I did not run the whole thing, but it's mostly showing the improvement in that one case (99% there, though)...

Idk if it is worth it... but it seems to make sense as "why would we deprioritize some postings over others?"

But idk tbh! haha

pkg: github.com/prometheus/prometheus/tsdb
cpu: Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
                                                                         │     /tmp/old     │                /tmp/new2                │
                                                                         │      sec/op      │    sec/op     vs base                   │
Querier/Head/PostingsForMatchers/n="1"-32                                      659.9µ ± ∞ ¹   659.6µ ± ∞ ¹        ~ (p=0.056 n=5)
Querier/Head/PostingsForMatchers/n="X"-32                                      581.4n ± ∞ ¹   586.4n ± ∞ ¹   +0.86% (p=0.008 n=5)
Querier/Head/PostingsForMatchers/n="1",j="foo"-32                              14.46m ± ∞ ¹   14.39m ± ∞ ¹   -0.46% (p=0.008 n=5)
Querier/Head/PostingsForMatchers/n="X",j="foo"-32                              652.6n ± ∞ ¹   661.5n ± ∞ ¹   +1.36% (p=0.008 n=5)
Querier/Head/PostingsForMatchers/j="foo",n="1"-32                              14.67m ± ∞ ¹   14.40m ± ∞ ¹   -1.80% (p=0.008 n=5)
Querier/Head/PostingsForMatchers/n="1",j!="foo"-32                             16.53m ± ∞ ¹   15.20m ± ∞ ¹   -8.04% (p=0.008 n=5)
Querier/Head/PostingsForMatchers/n="1",i!="2"-32                               1.766m ± ∞ ¹   1.768m ± ∞ ¹   +0.15% (p=0.008 n=5)
Querier/Head/PostingsForMatchers/n="0_1",j="foo,a="1"-32                   12460.907µ ± ∞ ¹   1.553µ ± ∞ ¹  -99.99% (p=0.008 n=5)
Querier/Head/PostingsForMatchers/n="X",j!="foo"-32                             623.5n ± ∞ ¹   631.8n ± ∞ ¹   +1.33% (p=0.008 n=5)
Querier/Head/PostingsForMatchers/i=~"1[0-9]",j=~"foo|bar"-32                   1.702µ ± ∞ ¹   1.715µ ± ∞ ¹        ~ (p=0.056 n=5)
Querier/Head/PostingsForMatchers/j=~"foo|bar"-32                               92.34m ± ∞ ¹   91.43m ± ∞ ¹   -0.98% (p=0.008 n=5)
Querier/Head/PostingsForMatchers/j=~"XXX|YYY"-32                               843.3n ± ∞ ¹   849.4n ± ∞ ¹   +0.72% (p=0.016 n=5)
Querier/Head/PostingsForMatchers/j=~"X.+"-32                                   709.8n ± ∞ ¹   711.5n ± ∞ ¹        ~ (p=1.000 n=5)
Querier/Head/PostingsForMatchers/i=~"(1|2|3|4|5|6|20|55)"-32                   1.400µ ± ∞ ¹   1.429µ ± ∞ ¹   +2.07% (p=0.008 n=5)
Querier/Head/PostingsForMatchers/i!~"(1|2|3|4|5|6|20|55)"-32                   16.79m ± ∞ ¹   16.56m ± ∞ ¹   -1.36% (p=0.008 n=5)
Querier/Head/PostingsForMatchers/i=~"X|Y|Z"-32                                 958.3n ± ∞ ¹   949.4n ± ∞ ¹   -0.93% (p=0.008 n=5)
Querier/Head/PostingsForMatchers/i!~"X|Y|Z"-32                                 16.66m ± ∞ ¹   16.56m ± ∞ ¹        ~ (p=0.056 n=5)
Querier/Head/PostingsForMatchers/i=~".*"-32                                    25.06m ± ∞ ¹   24.76m ± ∞ ¹        ~ (p=0.222 n=5)
Querier/Head/PostingsForMatchers/i=~"1.*"-32                                   40.62m ± ∞ ¹   40.97m ± ∞ ¹        ~ (p=0.056 n=5)
Querier/Head/PostingsForMatchers/i=~".*1"-32                                   5.496m ± ∞ ¹   5.369m ± ∞ ¹   -2.32% (p=0.008 n=5)
Querier/Head/PostingsForMatchers/i=~".+"-32                                    456.7m ± ∞ ¹   465.8m ± ∞ ¹   +1.98% (p=0.008 n=5)
Querier/Head/PostingsForMatchers/i=~".+",j=~"X.+"-32                           45.76m ± ∞ ¹   45.89m ± ∞ ¹        ~ (p=0.222 n=5)
Querier/Head/PostingsForMatchers/i=~""-32                                      673.5m ± ∞ ¹   692.7m ± ∞ ¹   +2.84% (p=0.008 n=5)
Querier/Head/PostingsForMatchers/i!=""-32                                      448.6m ± ∞ ¹   459.1m ± ∞ ¹        ~ (p=0.095 n=5)
Querier/Head/PostingsForMatchers/n="1",i=~".*",j="foo"-32                      21.72m ± ∞ ¹   21.67m ± ∞ ¹        ~ (p=0.421 n=5)
Querier/Head/PostingsForMatchers/n="X",i=~".*",j="foo"-32                      711.1n ± ∞ ¹   719.5n ± ∞ ¹   +1.18% (p=0.008 n=5)
Querier/Head/PostingsForMatchers/n="1",i=~".*",i!="2",j="foo"-32               22.08m ± ∞ ¹   22.40m ± ∞ ¹        ~ (p=0.095 n=5)
Querier/Head/PostingsForMatchers/n="1",i!=""-32                                114.2m ± ∞ ¹   118.2m ± ∞ ¹   +3.57% (p=0.008 n=5)
Querier/Head/PostingsForMatchers/n="1",i!="",j="foo"-32                        132.4m ± ∞ ¹   135.0m ± ∞ ¹   +1.95% (p=0.016 n=5)
Querier/Head/PostingsForMatchers/n="1",i!="",j=~"X.+"-32                       33.90m ± ∞ ¹   35.30m ± ∞ ¹   +4.13% (p=0.008 n=5)
Querier/Head/PostingsForMatchers/n="1",i!="",j=~"XXX|YYY"-32                   33.24m ± ∞ ¹   35.32m ± ∞ ¹   +6.25% (p=0.008 n=5)
Querier/Head/PostingsForMatchers/n="1",i=~"X|Y|Z",j="foo"-32                   1.341µ ± ∞ ¹   1.334µ ± ∞ ¹        ~ (p=0.230 n=5)
Querier/Head/PostingsForMatchers/n="1",i!~"X|Y|Z",j="foo"-32                   14.45m ± ∞ ¹   14.42m ± ∞ ¹   -0.23% (p=0.008 n=5)
Querier/Head/PostingsForMatchers/n="1",i=~".+",j="foo"-32                      141.6m ± ∞ ¹   143.8m ± ∞ ¹        ~ (p=0.095 n=5)
Querier/Head/PostingsForMatchers/n="1",i=~"1.+",j="foo"-32                     16.10m ± ∞ ¹   16.13m ± ∞ ¹        ~ (p=1.000 n=5)
Querier/Head/PostingsForMatchers/n="1",i=~".*1.*",j="foo"-32                   61.80m ± ∞ ¹   64.20m ± ∞ ¹   +3.88% (p=0.008 n=5)
Querier/Head/PostingsForMatchers/n="1",i=~".+",i!="2",j="foo"-32               141.1m ± ∞ ¹   144.8m ± ∞ ¹   +2.63% (p=0.008 n=5)
Querier/Head/PostingsForMatchers/n="1",i=~".+",i!~"2.*",j="foo"-32             161.4m ± ∞ ¹   163.5m ± ∞ ¹        ~ (p=0.151 n=5)
Querier/Head/PostingsForMatchers/n="1",i=~".+",i!~".*2.*",j="foo"-32           198.2m ± ∞ ¹   202.6m ± ∞ ¹   +2.22% (p=0.008 n=5)
Querier/Head/PostingsForMatchers/n="X",i=~".+",i!~".*2.*",j="foo"-32           821.6n ± ∞ ¹   827.9n ± ∞ ¹   +0.77% (p=0.008 n=5)
Querier/Head/labelValuesWithMatchers/i_with_i="1"-32                           2.525m ± ∞ ¹   2.515m ± ∞ ¹        ~ (p=0.310 n=5)
Querier/Head/labelValuesWithMatchers/i_with_n="1"-32                           202.5m ± ∞ ¹   209.7m ± ∞ ¹   +3.57% (p=0.008 n=5)
Querier/Head/labelValuesWithMatchers/i_with_n="^.+$"-32                        139.0m ± ∞ ¹   142.4m ± ∞ ¹        ~ (p=0.095 n=5+2)
