Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add benchmark of term streams merge #1024

Merged
merged 6 commits into from
May 31, 2021
Merged

Conversation

scampi
Copy link
Contributor

@scampi scampi commented Apr 26, 2021

This is a draft PR that adds a benchmark of the term merging operation, to make sure I understood what was expected.
I get a result such as below for the baseline:

test termdict::merger::bench::bench_termmerger_baseline                                         ... bench: 248,879,424 ns/iter (+/- 8,600,537)

Close #469

src/termdict/merger.rs Outdated Show resolved Hide resolved
src/termdict/merger.rs Outdated Show resolved Hide resolved
src/termdict/merger.rs Outdated Show resolved Hide resolved
src/termdict/merger.rs Outdated Show resolved Hide resolved
Copy link
Collaborator

@fulmicoton fulmicoton left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See inline comments.

@scampi
Copy link
Contributor Author

scampi commented May 18, 2021

Thanks for the comments !
I will address them but I'll create a fst-based term merging prototype before pushing any new commits.

@fulmicoton
Copy link
Collaborator

@scampi apologies for the late reaction. I actually wrote the comments 3 weeks ago but forgot to hit the send review button >.<

@fulmicoton
Copy link
Collaborator

If you have trouble with rebase, I can do it... Let me know once you addressed the comments

@scampi
Copy link
Contributor Author

scampi commented May 25, 2021

@fulmicoton see commit 27b29d6 with a possible term merger implementation based on the FST union.
Below are some runs of the benchmark. I see some variability in the run, the union approach is not always significantly better than the baseline, although it seems to be faster in general.

# fst union

test termdict::merger::bench::bench_termmerger_baseline                                         ... bench: 257,954,291 ns/iter (+/- 13,201,530)
test termdict::merger::bench::bench_termmerger_baseline                                         ... bench: 234,322,785 ns/iter (+/- 43,848,128)
test termdict::merger::bench::bench_termmerger_baseline                                         ... bench: 341,135,396 ns/iter (+/- 24,285,123)

# baseline

test termdict::merger::bench::bench_termmerger_baseline                                         ... bench: 291,538,403 ns/iter (+/- 54,753,066)
test termdict::merger::bench::bench_termmerger_baseline                                         ... bench: 349,626,129 ns/iter (+/- 78,527,658)
test termdict::merger::bench::bench_termmerger_baseline                                         ... bench: 376,261,843 ns/iter (+/- 15,450,698)

@fulmicoton fulmicoton requested a review from PSeitz May 25, 2021 08:13
@scampi scampi marked this pull request as ready for review May 26, 2021 08:38
src/termdict/merger.rs Outdated Show resolved Hide resolved
src/termdict/merger.rs Outdated Show resolved Hide resolved
src/termdict/merger.rs Outdated Show resolved Hide resolved
src/termdict/merger.rs Outdated Show resolved Hide resolved
@scampi
Copy link
Contributor Author

scampi commented May 26, 2021

@PSeitz See commit c899c1c with fixes for your review. Thanks!

Copy link
Contributor

@PSeitz PSeitz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks! LGTM

@fulmicoton
Copy link
Collaborator

@scampi Thanks it is a very valuable PR for tantivy. Can you add an extra commit to update CHANGELOG.md with your contribution?

@scampi
Copy link
Contributor Author

scampi commented May 28, 2021

@fulmicoton Please see commit 66b633a with changelog update

@fulmicoton fulmicoton merged commit 41ea148 into quickwit-oss:main May 31, 2021
@fulmicoton
Copy link
Collaborator

@scampi Congratulations !

@scampi scampi deleted the issue-469 branch May 31, 2021 22:25
This was referenced Feb 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Optimize term merging
3 participants