Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No score calls if score is not requested #1646

Merged
merged 4 commits into from Nov 1, 2022
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -1,6 +1,7 @@
Tantivy 0.19
================================

- Skip score calculation, when no scoring is required [#1646](https://github.com/quickwit-oss/tantivy/pull/1646) (@PSeitz)
- Limit fast fields to u32 (`get_val(u32)`) [#1644](https://github.com/quickwit-oss/tantivy/pull/1644) (@PSeitz)
- Major bugfix: Fix missing fieldnorms for u64, i64, f64, bool, bytes and date [#1620](https://github.com/quickwit-oss/tantivy/pull/1620) (@PSeitz)
- Updated [Date Field Type](https://github.com/quickwit-oss/tantivy/pull/1396)
Expand Down
34 changes: 25 additions & 9 deletions src/collector/mod.rs
Expand Up @@ -172,17 +172,33 @@ pub trait Collector: Sync + Send {
) -> crate::Result<<Self::Child as SegmentCollector>::Fruit> {
let mut segment_collector = self.for_segment(segment_ord as u32, reader)?;

if let Some(alive_bitset) = reader.alive_bitset() {
weight.for_each(reader, &mut |doc, score| {
if alive_bitset.is_alive(doc) {
match (reader.alive_bitset(), self.requires_scoring()) {
(Some(alive_bitset), true) => {
weight.for_each(reader, &mut |doc, score| {
if alive_bitset.is_alive(doc) {
segment_collector.collect(doc, score);
}
})?;
}
(Some(alive_bitset), false) => {
weight.for_each_no_score(reader, &mut |doc| {
if alive_bitset.is_alive(doc) {
segment_collector.collect(doc, 0.0);
}
})?;
}
(None, true) => {
weight.for_each(reader, &mut |doc, score| {
segment_collector.collect(doc, score);
}
})?;
} else {
weight.for_each(reader, &mut |doc, score| {
segment_collector.collect(doc, score);
})?;
})?;
}
(None, false) => {
weight.for_each_no_score(reader, &mut |doc| {
segment_collector.collect(doc, 0.0);
})?;
}
}

Ok(segment_collector.harvest())
}
}
Expand Down
2 changes: 1 addition & 1 deletion src/indexer/index_writer.rs
Expand Up @@ -95,7 +95,7 @@ fn compute_deleted_bitset(
// document that were inserted before it.
delete_op
.target
.for_each(segment_reader, &mut |doc_matching_delete_query, _| {
.for_each_no_score(segment_reader, &mut |doc_matching_delete_query| {
if doc_opstamps.is_deleted(doc_matching_delete_query, delete_op.opstamp) {
alive_bitset.remove(doc_matching_delete_query);
might_have_changed = true;
Expand Down
20 changes: 19 additions & 1 deletion src/query/boolean_query/boolean_weight.rs
Expand Up @@ -5,7 +5,7 @@ use crate::postings::FreqReadingOption;
use crate::query::explanation::does_not_match;
use crate::query::score_combiner::{DoNothingCombiner, ScoreCombiner};
use crate::query::term_query::TermScorer;
use crate::query::weight::{for_each_pruning_scorer, for_each_scorer};
use crate::query::weight::{for_each_docset, for_each_pruning_scorer, for_each_scorer};
use crate::query::{
intersect_scorers, EmptyScorer, Exclude, Explanation, Occur, RequiredOptionalScorer, Scorer,
Union, Weight,
Expand Down Expand Up @@ -219,6 +219,24 @@ impl<TScoreCombiner: ScoreCombiner + Sync> Weight for BooleanWeight<TScoreCombin
Ok(())
}

fn for_each_no_score(
&self,
reader: &SegmentReader,
callback: &mut dyn FnMut(DocId),
) -> crate::Result<()> {
let scorer = self.complex_scorer(reader, 1.0, &self.score_combiner_fn)?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does &DoNothingCombiner make sense here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think so, but it's not called anyway

PSeitz marked this conversation as resolved.
Show resolved Hide resolved
match scorer {
SpecializedScorer::TermUnion(term_scorers) => {
let mut union_scorer = Union::build(term_scorers, &self.score_combiner_fn);
for_each_docset(&mut union_scorer, callback);
}
SpecializedScorer::Other(mut scorer) => {
for_each_docset(scorer.as_mut(), callback);
}
}
Ok(())
}

/// Calls `callback` with all of the `(doc, score)` for which score
/// is exceeding a given threshold.
///
Expand Down
14 changes: 13 additions & 1 deletion src/query/term_query/term_weight.rs
Expand Up @@ -5,7 +5,7 @@ use crate::fieldnorm::FieldNormReader;
use crate::postings::SegmentPostings;
use crate::query::bm25::Bm25Weight;
use crate::query::explanation::does_not_match;
use crate::query::weight::for_each_scorer;
use crate::query::weight::{for_each_docset, for_each_scorer};
use crate::query::{Explanation, Scorer, Weight};
use crate::schema::IndexRecordOption;
use crate::{DocId, Score, Term};
Expand Down Expand Up @@ -56,6 +56,18 @@ impl Weight for TermWeight {
Ok(())
}

/// Iterates through all of the document matched by the DocSet
/// `DocSet` and push the scored documents to the collector.
fn for_each_no_score(
&self,
reader: &SegmentReader,
callback: &mut dyn FnMut(DocId),
) -> crate::Result<()> {
let mut scorer = self.specialized_scorer(reader, 1.0)?;
for_each_docset(&mut scorer, callback);
Ok(())
}

/// Calls `callback` with all of the `(doc, score)` for which score
/// is exceeding a given threshold.
///
Expand Down
28 changes: 25 additions & 3 deletions src/query/weight.rs
@@ -1,10 +1,10 @@
use super::Scorer;
use crate::core::SegmentReader;
use crate::query::Explanation;
use crate::{DocId, Score, TERMINATED};
use crate::{DocId, DocSet, Score, TERMINATED};

/// Iterates through all of the document matched by the DocSet
/// `DocSet` and push the scored documents to the collector.
/// Iterates through all of the documents and scores matched by the DocSet
/// `DocSet`.
pub(crate) fn for_each_scorer<TScorer: Scorer + ?Sized>(
scorer: &mut TScorer,
callback: &mut dyn FnMut(DocId, Score),
Expand All @@ -16,6 +16,16 @@ pub(crate) fn for_each_scorer<TScorer: Scorer + ?Sized>(
}
}

/// Iterates through all of the documents matched by the DocSet
/// `DocSet`.
pub(crate) fn for_each_docset<T: DocSet + ?Sized>(docset: &mut T, callback: &mut dyn FnMut(DocId)) {
let mut doc = docset.doc();
while doc != TERMINATED {
callback(doc);
doc = docset.advance();
}
}

/// Calls `callback` with all of the `(doc, score)` for which score
/// is exceeding a given threshold.
///
Expand Down Expand Up @@ -78,6 +88,18 @@ pub trait Weight: Send + Sync + 'static {
Ok(())
}

/// Iterates through all of the document matched by the DocSet
/// `DocSet` and push the scored documents to the collector.
fn for_each_no_score(
&self,
reader: &SegmentReader,
callback: &mut dyn FnMut(DocId),
) -> crate::Result<()> {
let mut docset = self.scorer(reader, 1.0)?;
for_each_docset(docset.as_mut(), callback);
Ok(())
}

/// Calls `callback` with all of the `(doc, score)` for which score
/// is exceeding a given threshold.
///
Expand Down