New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Resolves #1691: Lucene negative queries #1694
Conversation
AWS CodeBuild CI Report for Linux CentOS 7
|
AWS CodeBuild CI Report for Linux CentOS 7
|
} | ||
if (filterMask != null && filterMask.getUnsatisfiedFilters().isEmpty()) { | ||
filterMask.setSatisfied(true); | ||
} | ||
if (!negatedChildren.isEmpty()) { | ||
return new LuceneNotQuery(childClauses, negatedChildren); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If someone did something like lucene_clause_a OR NOT lucene_clause_b
, could this transformation lead to this being translated into something like (lucene_clause_a) -(lucene_clause_b)
(effectively turning the OR
to an AND
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not think so. LuceneNotQuery
always has occur = MUST
, so it is not possible for negatedChildren
to be added to in the SHOULD
case. Note how flattening only happens when occur matches.
clauses.addAll(notQuery.getNegatedChildren()); | ||
} | ||
return new LuceneBooleanQuery(clauses, BooleanClause.Occur.SHOULD); | ||
} else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a missing case here for if occur
is SHOULD
instead of MUST
or MUST_NOT
? I'm a little worried that if we're not careful, this could result in accidentally including (*:*)
in more queries than we'd like, but maybe that's a danger with negative queries more broadly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A boolean query only has SHOULD
or MUST
. Those are the two branches of the if
. NOT
is not represented as one with MUST_NOT
, but rather as MUST
with a set of negated children.
I think this can be made clearer here by using switch
. I did that.
More broadly, I tried a few approaches to reconcile the Boolean algebra trees that queries have with the Boolean queries with occurrence requirements that Lucene has. None of them were entirely satisfactory. One whose behavior was slightly more obvious was where the flattening and the addition of any *:*
happened at bind
time rather than plan time. In the end, though, I decided it was true to the intent of the API to have all such manipulation happen early and to have bind
only be about deferring the concrete representation until the concrete values were known and no other rewriting.
AWS CodeBuild CI Report for Linux CentOS 7
|
AWS CodeBuild CI Report for Linux CentOS 7
|
AWS CodeBuild CI Report for Linux CentOS 7
|
At root a new clause class that produces a negated boolean. But since Lucene represents that as another Occur, the clause actually implements set subtraction. The planner tries to maximize the cases where the subtraction is from a positive query rather than just *:*.
SonarCloud Quality Gate failed. |
AWS CodeBuild CI Report for Linux CentOS 7
|
AWS CodeBuild CI Report for Linux CentOS 7
|
At root a new clause class that produces a negated boolean.
But since Lucene represents that as another
Occur
, the clause actually implements set subtraction.The planner tries to maximize the cases where the subtraction is from a positive query rather than just
*:*
.Includes #1675 because of conflicts.