Another internal error when parquet predicate pushdown is enabled: "Error evaluating filter predicate" #4046

Closed
Tracked by #3463 ...
alamb opened this issue Oct 31, 2022 · 0 comments · Fixed by #4048
Labels
bug Something isn't working

alamb commented Oct 31, 2022

Describe the bug
DataFusion generates an error for some predicates when predicate pushdown is enabled.

NOTE: This is the same symptom as reported in #4006, but with a different predicate.

NOTE: Pushdown filtering is not enabled by default (as we are still working on it), so this issue is unlikely to affect users.

To Reproduce

  1. Download data from repro.zip
  2. Run datafusion CLI

The query run is

select count(*) from foo where request_method != 'GET' OR response_status = 400 OR service = 'backend';

I tested this using master at 35f786b, which includes the fix for #4006 in 5cf090a:

$ git status
Your branch is up to date with 'apache/master'.

nothing to commit, working tree clean
$ git rev-parse HEAD
5cf090a13391501c0ce7707ac7a1e50e18517b79

Expected behavior
The same answer should be produced with and without row filtering enabled. However, with row filtering enabled, an error is produced.

Without row filtering:

datafusion-cli -f script.sql
+-----------------+
| COUNT(UInt8(1)) |
+-----------------+
| 53819           |
+-----------------+
1 row in set. Query took 0.006 seconds.

With it enabled:

DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS=true datafusion-cli -f script.sql
...
1 row in set. Query took 0.021 seconds.
ArrowError(ExternalError(Execution("Arrow error: External error: Arrow: underlying Arrow error: Compute error: Error evaluating filter predicate: Internal(\"Cannot evaluate binary expression NotEq with types UInt16 and Utf8\")")))
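
The comparison above can be scripted. The following is a minimal sketch, not part of the original report: it assumes `datafusion-cli` is on the `PATH` and that `script.sql` is in the current directory. The `CLI` variable and the `run_script` / `compare_pushdown` helpers are hypothetical names introduced only for illustration. The only CLI invocation and environment variable used are the ones shown above.

```shell
#!/bin/sh
# Sketch only: compare query output with and without parquet filter pushdown.
# CLI, run_script and compare_pushdown are hypothetical helper names.
CLI="${CLI:-datafusion-cli}"

# Run the script and drop the timing line, which differs between runs.
run_script() {
  "$CLI" -f "$1" 2>&1 | grep -v 'Query took'
}

# Print MATCH and return 0 if both modes give the same output;
# otherwise print MISMATCH plus both outputs and return 1.
compare_pushdown() {
  local script="$1" baseline pushdown
  # Each $(...) runs in a subshell, so unset/export do not leak to the caller.
  baseline="$(unset DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS; run_script "$script")"
  pushdown="$(export DATAFUSION_EXECUTION_PARQUET_PUSHDOWN_FILTERS=true; run_script "$script")"
  if [ "$baseline" = "$pushdown" ]; then
    echo "MATCH"
  else
    echo "MISMATCH"
    printf 'without pushdown:\n%s\n' "$baseline"
    printf 'with pushdown:\n%s\n' "$pushdown"
    return 1
  fi
}
```

Calling `compare_pushdown script.sql` on the affected commit would be expected to report a mismatch, since the pushdown run prints the error shown above after the result.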

Additional context
Found by the test in #3976.

alamb added the bug label Oct 31, 2022
tustvold added a commit to tustvold/arrow-datafusion that referenced this issue Oct 31, 2022
alamb pushed a commit that referenced this issue Nov 1, 2022
* Fix multicolumn parquet predicate pushdown (#4046)

* Format
Dandandan pushed a commit to yuuch/arrow-datafusion that referenced this issue Nov 5, 2022
* Fix multicolumn parquet predicate pushdown (apache#4046)

* Format