Error pruning IsNull expressions: Column 'instance_null_count' is declared as non-nullable but contains null values #3042

Closed
alamb opened this issue Aug 5, 2022 · 1 comment · Fixed by #3044
Labels: bug (Something isn't working)

alamb commented Aug 5, 2022

Describe the bug
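Pruning with an IsNull expression returns an error instead of a pruning result when the statistics provide no null counts: "Column 'i_null_count' is declared as non-nullable but contains null values".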

To Reproduce

    #[test]
    fn prune_int32_is_null() {
        let (schema, statistics) = int32_setup();

        // Expression "i IS NULL" when there are no null statistics,
        // should all be kept
        let expected_ret = vec![true, true, true, true, true];

        // i IS NULL
        let expr = col("i").is_null();
        let p = PruningPredicate::try_new(expr, schema.clone()).unwrap();
        let result = p.prune(&statistics).unwrap();
        assert_eq!(result, expected_ret);
    }

Actual behavior:

---- physical_optimizer::pruning::tests::prune_int32_is_null stdout ----
thread 'physical_optimizer::pruning::tests::prune_int32_is_null' panicked at 'called `Result::unwrap()` on an `Err` value: Plan("Invalid argument error: Column 'i_null_count' is declared as non-nullable but contains null values")', datafusion/core/src/physical_optimizer/pruning.rs:1776:43
stack backtrace:
   0: rust_begin_unwind
             at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/std/src/panicking.rs:584:5
   1: core::panicking::panic_fmt
             at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/panicking.rs:142:14
   2: core::result::unwrap_failed
             at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/result.rs:1785:5
   3: core::result::Result<T,E>::unwrap
             at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/result.rs:1078:23
   4: datafusion::physical_optimizer::pruning::tests::prune_int32_is_null
             at ./src/physical_optimizer/pruning.rs:1776:22
   5: datafusion::physical_optimizer::pruning::tests::prune_int32_is_null::{{closure}}
             at ./src/physical_optimizer/pruning.rs:1766:5
   6: core::ops::function::FnOnce::call_once
             at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/ops/function.rs:248:5
   7: core::ops::function::FnOnce::call_once
             at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/ops/function.rs:248:5
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

Expected behavior
The test should pass, and the Expr::IsNull predicate should be usable for pruning.

Additional context
We found this while working on IOx

alamb added the bug label Aug 5, 2022

alamb commented Aug 5, 2022

This appears to be a regression of the code introduced in #1595. It means predicate pruning will not happen if an expression contains an IsNull and the statistics don't provide a NullCount, so I am not sure how important it is in practice.

The error comes from apache/arrow-rs#1888 (in arrow 17.0.0), which started enforcing that the null declaration was correct; it was introduced by #2778.

It looks like there weren't any tests that found it.
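
To make the failure mode concrete, here is a minimal stand-alone sketch of the kind of check described above. It is a model only and does not use arrow's actual API: the `Field` struct, `validate_column`, and `null_count_field` are hypothetical names made up for illustration, and the only thing taken from the report is the error message text. The idea is that a statistics column with no null counts is all nulls, so declaring it non-nullable trips the validation.

```rust
/// Hypothetical stand-in for a schema field: a name plus a nullability flag.
/// (Illustrative only; this is not arrow's `Field` type.)
#[derive(Debug, Clone, PartialEq)]
pub struct Field {
    pub name: String,
    pub nullable: bool,
}

/// Hypothetical validation step: reject a column that contains nulls
/// when its field is declared non-nullable. The message mirrors the
/// error text quoted in this issue.
pub fn validate_column(field: &Field, values: &[Option<i64>]) -> Result<(), String> {
    if !field.nullable && values.iter().any(|v| v.is_none()) {
        return Err(format!(
            "Column '{}' is declared as non-nullable but contains null values",
            field.name
        ));
    }
    Ok(())
}

/// Hypothetical helper: build the field describing the null-count
/// statistics column for `column`, e.g. "i" -> "i_null_count".
pub fn null_count_field(column: &str, nullable: bool) -> Field {
    Field {
        name: format!("{}_null_count", column),
        nullable,
    }
}
```

In this model, five containers with no null-count statistics become a column of five `None` values. Validating that column against `null_count_field("i", false)` yields the error from the report, while the same column validates fine against `null_count_field("i", true)`. Whether the actual fix takes that route is not something this sketch claims.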
