Add Decimal128 to DataType::is_numeric and clean the numeric casting code #2621
Signed-off-by: remzi <13716567376yh@gmail.com>
I'm not totally sure about this, as I'm not entirely sure how these methods are being used. But Decimal is distinct from Float32, Int16, etc. in that it isn't a PrimitiveArray. I'm not sure whether this is an issue.
Unfortunately, I don't find a definition of which types are numeric in the Arrow format: https://github.com/apache/arrow/blob/master/format/Schema.fbs We have a trait In datafusion, the definition of This PR is trying to provide a consistent definition (whatever the definition is) of |
In the context of DataFusion, what is it used to determine?
The only current use case is numerical coercion (https://github.com/apache/arrow-datafusion/blob/master/datafusion/expr/src/binary_rule.rs#L570 and https://github.com/apache/arrow-datafusion/blob/master/datafusion/expr/src/binary_rule.rs#L187).
DataFusion and arrow-rs are two different systems. Some cases or operations are not supported in DataFusion right now, but those operations may be supported in arrow-rs. We only need to consider arrow-rs here and make the definition consistent within arrow-rs.
Thank you, @liukun4515. At least, I think we should rename the |
FWIW, I think this PR is correct and it would be nice to get it into the next release. DataFusion should depend on these functions and remove its own similarly named functions. I'd be happy to file a follow-on issue for DF to clean that up. In DF, these functions are only used for determining type-to-type coercion rules, but it then uses arrow-rs's cast functions to do the actual conversion. Shouldn't the coercion rules also be in arrow-rs, which also manages the types, the type implementations, and the casting? There's a can_cast_types(), which is either what we need already or close to a similar function like can_cast_types_without_loss().
My major reservation with this as it stands is that, at least currently, the decimal support in arrow-rs is extremely limited. Once decimals support all the operations that the other "numerics" do, I would be happier with this change. Otherwise, I think it is confusing to label decimals as numeric types when they aren't properly supported.
Whatever happened with this PR? Is there consensus on next steps?
Converting to draft until we decide on next steps.
Closed by #3121
Which issue does this PR close?
Closes #2611.
Rationale for this change
We try to match the is_numeric function in DataFusion (i.e., by adding Float16 and Decimal128). However, because Float16 is not supported by some upstream libraries (such as lexical_core::FromLexical), we can't add it now. This is not a big deal because Float16 is a minor use case. Decimal128 could be added successfully.
What changes are included in this PR?
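For illustration only, here is a minimal sketch of the kind of check described in the rationale. It uses a simplified stand-in enum, `SketchDataType`, which is hypothetical and is not the real arrow-rs `DataType`. The `(u8, i8)` payload for precision and scale is also an assumption made for this sketch. It shows Decimal128 being counted as numeric while Float16 is left out, as the rationale describes.

```rust
/// Hypothetical, simplified stand-in for arrow-rs's `DataType`.
/// The real enum has many more variants; this is only a sketch.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
pub enum SketchDataType {
    Boolean,
    Int8,
    Int16,
    Int32,
    Int64,
    UInt8,
    UInt16,
    UInt32,
    UInt64,
    Float16,
    Float32,
    Float64,
    /// Assumed payload for this sketch: (precision, scale).
    Decimal128(u8, i8),
    Utf8,
}

impl SketchDataType {
    /// Returns true for the types this sketch treats as numeric.
    /// Decimal128 is included; Float16 is deliberately excluded,
    /// mirroring the rationale in the PR description.
    pub fn is_numeric(&self) -> bool {
        use SketchDataType::*;
        matches!(
            self,
            Int8 | Int16
                | Int32
                | Int64
                | UInt8
                | UInt16
                | UInt32
                | UInt64
                | Float32
                | Float64
                | Decimal128(_, _)
        )
    }
}
```

With this sketch, `SketchDataType::Decimal128(10, 2).is_numeric()` is true, while `SketchDataType::Float16.is_numeric()` and `SketchDataType::Utf8.is_numeric()` are false.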
Are there any user-facing changes?