New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
support SET
variable
#4069
support SET
variable
#4069
Conversation
LogicalPlan::SetVariable(SetVariable { | ||
variable, value, .. | ||
}) => { | ||
let config_options = &self.state.write().config.config_options; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
better refactor this part in the follow on pr once #3909 merged
let old_value = | ||
config_options.read().get(&variable).ok_or_else(|| { | ||
DataFusionError::Execution(format!( | ||
"Unknown Variable {}", | ||
variable | ||
)) | ||
})?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return error if variable isn't in config_options
match old_value { | ||
ScalarValue::Boolean(_) => { | ||
let new_value = value.parse::<bool>().map_err(|_| { | ||
DataFusionError::Execution(format!( | ||
"Failed to parse {} as bool", | ||
value, | ||
)) | ||
})?; | ||
config_options.write().set_bool(&variable, new_value); | ||
} | ||
|
||
ScalarValue::UInt64(_) => { | ||
let new_value = value.parse::<u64>().map_err(|_| { | ||
DataFusionError::Execution(format!( | ||
"Failed to parse {} as u64", | ||
value, | ||
)) | ||
})?; | ||
config_options.write().set_u64(&variable, new_value); | ||
} | ||
|
||
ScalarValue::Utf8(_) => { | ||
let new_value = value.parse::<String>().map_err(|_| { | ||
DataFusionError::Execution(format!( | ||
"Failed to parse {} as String", | ||
value, | ||
)) | ||
})?; | ||
config_options.write().set_string(&variable, new_value); | ||
} | ||
|
||
_ => { | ||
return Err(DataFusionError::Execution( | ||
"Unsupported Scalar Value Type".to_string(), | ||
)) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure whether it's a good way to check the data type. ideally we should use DataType defined in ConfigDefinition
https://github.com/apache/arrow-datafusion/blob/97f2e4fd5517c762b0862d22b81f957db511e22e/datafusion/core/src/config.rs#L72-L80
but this information is gone once it's been converted to ConfigOptions
https://github.com/apache/arrow-datafusion/blob/97f2e4fd5517c762b0862d22b81f957db511e22e/datafusion/core/src/config.rs#L279-L281
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think checking the existing type is a reasonable approach
#[tokio::test] | ||
async fn set_time_zone() { | ||
// we don't support changing time zone for now until all time zone issues fixed and related function completed | ||
|
||
let ctx = SessionContext::new(); | ||
|
||
// for full variable name | ||
let err = plan_and_collect(&ctx, "set datafusion.execution.time_zone = '8'") | ||
.await | ||
.unwrap_err(); | ||
|
||
assert_eq!( | ||
err.to_string(), | ||
"Error during planning: Changing Time Zone isn't supported yet" | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should get back to this once time zone is fully integrated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
if local { | ||
return Err(DataFusionError::NotImplemented( | ||
"LOCAL is not supported".to_string(), | ||
)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't support SET LOCAL VARIABLE TO VALUE
if hivevar { | ||
return Err(DataFusionError::NotImplemented( | ||
"HIVEVAR is not supported".to_string(), | ||
)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't support HIVEVAR. I'm not sure what it is
if variable_lower == "timezone" || variable_lower == "time.zone" { | ||
// we could introduce alias in OptionDefinition if this string matching thing grows | ||
variable_lower = "datafusion.execution.time_zone".to_string(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
alias timezone
and time zone
note that time zone
is converted to time.zone
during sql parsing
|
||
// parse value string from Expr | ||
let value_string = match &value[0] { | ||
SQLExpr::Identifier(i) => i.to_string(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to support non-whitespace-separated string
Value::SingleQuotedString(s) => s.to_string(), | ||
Value::Number(_, _) | Value::Boolean(_) => v.to_string(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to support SingleQuotedString
, number
and Bool
UnaryOperator::Plus => format!("+{}", expr), | ||
UnaryOperator::Minus => format!("-{}", expr), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to support signed number e.g. +8, -8
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An alternative (for the future) could be to parse the value as an Expr
and then call ExprSimplifier::simplify(expr)
on it during evaluation in SessionContext
.
That would then support sql statements like
set batch_size = 8*1024
I think this way is fine, too
Example of how to do it
https://github.com/apache/arrow-datafusion/blob/10e64dc013ba210ab1f6c2a3c02c66aef4a0e802/datafusion-examples/examples/expr_api.rs#L77-L89
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i'm not familiar with this yet, will try it and send a ticket or pr
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is beautiful @waitingkuo -- thank you 🏆
match old_value { | ||
ScalarValue::Boolean(_) => { | ||
let new_value = value.parse::<bool>().map_err(|_| { | ||
DataFusionError::Execution(format!( | ||
"Failed to parse {} as bool", | ||
value, | ||
)) | ||
})?; | ||
config_options.write().set_bool(&variable, new_value); | ||
} | ||
|
||
ScalarValue::UInt64(_) => { | ||
let new_value = value.parse::<u64>().map_err(|_| { | ||
DataFusionError::Execution(format!( | ||
"Failed to parse {} as u64", | ||
value, | ||
)) | ||
})?; | ||
config_options.write().set_u64(&variable, new_value); | ||
} | ||
|
||
ScalarValue::Utf8(_) => { | ||
let new_value = value.parse::<String>().map_err(|_| { | ||
DataFusionError::Execution(format!( | ||
"Failed to parse {} as String", | ||
value, | ||
)) | ||
})?; | ||
config_options.write().set_string(&variable, new_value); | ||
} | ||
|
||
_ => { | ||
return Err(DataFusionError::Execution( | ||
"Unsupported Scalar Value Type".to_string(), | ||
)) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think checking the existing type is a reasonable approach
#[tokio::test] | ||
async fn set_time_zone() { | ||
// we don't support changing time zone for now until all time zone issues fixed and related function completed | ||
|
||
let ctx = SessionContext::new(); | ||
|
||
// for full variable name | ||
let err = plan_and_collect(&ctx, "set datafusion.execution.time_zone = '8'") | ||
.await | ||
.unwrap_err(); | ||
|
||
assert_eq!( | ||
err.to_string(), | ||
"Error during planning: Changing Time Zone isn't supported yet" | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
UnaryOperator::Plus => format!("+{}", expr), | ||
UnaryOperator::Minus => format!("-{}", expr), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An alternative (for the future) could be to parse the value as an Expr
and then call ExprSimplifier::simplify(expr)
on it during evaluation in SessionContext
.
That would then support sql statements like
set batch_size = 8*1024
I think this way is fine, too
Example of how to do it
https://github.com/apache/arrow-datafusion/blob/10e64dc013ba210ab1f6c2a3c02c66aef4a0e802/datafusion-examples/examples/expr_api.rs#L77-L89
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
@alamb thank you for the review, learned a lot from it. |
Benchmark runs are scheduled for baseline = 761e167 and contender = c9442ce. c9442ce is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
* support SET * remove useless comment * add test cases * Update datafusion/core/src/config.rs Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> * Update datafusion/core/src/execution/context.rs Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> * fix test cases * fmt * Update datafusion/expr/src/logical_plan/plan.rs Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org> Co-authored-by: Andrew Lamb <andrew@nerdnetworks.org>
Which issue does this PR close?
Closes #4067
Rationale for this change
to support
What changes are included in this PR?
support u64 value
support bool
support single quoted string
support alias
TIME ZONE
andTIMEZONE
as discussed in #3148 (comment)
we disallow
set timezone
for now until timezone integration is full completedthrow error for unknown variable
throw error for incorrect data type
Are there any user-facing changes?