Remote tables: filter without sort #463

Open
backkem opened this issue Oct 19, 2023 · 2 comments
Comments

@backkem

backkem commented Oct 19, 2023

I was wondering: does the datafusion_remote_tables filter push-down not support sorting? It seems that pushing down filters and limits without a sort order could lead to unexpected results.

I'd be happy to help address this if this is indeed the case.

@gruuya
Contributor

gruuya commented Oct 24, 2023

Hey @backkem, that's a good question.

We abide by the TableProvider API set out by DataFusion, whose scan method doesn't take the ORDER BY clause into account:

async fn scan(
    &self,
    _ctx: &SessionState,
    projection: Option<&Vec<usize>>,
    filters: &[Expr],
    limit: Option<usize>,
) -> Result<Arc<dyn ExecutionPlan>> {

Sorting itself is handled by DataFusion further down the data processing pipeline (i.e. once the data has been fetched), by a plan node that sits above the scan node in the plan tree.

While filtering and sorting commute in principle, the limit doesn't commute with sorting. DataFusion handles this by carefully deciding when to push the limit down into the scan (which is why it's an Option<usize>), though I forget where exactly that happens.
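To make the commutativity point concrete, here is a toy sketch using plain Rust vectors rather than any DataFusion API. The helpers filter_rows, sort_rows and limit_rows are hypothetical names made up for this illustration; they only stand in for the filter, sort and limit steps of a query like SELECT ... WHERE v > 1 ORDER BY v LIMIT 2.

```rust
/// Keep only the rows matching the predicate (stands in for a pushed-down filter).
fn filter_rows(rows: &[i64], pred: impl Fn(i64) -> bool) -> Vec<i64> {
    rows.iter().copied().filter(|&r| pred(r)).collect()
}

/// Return the rows in ascending order (stands in for the sort node above the scan).
fn sort_rows(rows: &[i64]) -> Vec<i64> {
    let mut sorted = rows.to_vec();
    sorted.sort();
    sorted
}

/// Keep the first `n` rows in their current order (stands in for a limit).
fn limit_rows(rows: &[i64], n: usize) -> Vec<i64> {
    rows.iter().copied().take(n).collect()
}

fn main() {
    // Rows in the arbitrary order a remote source might return them.
    let rows = vec![5, 1, 4, 2, 3];
    let greater_than_one = |r: i64| r > 1;

    // Filter and sort commute: either order yields the same rows.
    let filter_then_sort = sort_rows(&filter_rows(&rows, greater_than_one));
    let sort_then_filter = filter_rows(&sort_rows(&rows), greater_than_one);
    assert_eq!(filter_then_sort, sort_then_filter);

    // Limit and sort do not commute: applying the limit before the sort
    // keeps whichever rows happened to arrive first.
    let sort_then_limit = limit_rows(&sort_rows(&rows), 2);
    let limit_then_sort = sort_rows(&limit_rows(&rows, 2));
    assert_ne!(sort_then_limit, limit_then_sort);

    println!("sort then limit: {:?}", sort_then_limit);
    println!("limit then sort: {:?}", limit_then_sort);
}
```

In this sketch, sorting first and then limiting gives the two smallest values, while limiting first gives the first two rows in arrival order, which is the kind of unexpected result a limit pushed into the scan ahead of a sort would produce.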

@backkem
Author

backkem commented Oct 27, 2023

Thank you for the feedback. I'll try to find some time to look into the directions mentioned in apache/datafusion#7871.
