[C++] Introducing Row format for faster sort? #35778

Closed
power1628 opened this issue May 26, 2023 · 3 comments

@power1628

Describe the enhancement requested

arrow-rs has a row representation of a table, and converting to that row format is reported to make sorting about 3x faster [1][2]. Is there any plan to apply this optimization to arrow-cpp?

[1] https://arrow.apache.org/blog/2022/11/07/multi-column-sorts-in-arrow-rust-part-1/
[2] apache/arrow-rs#2929
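
To illustrate why a row format can help a multi-column sort, here is a minimal sketch. It is not the arrow-rs encoding and not an existing arrow-cpp API: the helper names (`EncodeRow`, `SortIndicesRowFormat`, `SortIndicesColumnar`) and the byte layout are assumptions made up for this example. Each row of an (int32, string) table is encoded as one byte string whose lexicographic order matches the column-by-column order, so the sort comparator becomes a single byte comparison instead of a branch per column.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical layout for this sketch only:
//   bytes 0..3 : the int32 with its sign bit flipped, stored big-endian,
//                so unsigned byte order equals signed integer order
//   bytes 4..  : the raw string bytes (safe only because the string is the
//                last column; a variable-length column in the middle would
//                need a terminator or escaping scheme)
std::string EncodeRow(int32_t id, const std::string& name) {
  const uint32_t u = static_cast<uint32_t>(id) ^ 0x80000000u;
  std::string key;
  key.reserve(4 + name.size());
  for (int shift = 24; shift >= 0; shift -= 8) {
    key.push_back(static_cast<char>((u >> shift) & 0xFFu));
  }
  key.append(name);
  return key;
}

// Sort by (id, name) using one byte-wise comparison per pair of rows.
std::vector<std::size_t> SortIndicesRowFormat(
    const std::vector<int32_t>& ids, const std::vector<std::string>& names) {
  assert(ids.size() == names.size());
  std::vector<std::string> keys;
  keys.reserve(ids.size());
  for (std::size_t i = 0; i < ids.size(); ++i) {
    keys.push_back(EncodeRow(ids[i], names[i]));
  }
  std::vector<std::size_t> indices(ids.size());
  std::iota(indices.begin(), indices.end(), std::size_t{0});
  // std::string compares its bytes as unsigned char, like memcmp.
  std::stable_sort(indices.begin(), indices.end(),
                   [&](std::size_t a, std::size_t b) { return keys[a] < keys[b]; });
  return indices;
}

// Baseline: the same sort with a column-by-column comparator.
std::vector<std::size_t> SortIndicesColumnar(
    const std::vector<int32_t>& ids, const std::vector<std::string>& names) {
  assert(ids.size() == names.size());
  std::vector<std::size_t> indices(ids.size());
  std::iota(indices.begin(), indices.end(), std::size_t{0});
  std::stable_sort(indices.begin(), indices.end(),
                   [&](std::size_t a, std::size_t b) {
                     if (ids[a] != ids[b]) return ids[a] < ids[b];
                     return names[a] < names[b];
                   });
  return indices;
}
```

Both functions return the same permutation. Whether the row version is actually faster depends on the conversion cost, the number of key columns, and the data, so it would need benchmarking before drawing conclusions.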

Component(s)

C++

@raulcd changed the title from "Introducing Row format for faster sort?" to "[C++] Introducing Row format for faster sort?" on May 26, 2023
@mapleFU
Member

mapleFU commented May 26, 2023

Personally I'm +1 on this. DuckDB has also introduced a row format for its sort: https://duckdb.org/2021/08/27/external-sorting.html

@westonpace Would you mind taking a look?

@power1628
Author

It seems Arrow already has a row representation [1], so I'm closing this issue for now.

[1] https://github.com/apache/arrow/blob/1951a1ae69590ad58d97f6be929fa14485f81f42/cpp/src/arrow/compute/row/row_internal.h

@westonpace
Member

westonpace commented Jun 8, 2023

Yes, row_internal.h contains a row representation that can be used if needed. It is what the hash join uses. However, it is not used by any sort algorithm. It's not clear the cost would be worth it for an in-memory sort, and we do not have an external/spilling sort.
