I'm not sure how our sort performance compares with DuckDB and arrow-rs, but arrow-rs appears to get a big speedup over its old implementation:
Using this format for lexicographic sorting is more than 3x faster than the comparator based approach, with the benefits especially pronounced for strings, dictionaries and sorts with large numbers of columns.
We have also already used it to more than double the performance of sort preserving merge in the DataFusion project, and expect similar or greater performance uplift as we apply it to sort, grouping, join, and window function operators as well.
Problem description
Faster multi-column sorting by sorting encoded forms of each column.
https://arrow.apache.org/blog/2022/11/07/multi-column-sorts-in-arrow-rust-part-2/
https://arrow.apache.org/blog/2022/11/07/multi-column-sorts-in-arrow-rust-part-1/
apache/arrow-rs#2929
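To make the idea in the problem description concrete, here is a minimal Python sketch of sorting by an encoded row key. It is an illustration under stated assumptions, not the arrow-rs row format: the two-column schema (a nullable int64 and a nullable string), the helper names `encode_int64`, `encode_str`, `encode_row`, and `sort_rows`, the nulls-first ordering, and the NUL-terminated string encoding are all hypothetical choices made for this example. The point it shows is that once every column is encoded so that plain byte comparison matches the column's sort order, a multi-column sort reduces to comparing one byte string per row.

```python
import struct
from typing import List, Optional, Tuple

# Hypothetical two-column row: (nullable int64, nullable string).
Row = Tuple[Optional[int], Optional[str]]


def encode_int64(value: Optional[int]) -> bytes:
    """Encode a nullable int64 so that byte order matches numeric order."""
    if value is None:
        # Marker 0x00 sorts nulls first; pad so the column stays fixed width.
        return b"\x00" + b"\x00" * 8
    # Adding 2**63 is equivalent to flipping the sign bit of the
    # two's-complement value, so negatives sort before positives when the
    # big-endian bytes are compared as unsigned.
    return b"\x01" + struct.pack(">Q", value + (1 << 63))


def encode_str(value: Optional[str]) -> bytes:
    """Encode a nullable string so that byte order matches string order."""
    if value is None:
        return b"\x00"
    data = value.encode("utf-8")
    if b"\x00" in data:
        # Simplifying assumption of this sketch: a single NUL terminator is
        # only order-preserving if the payload itself contains no NUL bytes.
        raise ValueError("this sketch assumes strings without NUL bytes")
    # The terminator makes a shorter string sort before any of its extensions.
    return b"\x01" + data + b"\x00"


def encode_row(row: Row) -> bytes:
    """Concatenate the per-column encodings into one comparable key."""
    number, text = row
    return encode_int64(number) + encode_str(text)


def sort_rows(rows: List[Row]) -> List[Row]:
    """Lexicographic multi-column sort using only byte-string comparison."""
    return sorted(rows, key=encode_row)
```

For example, `sort_rows([(2, "b"), (-1, "z"), (2, "a")])` orders the rows by the integer column first and breaks ties on the string column, without a per-column comparator. A real implementation would also need to handle descending order, arbitrary byte payloads, and more data types.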