Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IPC code writes data with insufficient alignment #5553

Closed
hzuo opened this issue Mar 25, 2024 · 2 comments · Fixed by #5554
Closed

IPC code writes data with insufficient alignment #5553

hzuo opened this issue Mar 25, 2024 · 2 comments · Fixed by #5554
Labels
arrow Changes to the arrow crate arrow-flight Changes to the arrow-flight crate bug

Comments

@hzuo
Copy link
Contributor

hzuo commented Mar 25, 2024

Describe the bug

FileWriter and StreamWriter should ensure that the data is written with appropriate alignment such that arrays can be used without copying to a more-aligned buffer.

In particular, as of Rust 1.77.0 and LLVM 18, i128 now has a 16-byte alignment requirement even on x86 (ARM always had this requirement), i.e. std::mem::align_of::<i128> == 16. So Decimal128Arrays must be aligned to a 16-byte boundary when serialized into an IPC buffer. The pad_to_8 used everywhere in the IPC code causes it to pad insufficiently.

This prevents readers of the IPC data generated by this crate from doing true zero-copy reads (e.g. mmapping) since the data may be insufficiently aligned.

On some platforms, SIMD may also be significantly slower if the beginning of the IPC block isn't aligned to a 16-, 32-, or 64- byte boundary (as discussed in the Arrow spec document).

To Reproduce

See the test test_decimal128_alignment8_is_unaligned in PR #5554 - the fact that this test throws an error shows that alignment is not currently respected.

Expected behavior

See the test test_decimal128_alignment16 in PR #5554 - increasing alignment should allow us to do "true" zero-copy reads.

Additional context

IpcWriteOptions already has an "alignment" field but it is not being respected throughout the IPC code.

Related PRs and issues:

@tustvold
Copy link
Contributor

label_issue.py automatically added labels {'arrow'} from #5554

@tustvold tustvold added the arrow-flight Changes to the arrow-flight crate label Apr 17, 2024
@tustvold
Copy link
Contributor

label_issue.py automatically added labels {'arrow-flight'} from #5554

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrow Changes to the arrow crate arrow-flight Changes to the arrow-flight crate bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants