Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AtomicCell: Use atomic-maybe-uninit #1015

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open

Conversation

taiki-e
Copy link
Member

@taiki-e taiki-e commented Aug 13, 2023

(Depend on taiki-e/atomic-maybe-uninit#16.)

This fixes UB when lock-free and the value has uninitialized bytes, by using atomic-maybe-uninit crate.
Fixes #315
Fixes #748

Also, this makes 128-bit atomics lock-free on some platforms.
Closes crossbeam-rs/rfcs#13

Currently, this patch always uses fallback (seqlock) on environments that do not support inline assembly, such as Miri.
It is possible to use atomic-maybe-uninit crate to fix UB of concurrent access with volatile read/write in seqlock/deque (#744, #859), but that is not included in this PR and is left to subsequent PRs.


cc @RalfJung

@taiki-e taiki-e force-pushed the atomic-maybe-uninit branch 6 times, most recently from 16bd406 to dc12e29 Compare August 13, 2023 22:22
@taiki-e
Copy link
Member Author

taiki-e commented Aug 14, 2023

Currently, it is not possible to use MaybeUninit in the input registers of an inline assembly (i.e., when processing MaybeUninit, values must be passed through memory), so this worsens performance.

Before:

test concurrent_load_u8     ... bench:     600,658 ns/iter (+/- 11,986)
test concurrent_load_usize  ... bench:     601,385 ns/iter (+/- 11,719)
test concurrent_store_u8     ... bench:   1,358,933 ns/iter (+/- 192,961)
test concurrent_store_usize  ... bench:   1,356,045 ns/iter (+/- 20,230)

After (pass by memory):

test concurrent_load_u8     ... bench:     868,554 ns/iter (+/- 150,633)
test concurrent_load_usize  ... bench:     851,380 ns/iter (+/- 7,139)
test concurrent_store_u8     ... bench:   2,029,710 ns/iter (+/- 178,712)
test concurrent_store_usize  ... bench:   2,038,341 ns/iter (+/- 196,265)

If I change the atomic-maybe-uninit implementation to pass by value using transmute, performance will be equivalent to the current one, but it is UB when the input value contains uninitialized bytes (because it is transmuted to integer).

After (pass by value):

test concurrent_load_u8     ... bench:     601,504 ns/iter (+/- 12,824)
test concurrent_load_usize  ... bench:     600,052 ns/iter (+/- 15,885)
test concurrent_store_u8     ... bench:   1,357,697 ns/iter (+/- 211,465)
test concurrent_store_usize  ... bench:   1,357,504 ns/iter (+/- 252,722)

It would be nice if the inline assembly could allow MaybeUninit registers as thomcc had previously proposed.

@taiki-e
Copy link
Member Author

taiki-e commented Aug 14, 2023

Filed rust-lang/rust#114790 to allow MaybeUninit in input and output of inline assembly.

@taiki-e taiki-e force-pushed the atomic-maybe-uninit branch 6 times, most recently from df06c0c to f6c99b8 Compare August 15, 2023 11:01
bors added a commit to rust-lang-ci/rust that referenced this pull request Aug 23, 2023
Allow MaybeUninit in input and output of inline assembly

**Motivation:**

As part of the work to remove UBs from crossbeam's AtomicCell, I'm writing a library to implement atomic operations on MaybeUnint using inline assembly ([atomic-maybe-uninit](https://github.com/taiki-e/atomic-maybe-uninit), crossbeam-rs/crossbeam#1015).

However, currently, MaybeUnint cannot be used in input&output of inline assembly, so when processing MaybeUninit, values must be [passed through memory](https://github.com/taiki-e/atomic-maybe-uninit/blob/main/src/arch/aarch64.rs#L121-L122). It is inefficient and microbenchmarks have [actually shown significant performance degradation](crossbeam-rs/crossbeam#1015 (comment)).

It would be nice if we could allow MaybeUninit in input and output of inline assembly.

---

This PR changed the type check in rustc_hir_analysis to allow `MaybeUnint<int | float | ptr | fn ptr | simd vector>` in input and output of inline assembly and added a simple test.

To be honest, I'm not sure that this is the correct way to do it, because this is like doing transmute to integers/floats/etc from MaybeUninit on the compiler side. EDIT: [this seems fine](https://rust-lang.zulipchat.com/#narrow/stream/216763-project-inline-asm/topic/MaybeUninit.20in.20asm!/near/384662900)

r? `@Amanieu`
cc `@thomcc` (because you [had previously proposed this](https://rust-lang.zulipchat.com/#narrow/stream/216763-project-inline-asm/topic/MaybeUninit.20in.20asm!))
@taiki-e taiki-e marked this pull request as ready for review October 1, 2023 18:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
2 participants