Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix UB in RawStorageMut::swap_unchecked_linear #1317

Merged
merged 6 commits into from
Mar 28, 2024

Conversation

yotamofek
Copy link
Contributor

According to Miri, RawStorageMut::swap_unchecked_linear introduces a stacked borrows violation:

➜  nalgebra git:(dev) ✗ cargo miri test -- edition::swap
Preparing a sysroot for Miri (target: x86_64-unknown-linux-gnu)... done
   Compiling nalgebra-macros v0.2.1 (/home/yotam/nalgebra/nalgebra-macros)
   Compiling nalgebra v0.32.3 (/home/yotam/nalgebra)

     Running tests/lib.rs (target/miri/x86_64-unknown-linux-gnu/debug/deps/lib-85cd9506b7860f9a)

running 2 tests
test core::edition::swap_columns ... error: Undefined Behavior: attempting a read access using <268833> at alloc82401[0x18], but that tag does not exist in the borrow stack for this location
    --> /home/yotam/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:878:9
     |
878  |         copy_nonoverlapping(x, tmp.as_mut_ptr(), 1);
     |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     |         |
     |         attempting a read access using <268833> at alloc82401[0x18], but that tag does not exist in the borrow stack for this location
     |         this error occurs as part of an access at alloc82401[0x18..0x20]
     |
     = help: this indicates a potential bug in the program: it performed an invalid operation, but the Stacked Borrows rules it violated are still experimental
     = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
help: <268833> was created by a SharedReadWrite retag at offsets [0x0..0x78]
    --> /home/yotam/nalgebra/src/base/array_storage.rs:141:9
     |
141  |         self.0.as_mut_ptr() as *mut T
     |         ^^^^^^^^^^^^^^^^^^^
help: <268833> was later invalidated at offsets [0x0..0x78] by a Unique function-entry retag inside this call
    --> /home/yotam/nalgebra/src/base/storage.rs:200:17
     |
200  |         let b = self.get_address_unchecked_linear_mut(i2);
     |                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     = note: BACKTRACE (of the first span):
     = note: inside `std::ptr::swap::<f64>` at /home/yotam/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:878:9: 878:52
note: inside `<na::ArrayStorage<f64, 3, 5> as na::RawStorageMut<f64, na::Const<3>, na::Const<5>>>::swap_unchecked_linear`
    --> /home/yotam/nalgebra/src/base/storage.rs:202:9
     |
202  |         ptr::swap(a, b);
     |         ^^^^^^^^^^^^^^^
note: inside `<na::ArrayStorage<f64, 3, 5> as na::RawStorageMut<f64, na::Const<3>, na::Const<5>>>::swap_unchecked`
    --> /home/yotam/nalgebra/src/base/storage.rs:214:9
     |
214  |         self.swap_unchecked_linear(lid1, lid2)
     |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
note: inside `na::Matrix::<f64, na::Const<3>, na::Const<5>, na::ArrayStorage<f64, 3, 5>>::swap_unchecked`
    --> /home/yotam/nalgebra/src/base/matrix.rs:1201:9
     |
1201 |         self.data.swap_unchecked(row_cols1, row_cols2)
     |         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
note: inside `na::base::edition::<impl na::Matrix<f64, na::Const<3>, na::Const<5>, na::ArrayStorage<f64, 3, 5>>>::swap_columns`
    --> /home/yotam/nalgebra/src/base/edition.rs:331:26
     |
331  |                 unsafe { self.swap_unchecked((i, icol1), (i, icol2)) }
     |                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
note: inside `core::edition::swap_columns`
    --> tests/core/edition.rs:208:5
     |
208  |     m.swap_columns(1, 3);
     |     ^^^^^^^^^^^^^^^^^^^^
note: inside closure
    --> tests/core/edition.rs:197:18
     |
195  | #[test]
     | ------- in this procedural macro expansion
196  | #[rustfmt::skip]
197  | fn swap_columns() {
     |                  ^
     = note: this error originates in the attribute macro `test` (in Nightly builds, run with -Z macro-backtrace for more info)

note: some details are omitted, run with `MIRIFLAGS=-Zmiri-backtrace=full` for a verbose backtrace

error: aborting due to previous error

error: test failed, to rerun pass `--test lib`

Caused by:
  process didn't exit successfully: `/home/yotam/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/bin/cargo-miri runner /home/yotam/nalgebra/target/miri/x86_64-unknown-linux-gnu/debug/deps/lib-85cd9506b7860f9a 'edition::swap'` (exit status: 1)
note: test exited abnormally; to see the full output pass --nocapture to the harness.

My fix is possibly over-conservative, but I couldn't think of anything else that wouldn't be a breaking change. Granted, for a user to be affected by the change would require that they implemented the unsafe RawStorageMut trait on their own type, and overrode the default get_address_unchecked_linear_mut in a weird way. Still, I think it's better to err on the cautious side.

(BTW, a lot of the tests in edition.rs are currently crashing under miri due to UB, I'm investigating how to fix the rest of the issues but they seem less straight-forward than this one)

@yotamofek yotamofek force-pushed the swap-unchecked-ub branch 2 times, most recently from 13d962f to cb14390 Compare November 14, 2023 18:53
Copy link
Collaborator

@tpdickso tpdickso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Comment on lines 199 to 231
// we can't just use the pointers returned from `get_address_unchecked_linear_mut` because calling a
// method taking self mutably invalidates any existing (mutable) pointers. since `get_address_unchecked_linear_mut` can
// also be overriden by a custom implementation, we can't just use `wrapping_add` assuming that's what the method does.
// instead, we use `offset_from` to compute the re-calculate the pointers from the base pointer.
// this is safe as long as this trait is implemented safely
// (and it's the caller's responsibility to ensure the indices are in-bounds).
let base = self.ptr_mut();
let offset1 = self.get_address_unchecked_linear_mut(i1).offset_from(base);
let offset2 = self.get_address_unchecked_linear_mut(i2).offset_from(base);

let base = self.ptr_mut();
let a = base.offset(offset1);
let b = base.offset(offset2);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically, I don't think this can be guaranteed to work for any possible custom get_address_unchecked_linear_mut implementation, either. Imagine a storage that rearranges its elements in memory every time ptr_mut is invoked -- the offsets computed in offset1 and offset2 are no longer valid by the time base is computed the second time, and a and b are now pointing to the wrong elements. This would be a ridiculous way to implement your storage but it means that this is still not a sound way to implement this method by default for any arbitrary storage, it's just one that MIRI doesn't realize to complain about.

But I think this is still fine; my reasoning is, if the previous implementation didn't contain valid Rust semantics before, then we can't really have a "breaking change" on the internals by making it work for some-but-not-all cases. It's a "breaking change" at the API level that we might now have to document the safety preconditions for this to be a valid implementation of the method for a custom user-defined type -- but that should have been there before as well (it turns out the correct set of preconditions was previously ⊥.)

Now since the set of implementations for which this code will generate machine code that happens to work for an arbitrary use-case is not a subset of those which previously happened to generate working machine code, we may still wish to include this in a version bump, rather than a bugfix. But I think a minor version bump would probably be fine? Just realistically I suspect you'd have to work really hard to produce an implementation of RawStorageMut that worked with the previous default implementation but not the current one...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I completely agree with your reasoning.
The only way this would break is if someone really went out of their way to implement a cursed custom storage, and I don't think many people are implementing custom storages at all.
Committed your suggestion below.

Thanks for the review and sorry for the delay in getting to it - must've missed the GH notification 😅.

src/base/storage.rs Show resolved Hide resolved
yotamofek and others added 3 commits March 19, 2024 16:39
@yotamofek yotamofek requested a review from tpdickso March 20, 2024 16:41
src/base/storage.rs Outdated Show resolved Hide resolved
src/base/storage.rs Outdated Show resolved Hide resolved
Comment on lines +225 to +227
let base = self.ptr_mut();
let offset1 = self.get_address_unchecked_linear_mut(i1).offset_from(base);
let offset2 = self.get_address_unchecked_linear_mut(i2).offset_from(base);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Come to think of it -- is this technically violating stacked borrows, as well? ptr_mut and get_address_unchecked_linear_mut both accept &mut self and so should also invalidate the previous references, right? I suspect the "correct" way to do this would be to have immutable versions of the trait methods that can be used to compute this validly.

But barring that, we'd need to make sure the borrows don't overlap. Since we need both the base pointer and the element pointer to be alive at the same time, that's a nonstarter. So perhaps integer arithmetic is the escape hatch;

Suggested change
let base = self.ptr_mut();
let offset1 = self.get_address_unchecked_linear_mut(i1).offset_from(base);
let offset2 = self.get_address_unchecked_linear_mut(i2).offset_from(base);
let base = self.ptr_mut() as isize;
let offset1 = self.get_address_unchecked_linear_mut(i1) as isize - base;
let offset2 = self.get_address_unchecked_linear_mut(i2) as isize - base;

I'm not sure if this is equally-invalid though. It might just be tricking MIRI into not realizing we're doing something we're not meant to be doing in the first place. I'm not familiar enough with MIRI to know, myself. Does the current commit satisfy MIRI's checker?

But maybe I'm missing the point, and doing all this with mutable pointers and mutable borrows is fine as long as they're never dereferenced in an invalid state.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You said it yourself - calling methods with a &mut self receiver invalidates any previous references. Pointers are not references, and are in themselves an escape hatch from the stacked borrows model, since pointers don't constitute a borrow. 😊 My code causes the swap tests to pass under MIRI so I'm pretty confident they're okay.

OTOH, ptr->int casts are almost always a bad idea, since that causes the loss of provenance. (it might not matter here - but I'm not much of an expert, I mostly rely on MIRI to tell me if I'm misbehaving)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original code also operated on pointers, so I suspect MIRI is smart enough to extend its analysis to pointers if it was nevertheless managing to raise an error there. But the original error message says "read access" which leads me to believe that probably the issue was that the pointers were dereferenced after being invalidated, rather than just that they were used, so it's reasonable that MIRI's okay with this code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right you are, my last comment is just completely wrong, was not thinking straight. 😅 Pointers are an escape hatch from the borrow checker, but their usage should still uphold (stacked) borrowing rules.
The reason my PR appeases MIRI is because we get a "fresh" base from self.ptr_mut() just below this comment, and we don't use any of the already-invalidated pointers from before that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, your version with the ptrtoint casts also appeases MIRI, after a minor correction. It should look like this:

let base = self.ptr_mut() as isize;
let offset1 = (self.get_address_unchecked_linear_mut(i1) as isize - base) / (size_of::<T>() as isize);
let offset2 = (self.get_address_unchecked_linear_mut(i2) as isize - base) / (size_of::<T>() as isize);

(notice the div by size_of::<T>() that you were missing)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right; or rather, we do "use" an invalidated pointer here:

self.get_address_unchecked_linear_mut(i1).offset_from(base);

Since this uses base after invalidating it with a call to get_address_unchecked_linear_mut, but it doesn't dereference it, which I suspect is why MIRI is fine with it. It just uses it as basically an integer.

Co-authored-by: tpdickso <terence.dickson.prf@gmail.com>
@tpdickso tpdickso requested a review from Ralith March 21, 2024 03:00
@tpdickso tpdickso merged commit 749a9fe into dimforge:dev Mar 28, 2024
11 checks passed
@yotamofek yotamofek deleted the swap-unchecked-ub branch March 28, 2024 06:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants