Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor with Cursor-based API, and passing Miri tests #62

Merged
merged 13 commits into from Jun 27, 2022

Conversation

jamesmunns
Copy link
Collaborator

@jamesmunns jamesmunns commented Jun 24, 2022

This is WIP PR to address miri issues. it builds on @phil-opp's existing miri branch.

Fixes #61
Fixes #60

@jamesmunns
Copy link
Collaborator Author

This first set of changes fixes a couple issues.

The primary issues addressed so far are:

  • The tests should not be blindly casting heap.bottom and assuming that the first Hole will be there. When the data is not aligned, it won't be.
  • Some of the tests assume that the heap that will be created will always be created with an alignment that matches the Hole type. This is probably always practically true, but not in Miri.

To address the former, I added a pub(crate) accessor function for the hole list, rather than casting the pointer. This fixed some Miri warnings about casting with alignment.

To verify the former, I made a little helper type that forces the alignment to some large amount, which made some of the failures go away. This fixed a number of "size isn't as expected", which I think were mostly just not handling the "shrunk heap due to alignment" case (the CODE seems to, but the TESTS seemed not to).

#[repr(align(128))]
struct Chonk<const N: usize> {
    data: [MaybeUninit<u8>; N]
}

This probably isn't the right fix, but it helped verify things. The right fix is to probably either exposing how much of the heap space was lost to alignment somehow.

@jamesmunns
Copy link
Collaborator Author

The second commit also applies the force-alignment trick to the max_stack tests, which now allows all tests to pass in a "standard" miri run:

MIRIFLAGS="-Zmiri-ignore-leaks" cargo +nightly miri test

However, taking it one (or a few?) steps further, the following configuration now fails:

MIRIFLAGS="-Zmiri-tag-raw-pointers -Zmiri-ignore-leaks" cargo +nightly miri test

Digging in further!

@jamesmunns
Copy link
Collaborator Author

So, in this latest commit, I'm now sort of stuck on this issue:

test test::align_from_small_to_big ... error: Undefined Behavior: attempting a write access using <238798> at alloc100033[0x200], but that tag does not exist in the borrow stack for this location
    --> /home/james/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:1342:9
     |
1342 | ...   copy_nonoverlapping(&src as *const T, dst, 1);
     |       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     |       |
     |       attempting a write access using <238798> at alloc100033[0x200], but that tag does not exist in the borrow stack for this location
     |       this error occurs as part of an access at alloc100033[0x200..0x210]
     |
     = help: this indicates a potential bug in the program: it performed an invalid operation, but the rules it violated are still experimental
     = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
help: <238798> was created by a retag at offsets [0x0..0x10]
    --> src/hole.rs:152:19
     |
152  |             addr: self as *mut Hole,
     |                   ^^^^
help: <238798> was later invalidated at offsets [0x8..0x10]
    --> src/hole.rs:276:32
     |
276  | ... = previous.next.as_mut().unwrap().next.take();
     |       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     = note: inside `core::ptr::write::<hole::Hole>` at /home/james/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ptr/mod.rs:1342:9
     = note: inside `core::ptr::mut_ptr::<impl *mut hole::Hole>::write` at /home/james/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ptr/mut_ptr.rs:1406:18
note: inside `hole::allocate_first_fit` at src/hole.rs:293:25
    --> src/hole.rs:293:25
     |
293  | / ...   ptr.write(Hole {
294  | | ...       size: padding.size,
295  | | ...       next: previous.next.take(),
296  | | ...   })
     | |________^

I had to rework a couple of pointer types to get here, but I think there is some kind of issue between split_hole and allocate_first_fit, creating multiple mutable aliases to the same data in some way.

When doing some print debugging, I see the following memory ranges:

fn allocate_first_fit(mut previous: &mut Hole, layout: Layout) -> Result<HoleInfo, ()> {
    loop {
        let allocation: Option<Allocation> = previous
            .next
            .as_mut()
            .and_then(|current| split_hole(current.info(), layout.clone()));
        match allocation {
            Some(allocation) => {
                // link the front/back padding
                // Note that there must be no hole between following pair:
                // previous - front_padding
                // front_padding - back_padding
                // back_padding - previous.next
                let prev = previous as *mut Hole as usize;
                println!("prev: {:016X}..{:016X}", prev, prev + previous.size);
                if let Some(padding) = allocation.back_padding {
                    let back = padding.addr as usize;
                    println!("back: {:016X}..{:016X}", back, back + padding.size);
                }
                if let Some(next) = previous.next.as_mut() {
                    let nxt = (*next) as *mut Hole as usize;
                    println!("next: {:016X}..{:016X}", nxt, nxt + next.size);
                }
                panic!();
---- test::align_from_small_to_big stdout ----
prev: 000000000023B7B0..000000000023B7B0
back: 000000000023CC80..000000000023D048
next: 000000000023CA80..000000000023CE68
thread 'main' panicked at 'explicit panic', src/hole.rs:274:17

I'm not certain, but it looks weird to me that back and next overlap like this. I'll keep digging for a bit longer, but I'd be open to any feedback from you!

@jamesmunns
Copy link
Collaborator Author

Specifically, it looks like mutating previous.next.next via this line of code:

let new_next = previous.next.as_mut().unwrap().next.take();

Is what triggers the miri failure (I think it's the call to take(), which writes to previous.next.next).

@phil-opp
Copy link
Member

Thanks a lot for investigating!

If I remember correctly, the split_hole function does not update the previous.next pointer. So the allocation and the front and back padding is expected to overlap with the next range. We then update previous.next to skip the hole corresponding to allocation:

let new_next = previous.next.as_mut().unwrap().next.take();
previous.next = new_next;

I guess miri is not happy because there is aliasing of *mut across allocation and previous.next. The cleaner solution would probably be to make split_hole take the Hole out of the linked list before returning the allocation.

We probably need to rework the split_hole function anyway because the range end of the back padding should always equal the range end of next, which does not seem the case in your tests.

@phil-opp
Copy link
Member

We probably need to rework the split_hole function anyway because the range end of the back padding should always equal the range end of next, which does not seem the case in your tests.

Ok, this seems to be caused by your recent change to HoleList: 7a03f75#diff-bc1158565eb1de7d4bb32e95497e98fb9f57599f10a10140360598d73ca769f7L159-R161 . By changing addr from *mut u8 to *mut Hole, the wrapping_offset call in https://github.com/phil-opp/linked-list-allocator/blob/7a03f759bc9e8b6a2f8fca01f3db6f77956ad096/src/hole.rs#L228-L230 became incorrect (since offset is in units of size_of::<T>()).

@phil-opp
Copy link
Member

phil-opp commented Jun 25, 2022

I started to rework split_hole in https://github.com/phil-opp/linked-list-allocator/tree/rework-split-hole, based on your commit 23ae87d. It still leads to the same MIRI error as you mentioned above, but at least the pointer access patterns are clearer now (e.g. HoleInfo::addr is never dereferenced anymore).

@jamesmunns
Copy link
Collaborator Author

Nice! I thought I caught all those, the goal of that was to fix the "tried to access something with align 8 via an align 1 pointer", which might have been fixed/masked by the Chonk alignment fixer.

It's probably worth adding forced-misalignment tests in the future, e.g. starting with Chonk, and then skipping 1..align_of(Hole) to make sure everything handles misaligned data correctly.

@jamesmunns
Copy link
Collaborator Author

Hey @phil-opp, I'll likely have a little time later this afternoon to work on this some more, let me know how you want to collab on this.

Mostly let me know if you are "tagging out" of a branch, and I can pick up where you left off.

@phil-opp
Copy link
Member

I don't have time to continue working on this today, so feel free to continue on my branches. I also just send you an invite to the repo to give you write access, then we don't need to synchronize across forks. Thanks again for looking into this!

@jamesmunns
Copy link
Collaborator Author

So uh, @phil-opp, I sort of rewrote the hole module to use a linked list cursor API, which made more sense to me, and avoided mixing and matching references and pointers.

As of this commit, all tests are now passing with tagged pointers and strict provenance rules:

MIRIFLAGS="-Zmiri-disable-isolation -Zmiri-strict-provenance -Zmiri-tag-raw-pointers -Zmiri-ignore-leaks" cargo +nightly miri test
    Finished test [unoptimized + debuginfo] target(s) in 0.00s
     Running unittests src/lib.rs (target/miri/x86_64-unknown-linux-gnu/debug/deps/linked_list_allocator-676c14ab1258310d)

running 17 tests
test hole::test::aff ... ok
test hole::test::cursor ... ok
test test::align_from_small_to_big ... ok
test test::allocate_and_free_double_usize ... ok
test test::allocate_double_usize ... ok
test test::allocate_multiple_sizes ... ok
test test::allocate_usize ... ok
test test::allocate_usize_in_bigger_block ... ok
test test::deallocate_middle ... ok
test test::deallocate_right_before ... ok
test test::deallocate_right_behind ... ok
test test::empty ... ok
test test::extend_empty_heap ... ok
test test::extend_fragmented_heap ... ok
test test::extend_full_heap ... ok
test test::oom ... ok
test test::reallocate_double_usize ... ok

test result: ok. 17 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

   Doc-tests linked_list_allocator

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

I probably should take a crack at the dealloc API, since I think it could be made simpler with the new Cursor API. But first, I wanted to check in with you to see if this is a direction you are comfortable going in. I tried to comment everything pretty extensively, if you have any feedback or questions, I'd be happy to answer them!

@jamesmunns
Copy link
Collaborator Author

I know significant rewrites are bad form - so if you'd like me to walk this back or find a way to make this change more minimal, just let me know! Otherwise I'll try and move the deallocate API to use the cursor API, and then clean up the hacks and try and get this ready to merge.

If this ends up not being immediately (or ever) mergeable, no worries!

@jamesmunns jamesmunns changed the title WIP: Investigate miri issues Refactor with Cursor-based API, and passing Miri tests Jun 26, 2022
@jamesmunns jamesmunns changed the base branch from miri to master June 26, 2022 02:34
@jamesmunns jamesmunns marked this pull request as ready for review June 26, 2022 02:35
@jamesmunns
Copy link
Collaborator Author

Hey @phil-opp, I'm marking this as ready for review, as I've completed the work to switch deallocate over to the Cursor based API.

I know this is a pretty substantial change, let me know if you have any questions, or want to chat about this sync somewhere.

Copy link
Member

@phil-opp phil-opp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great, thanks so much @jamesmunns! The code was really easy to follow, also thanks to your great comments. The new assertions on dealloc are a very good idea, too!

As of this commit, all tests are now passing with tagged pointers and strict provenance rules

Wow, even with strict provenance, I did not expect that. 🎉

I know significant rewrites are bad form - so if you'd like me to walk this back or find a way to make this change more minimal, just let me know!

No worries, it's fine :). Most of the code in this crate was written in 2016, long before the stacked borrows and the strict provenance models, so it would probably be difficult to retrofit the old code to the new models.

src/hole.rs Outdated Show resolved Hide resolved
@jamesmunns
Copy link
Collaborator Author

@phil-opp this should be ready to merge, added Miri checks to CI!

@jamesmunns
Copy link
Collaborator Author

One quick note - while investigating whether this fixed #60, I realized that containing a *mut u8 makes the Heap item !Send, where it formerly was. I'll add an unsafe impl, but let me know if you'd like me to back that out.

Copy link
Member

@phil-opp phil-opp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

@phil-opp phil-opp merged commit ca6309d into rust-osdev:master Jun 27, 2022
@phil-opp
Copy link
Member

Published as v0.10.0

@HeroicKatora
Copy link
Contributor

HeroicKatora commented Jun 27, 2022

Thanks, thank you so much 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Issues running tests in miri Allocating for sizes way beyond the heap size seems buggy
3 participants