Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The plan for provenance #2133

Closed
6 tasks done
RalfJung opened this issue May 20, 2022 · 12 comments · Fixed by #2275
Closed
6 tasks done

The plan for provenance #2133

RalfJung opened this issue May 20, 2022 · 12 comments · Fixed by #2275
Labels
A-intptrcast Area: affects int2ptr and ptr2int casts C-project Category: a larger project is being tracked here, usually with checkmarks for individual steps

Comments

@RalfJung
Copy link
Member

RalfJung commented May 20, 2022

#2059 is nearing completion, so I figured it would make sense to lay down my current plan for the future of provenance in Miri.

My idea is that, by default, we will work in "permissive" provenance mode, implementing as good as we can the angelic from_exposed_addr semantics for int2ptr cast. This is different from now: pointers created in this way get a "wildcard" provenance, and on each access we check if that access happens in the range of some previously exposed provenance. This resolves problems like #1866. There also will be no more "untagged" in Stacked Borrows, but instead Stacked Borrows will have support for the "wildcard" provenance and track which tags have been exposed. That should, hopefully, be much less confusing than the current default. (In particular, if you do not use from_expose_addr/int2ptr casts, it is equivalent to -Zmiri-tag-raw-pointers.) Meanwhile, if you use addr/ptr::invalid, then the restrictions documented in their spec are actually enforced.

When the program actually executes an int2ptr cast or calls from_exposed_addr, a warning is shown:

warning: this program is using integer-to-pointer casts. This is equivalent to calling from_exposed_addr, which means that Miri might miss pointer bugs in this program.
See https://doc.rust-lang.org/nightly/std/ptr/fn.from_exposed_addr.html for more details on that operation.
To ensure that Miri does not miss bugs in your program, use with_addr instead; see https://doc.rust-lang.org/nightly/std/ptr/index.html#strict-provenance for more details. You can then pass the -Zmiri-strict-provenance flag to Miri, to ensure you are not relying on from_exposed_addr semantics.
To silence this warning, pass the -Zmiri-permissive-provenance flag to Miri.

-Zmiri-permissive-provenance literally just disables the warning.

-Zmiri-strict-provenance turns expose_addr into an alias for addr (i.e., it no longer tracks which pointers have been exposed), and it turns ptr::from_exposed_addr into an alias for ptr::invalid (i.e., the pointers produced that way cause UB when being dereferenced, except for ZST accesses). Or maybe it should just make ptr::from_exposed_addr a hard error? That would be a very clear story ("just make the above warning an error"), but it means code that uses from_exposed_addr but then doesn't actually dereference the pointer stops working. But is there any reason to have such code? I say we try this, and see if it causes any trouble.

What I am not entirely sure about is what to do with ptr-to-int transmutes (via mem::transmute or other means of type punning). With -Zmiri-strict-provenance they should trigger UB. They should probably also trigger UB by default because we do have to consider these programs buggy. -Zmiri-permissive-provenance is meant to be a sound flag so it should not enable them either. For now, we have the -Zmiri-allow-ptr-int-transmute flag to allow them, but not everything that you think might work will work. (In particular, doing a bytewise copy at type u8 will never work, that just requires way too deep changes in the interpreter.)

TODO:

@RalfJung RalfJung added the C-project Category: a larger project is being tracked here, usually with checkmarks for individual steps label May 20, 2022
@RalfJung
Copy link
Member Author

What I am not entirely sure about is what to do with ptr-to-int transmutes

I guess the alternative would be to just have a dedicated flag for them. That is probably best, though I am a bit worried that we are getting too many flags and people will miss the most relevant ones in a soup of flags.

@oli-obk
Copy link
Contributor

oli-obk commented May 21, 2022

Maybe we should have a single -Zmiri-recommended that activates all flags we have the opinion of "you should use this" and hide all the fine grained flags in a separate list for "advanced twiddling"

@RalfJung
Copy link
Member Author

RalfJung commented May 21, 2022

Maybe we should have a single -Zmiri-recommended that activates all flags we have the opinion of "you should use this"

I think that whatever opinion we have about what people should use, should just be the default.

But I could imagine splitting the list in the docs into two: common options, and advanced options.

@RalfJung
Copy link
Member Author

So then maybe the story should be quite simply like this: by default, we properly implement as best as we can the expose_addr/ptr::from_exposed_addr semantics (including making these functions different from addr/ptr::invalid). int2ptr and ptr2int are equivalent to ptr::from_exposed_addr/expose_addr. We show a warning on ptr::from_exposed_addr, since Miri might miss bugs in programs that use this feature.

-Zmiri-permissive-provenance silences the warning. -Zmiri-strict-provenance turns the warning into a hard error (and as an optimization, also stops tracking which pointers are exposed, since that now no longer makes a difference). And that's it.

@RalfJung
Copy link
Member Author

RalfJung commented May 23, 2022

With #2059 and rust-lang/rust#97219 having landed, we can now at least error for code like this:

fn main() {
    let x = 42;
    let xptr = &x as *const i32;
    let xptr_invalid = std::ptr::invalid::<i32>(xptr.expose_addr());
    let _val = unsafe { *xptr_invalid }; //~ ERROR is not a valid pointer
}

#2151 implements that. This distinguishes ptr::invalid and ptr::from_exposed_addr.

Distinguishing addr and exposed_addr will take a bit more work. Anyone up for adding an intrinsic so that Miri can treat these differently? :D

@RalfJung
Copy link
Member Author

RalfJung commented May 26, 2022

Alternatively, we might not even need the intrinsic if we make ptr2int transmutes strip provenance. Then addr could be implemented via transmute, in nice symmetry with ptr::invalid.

This will take a bit of work to implement in Miri though. OTOH that work is totally worth it since we can then also use it to reset padding to uninit on a typed copy!

@RalfJung
Copy link
Member Author

Alternatively, we might not even need the intrinsic if we make ptr2int transmutes strip provenance. Then addr could be implemented via transmute, in nice symmetry with ptr::invalid.

I think this is what we should go for.

@RalfJung
Copy link
Member Author

RalfJung commented Jun 3, 2022

Turns out that for addr to not expose provenance, we do not have to entirely resolve #2182. Something like rust-lang/rust#97684 is enough. :)

So, we are pretty close to having permissive provenance work as intended (properly distinguishing both addr/expose_addr and invalid/from_exposed_addr). What's left is Stacked Borrows support. :D

@RalfJung RalfJung changed the title The plan for provenance The plan for permissive provenance Jun 3, 2022
@RalfJung RalfJung added the A-intptrcast Area: affects int2ptr and ptr2int casts label Jun 5, 2022
bors added a commit to rust-lang-ci/rust that referenced this issue Jun 6, 2022
implement ptr.addr() via transmute

As per the discussion in rust-lang/unsafe-code-guidelines#286, the semantics for ptr-to-int transmutes that we are going with for now is to make them strip provenance without exposing it. That's exactly what `ptr.addr()` does! So we can implement `ptr.addr()` via `transmute`. This also means that once rust-lang#97684 lands, Miri can distinguish `ptr.addr()` from `ptr.expose_addr()`, and the following code will correctly be called out as having UB (if permissive provenance mode is enabled, which will become the default once the [implementation is complete](rust-lang/miri#2133)):

```rust
fn main() {
    let x: i32 = 3;
    let x_ptr = &x as *const i32;

    let x_usize: usize = x_ptr.addr();
    // Cast back an address that did *not* get exposed.
    let ptr = std::ptr::from_exposed_addr::<i32>(x_usize);
    assert_eq!(unsafe { *ptr }, 3); //~ ERROR Undefined Behavior: dereferencing pointer failed
}
```

This completes the Miri implementation of the new distinctions introduced by strict provenance. :)

Cc `@Gankra` -- for now I left in your `FIXME(strict_provenance_magic)` saying these should be intrinsics, but I do not necessarily agree that they should be. Or if we have an intrinsic, I think it should behave exactly like the `transmute` does, which makes one wonder why the intrinsic should be needed.
@RalfJung RalfJung changed the title The plan for permissive provenance The plan for provenance Jun 6, 2022
Dylan-DPC added a commit to Dylan-DPC/rust that referenced this issue Jun 8, 2022
Allow ptr_from_addr_cast to fail

This is needed for rust-lang/miri#2133: I would like to have an option in Miri to error when a int2ptr cast is executed.
Dylan-DPC added a commit to Dylan-DPC/rust that referenced this issue Jun 8, 2022
Allow ptr_from_addr_cast to fail

This is needed for rust-lang/miri#2133: I would like to have an option in Miri to error when a int2ptr cast is executed.
@RalfJung
Copy link
Member Author

#2275 completes this plan. :)

@saethlin
Copy link
Member

I think previous SB with Untagged didn't permit noalias on &mut. For that reason, I was hesitant to run Miri a lot without -Zmiri-tag-raw-pointers.

Does the fact that permissive provenance with Wildcard enable raw pointer tagging mean that it permits noalias on &mut?

@RalfJung
Copy link
Member Author

RalfJung commented Jun 27, 2022

I think that once #2275 lands, all Miri modes (that are not explicitly marked as "unsound" in the docs) are sound wrt. noalias on &mut. This inevitably has to make some assumptions about how ptr2int and int2ptr casts behave in LLVM, which is not specified in the LangRef.

We still have the difference that rustc also puts noalias on newtypes around &mut, which Miri does not check. But that is a separate concern. That is rust-lang/unsafe-code-guidelines#125.

@RalfJung
Copy link
Member Author

RalfJung commented Jun 27, 2022

Though also note that if you do use int2ptr casts, then Miri is not sound wrt anything. That is just inherent in our spec for those casts. The assumption is that those casts are rare. They are certainly much more rare than use of raw pointers, so we have a lot more precision (fewer false negatives) than the previous default mode. :) The false negatives are limited to memory accesses that directly interact with previously "exposed" memory.

But Miri will tell you the first time such a cast happens in your program. (That warning is disabled with -Zmiri-permissive-provenance.)

@bors bors closed this as completed in 7fafbde Jun 28, 2022
workingjubilee pushed a commit to tcdi/postgrestd that referenced this issue Sep 15, 2022
implement ptr.addr() via transmute

As per the discussion in rust-lang/unsafe-code-guidelines#286, the semantics for ptr-to-int transmutes that we are going with for now is to make them strip provenance without exposing it. That's exactly what `ptr.addr()` does! So we can implement `ptr.addr()` via `transmute`. This also means that once rust-lang/rust#97684 lands, Miri can distinguish `ptr.addr()` from `ptr.expose_addr()`, and the following code will correctly be called out as having UB (if permissive provenance mode is enabled, which will become the default once the [implementation is complete](rust-lang/miri#2133)):

```rust
fn main() {
    let x: i32 = 3;
    let x_ptr = &x as *const i32;

    let x_usize: usize = x_ptr.addr();
    // Cast back an address that did *not* get exposed.
    let ptr = std::ptr::from_exposed_addr::<i32>(x_usize);
    assert_eq!(unsafe { *ptr }, 3); //~ ERROR Undefined Behavior: dereferencing pointer failed
}
```

This completes the Miri implementation of the new distinctions introduced by strict provenance. :)

Cc `@Gankra` -- for now I left in your `FIXME(strict_provenance_magic)` saying these should be intrinsics, but I do not necessarily agree that they should be. Or if we have an intrinsic, I think it should behave exactly like the `transmute` does, which makes one wonder why the intrinsic should be needed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-intptrcast Area: affects int2ptr and ptr2int casts C-project Category: a larger project is being tracked here, usually with checkmarks for individual steps
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants