Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch BufMut::bytes_mut from MaybeUninit to &mut UninitSlice, a type ownec by bytes for this purpose #433

Merged
merged 12 commits into from Oct 19, 2020

Conversation

carllerche
Copy link
Member

@carllerche carllerche commented Oct 16, 2020

The way BufMut uses MaybeUninit can lead to unsoundness.

This replaces MaybeUnit with a type owned by bytes so we can ensure the usage patterns are sound.

Refs: #328

@carllerche
Copy link
Member Author

cc/ @Kixunil

@carllerche carllerche marked this pull request as ready for review October 17, 2020 05:09
Copy link

@Kixunil Kixunil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

I think it'd be helpful to have at least write_slice(&mut self, bytes: &[u8]) and &mut write_byte(byte: u8, index: usize) methods for the "almost safe" code to use.

Also I suggest pinging Ralf once the PR is redy to merge.. He knows unsafe better than me. :)

("almost safe" - the only unsafety is calling advance with proper values)

/// let mut slice = &mut data[..];
/// let ptr = BufMut::bytes_mut(&mut slice).as_ptr();
/// ```
pub fn as_ptr(&self) -> *const u8 {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is almost useless It can't be read from until the caller wrote to *mut u8 version of (why would the caller use *const if there's *mut already available?) it and it can't be written to.
Address of the pointer can be compared, hashed, debugged...

I think this should be documented somehow.

}

/// Return a raw pointer to the slice's buffer.
///
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest documenting safety:

  • The consumer must not read from the memory behind the pointer
  • The consumer must not write uninit bytes into memory behind the pointer

(also aliasing rules, but that should be obvious)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically, writing uninit bytes into the memory should be fine if the caller does not advance accordingly.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not if UninitSlice was constructed from &mut [u8] - same issue why there was a bug.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah right! Way subtle

@carllerche
Copy link
Member Author

cc @RalfJung this fix our unsoundness? :)

@carllerche
Copy link
Member Author

@Kixunil I applied feedback.

Copy link

@Kixunil Kixunil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd be useful to also test that slices actually get overwritten when write_slice and write_byte re used as them failing to do so is UB. (Of course, those tests must be performed on UninitSlice constructed from &mut [u8].)

pub fn write_byte(&mut self, index: usize, byte: u8) {
assert!(index < self.len());

unsafe { self.as_mut_ptr().offset(index as isize).write(byte) };
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could in theory overflow, even if unlikely. I'd suggest writing self[index..].as_mut_ptr().write(byte).
That way assert above is not strictly needed but I'd keep it for documentation value. (The repeated check should be optimized-out.)

/// # Safety
///
/// The caller **must not** read from the referenced memory and **must not**
/// write uninitialized bytes to the slice either.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Putting emphasis on "uninitialized" might be nice?

let mut data = [b'b', b'a', b'r'];

let slice = unsafe { UninitSlice::from_raw_parts_mut(data.as_mut_ptr(), 3) };
slice.write_slice(b"a");

This comment was marked as resolved.

///
/// Returned by `BufMut::bytes_mut()`, the referenced byte slice may be
/// uninitialized. The wrapper provides safe access without introducing
/// undefined behavior.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to say what the concrete invariant/protocol for this type is, and in particular how it differs from MaybeUninit.

if I understand correctly, the key difference is that the contents may be uninitialized, but the API ensures that once they are initialized, they will never be de-initialized again. Is that correct?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's correct. Especially it allows constructing UninitSlice from already-initialized slice (&mut [u8]).

@RalfJung
Copy link
Contributor

RalfJung commented Oct 18, 2020

cc @RalfJung this fix our unsoundness? :)

I have not reviewed your entire BufMut API surface, it is a bit too big for that.^^ But this seems to plug the hole that is caused by MaybeUninit not protecting against "de-initialization".

src/buf/uninit_slice.rs Outdated Show resolved Hide resolved
///
/// assert_eq!(b"bar", &data[..]);
/// ```
pub fn write_slice(&mut self, src: &[u8]) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this allow src to be shorter than self? Alternatively, should it be named copy_from_slice to mirror the method from std?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question, I suspect allowing writes of shorter slices would be more practical for consumers but if we decide to have same-length-slice then copy_from_slice is a better name.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will rename this to copy_from_slice.

src/buf/uninit_slice.rs Outdated Show resolved Hide resolved
src/buf/uninit_slice.rs Outdated Show resolved Hide resolved
@Darksonn
Copy link
Contributor

I compiled it with rustdoc and looked at the documentation. I didn't catch any soundness issues by looking at it.

///
/// let slice = unsafe { UninitSlice::from_raw_parts_mut(ptr, len) };
/// ```
pub unsafe fn from_raw_parts_mut<'a>(ptr: *mut u8, len: usize) -> &'a mut UninitSlice {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we have a constructor that takes an &mut [u8]?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe (also) impl<'a> From<&'a mut [u8]> for &'a mut UninitSlice

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both sounds good to me. I think we can do that in a follow up PR.

carllerche and others added 3 commits October 19, 2020 15:22
Co-authored-by: Alice Ryhl <alice@ryhl.io>
Co-authored-by: Alice Ryhl <alice@ryhl.io>
Co-authored-by: Alice Ryhl <alice@ryhl.io>
@carllerche carllerche merged commit e0d8413 into master Oct 19, 2020
@carllerche carllerche deleted the uninit-slice branch October 19, 2020 22:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants