-
Notifications
You must be signed in to change notification settings - Fork 77
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implementing ByteSlice and ByteSliceMut for Vec<u8> #1045
base: main
Are you sure you want to change the base?
Conversation
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
9b595b1
to
b588db3
Compare
src/lib.rs
Outdated
unsafe impl ByteSlice for Vec<u8> {} | ||
|
||
#[cfg(any(feature = "alloc", test))] | ||
// SAFETY: This uses safe a method from stdlib. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The safety comment needs to prove that the safety conditions of SplitByteSlice
are totally satisfied. Unfortunately, I believe safety conditions are unsatisfiable in this case. It asks:
In particular, given
B: SplitByteSlice
andb: B
, ifb.deref()
returns a byte slice with addressaddr
and lengthlen
, then ifsplit <= len
,b.split_at(split)
will return(first, second)
such that:
first
's address isaddr
and its length issplit
second
's address isaddr + split
and its length islen - split
Your implementation does not satisfy the second condition, because the two Vec
s will almost certainly not be adjacent in memory.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To add a bit of extra detail here: Your implementations of ByteSlice
and ByteSliceMut
are fine so long as you remove the implementation of SplitByteSlice
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think if we give up the restriction of only allowing pair of Self
to be returned from SplitByteSlice::split_at
, we can use a type similar to VecSlice
, from my example, and can split a Vec
for free. And then we can also implement SplitByteSlice
for VecSlice
and then we can split a Vec
however many times we want at zero cost. Is there a reason we would not want to remove this restriction?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC this is still unsound - how do you prevent the CanDrop
variant from being dropped before the DontDrop
variant? If that happens, then the DontDrop
variant points to freed memory. I could imagine solving this with lifetimes, but you might need to make SplitByteSlice
even more complex to be able to carry those lifetimes (or maybe not? I'm genuinely not sure).
In the abstract I think this is an interesting idea, but it will take some work to rejigger the API to support it without adding a lot of complexity that other users (who aren't using Vec
) will have to know about.
Is there a reason that your use case doesn't allow either just operating on slices (e.g. borrowing the Vec
) or making a clone of the Vec
into a RefCell<[u8]>
(which we do support)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here's a new thing I tried that is probably better than the previous implementation. The VecSlice
type now looks like the following
pub struct VecSlice {
slice: VirtualVec,
ghost: Rc<GhostVec>,
}
struct VirtualVec {
ptr: *mut u8,
len: usize,
cap: usize,
}
struct GhostVec(Vec<u8>);
With every split, new VirtualVec
s are created but the GhostVec
is shared by all the splits and will only drop when the last VecSlice
is dropped.
Although, now when I think about it, this is just a more sophisticated way of achieving what can easily be achieved by calling split_at
on Vec
not as Vec
but as slice
. This doesn't seem to provide much value I guess.
Regarding your question, I was just curious about if it was possible to split Vec
at zero cost.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@joshlf removed all the unnecessary code. It now only contains implementations of ByteSlice
and ByteSliceMut
645e425
to
a5fc43b
Compare
1fffa6a
to
a4d7277
Compare
@@ -5774,6 +5774,11 @@ unsafe impl<'a> SplitByteSlice for RefMut<'a, [u8]> { | |||
} | |||
} | |||
|
|||
// TODO(#429): Add a "SAFETY" comment and remove this `allow`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add safety comments here and below? Our policy is not to introduce any new undocumented unsafe
code while we burn down the existing TODOs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On further reflection, I don't believe the safety conditions of ByteSlice
are satisfiable:
Safety
Implementations of
ByteSlice
must promise that their implementations ofDeref
andDerefMut
are "stable". In particular, givenB: ByteSlice
andb: B
,b
must always dereference to a byte slice with the same address and length. This is true for bothb.deref()
andb.deref_mut()
. IfB: Copy
orB: Clone
, then the same is also true of copies or clones ofb
. For example,b.deref_mut()
must return a byte slice with the same address and length asb.clone().deref()
.
For Vec
, a growable buffer, this cannot be the case: the length changes as elements are added and removed, and the address changes as elements are added.
@@ -5774,6 +5774,11 @@ unsafe impl<'a> SplitByteSlice for RefMut<'a, [u8]> { | |||
} | |||
} | |||
|
|||
// TODO(#429): Add a "SAFETY" comment and remove this `allow`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On further reflection, I don't believe the safety conditions of ByteSlice
are satisfiable:
Safety
Implementations of
ByteSlice
must promise that their implementations ofDeref
andDerefMut
are "stable". In particular, givenB: ByteSlice
andb: B
,b
must always dereference to a byte slice with the same address and length. This is true for bothb.deref()
andb.deref_mut()
. IfB: Copy
orB: Clone
, then the same is also true of copies or clones ofb
. For example,b.deref_mut()
must return a byte slice with the same address and length asb.clone().deref()
.
For Vec
, a growable buffer, this cannot be the case: the length changes as elements are added and removed, and the address changes as elements are added.
#[allow(clippy::undocumented_unsafe_blocks)] | ||
#[cfg(any(feature = "alloc", test))] | ||
unsafe impl ByteSlice for Vec<u8> {} | ||
|
||
// TODO(#429): Add a "SAFETY" comment and remove this `allow`. | ||
#[allow(clippy::undocumented_unsafe_blocks)] | ||
unsafe impl<'a> ByteSliceMut for &'a mut [u8] {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@joshlf, in light of the above comment, I'm not certain that this satisfies our documented safety condition either. It's trivial to define a function that changes both the length and address of a &mut [u8]
between calls to deref
:
fn change_add_and_length(mut slice: &mut [u8]) {
slice = &mut slice[1..];
}
I think this indicates that our safety documentation on ByteSlice
is missing some critical nuance.
Adresses #992