Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Explain why one use of unsafe is sound #885

Closed
wants to merge 1 commit into from

Conversation

joshlf
Copy link

@joshlf joshlf commented Mar 23, 2019

No description provided.

@philipc
Copy link
Contributor

philipc commented Mar 23, 2019

I don't think the unsafety here is about the overflow. It's about transmuting a Vec<Rgb<u8>> into Vec<u8>. I'm not an expert, but I don't think this is actually sound. See nabijaczleweli/safe-transmute-rs#16 (comment).

@HeroicKatora
Copy link
Member

HeroicKatora commented Mar 23, 2019

I wonder if it is technically possible to tag Rgb with some other repr to make this safe. But I suppose even that is iffy, the best way would be to avoid allocating the inital wrong Vec entirely but that is likely a concern for many other decoders as well. Edit: Also, completely agree current use is not as sound as it should be...

@joshlf
Copy link
Author

joshlf commented Mar 23, 2019

Rgb is defined as #[repr(C)] in color.rs. See the define_colors macro. It's still pretty sketchy, but I'm not convinced it's unsound.

@joshlf
Copy link
Author

joshlf commented Mar 23, 2019

I don't think the unsafety here is about the overflow. It's about transmuting a Vec<Rgb> into Vec.

Critically, this doesn't actually use transmute; it uses from_raw_parts.

@HeroicKatora
Copy link
Member

HeroicKatora commented Mar 23, 2019

Critically, this doesn't actually use transmute; it uses from_raw_parts.

That requires the pointer to have been allocated by a proper corresponding call to Vec::_ at some point. Critical parts are the points mentioned in the documentation, but I'm not even quite sure if it safe to rely on those preconditions never changing as part of semver compatibility of Rust.

  1. is fulfilled
  2. is already violated, even with repr(C): size_of::<Rgb<u8>> == 3 but size_of::<u8> == 1
  3. and 4. both are not applicable due to the size difference.

@HeroicKatora
Copy link
Member

HeroicKatora commented Mar 23, 2019

To actually make this completely memory unsafe, just imagine an allocator pooling by size of the objects. Then suddenly the deallocation refers to the pool of a different size than the allocation originally and UB is definitely not out-of the picture and definitely caused by that from_raw_parts. (Fortunately jemalloc pools by total allocation size and alignment only, I think).

@joshlf
Copy link
Author

joshlf commented Mar 23, 2019

To actually make this completely memory unsafe, just imagine an allocator pooling by size of the objects. Then suddenly the deallocation refers to the pool of a different size than the allocation originally and UB is definitely not out-of the picture and definitely caused by that from_raw_parts.

The length of the vector (in bytes) hasn't actually changed, though. The deallocation would be for the same number of bytes as the original allocation.

@HeroicKatora
Copy link
Member

HeroicKatora commented Mar 23, 2019

@joshlf Same number of bytes but different size. If the precondition of an unsafe block says: T must have the same size as the inital object used for allocation, I find it alarming to argue that it is fine to violate this, only because we 'know' the underlying allocator will not care in the end. In other words, this may not trigger UB in any current execution, but it is definitely not safe.

@joshlf
Copy link
Author

joshlf commented Mar 23, 2019

Same number of bytes but different size. If the precondition of an unsafe block says: T must have the same size as the inital object used for allocation, I find it alarming to argue that it is fine to violate this, only because we 'know' the underlying allocator will not care in the end.

Oh sorry, I didn't realize that you were referring to the Vec::from_raw_parts documentation. Now I see what you're saying, and I 100% agree, this is totally unsound.

@fintelia
Copy link
Contributor

Yeah, it would be nice to not have any technically unsound code in the crate, even if it never currently triggers issues. Personally, I'd like to push towards completely eliminating unsafe code in the library so we don't have to debate soundness in the first place, although I understand that any significant performance regressions could cause issues for some users.

Also, this isn't the only place where this sort of unsoundness happens in the library. We also depend on the safe_transmute crate which does the same thing

@HeroicKatora
Copy link
Member

HeroicKatora commented Mar 23, 2019

I'd like to think this is fine but then that crate seems to not check the size of T either or am I missing the check in its call chain (and the documentation also alludes to the opposite, e.g. here)? Seems that nabijaczleweli/safe-transmute-rs#16 this issue tracks all that but I really don't see how the crate helps us here if such fundamental parts of the definitely specified documentation are silently dropped.

@HeroicKatora
Copy link
Member

HeroicKatora commented Mar 23, 2019

At this point I consider removing the dependency from image because it is surely one of its biggest contributors to their popularity ranking on crates.io and I just don't feel like endorsing it in the current state? I'm sorry if this feels like a rather harsh reaction but UB like that is a red rag for me especially for a crate stating it tries to make this safe and that feels like a huge broken promise.

@HeroicKatora
Copy link
Member

We use it for a single line that exhibits exactly the same kind of unsafety discussed here....

@fintelia
Copy link
Contributor

I didn't know about these soundness issues when I added that dependency. I'd be in favor of removing it, and then looking for how to remove the unsoundness altogether.

@HeroicKatora
Copy link
Member

HeroicKatora commented Mar 23, 2019

@fintelia I realize, I totally understand your motivation and didn't want to assign any blame here ❤️ (and I'm thankful for all other parts of that related PR :) as well). Afterall, the documentation of that crate actively encourages this use, even after discovering its soundness hole..

@joshlf
Copy link
Author

joshlf commented Mar 23, 2019

I suspect that some of these instances could be replaced by uses of the zerocopy crate (disclaimer: I'm the author). I could mentor somebody in that or throw up a PR if you'd like.

@HeroicKatora
Copy link
Member

The first preliminary fix is here: HeroicKatora@e6e37c0

If that seems good, I'll apply the same strategy to the other places where I find an occurance of this. Feel free to note these occurances in the comments.

HeroicKatora added a commit to HeroicKatora/image that referenced this pull request Mar 23, 2019
Replaces image-rs#885, or rather was discovered during it as it was originally
trying to reason abouts its safety.
@fintelia
Copy link
Contributor

@HeroicKatora that fix looks reasonable. For the HDRDecoder case, I think we'd be better off just changing the signature of read_image_transform so that it returned a Vec<u8> (by having f return a slice/tuple and then "flat mapping" over it).

@philipc
Copy link
Contributor

philipc commented Mar 23, 2019

@HeroicKatora What's the codegen on that like? Does it allocate the entire vec up front? I think we could do better. There's nothing wrong with transforming the slice in this case (eg using as_bytes from zerocopy), so it could be as simple as v.as_bytes().to_vec().

Avoiding the copy would be nice, and might be doable by allocating once we know the size but before decoding pixels, but that's a much larger change.

@joshlf Nice crate, but it currently requires nightly: #![feature(refcell_map_split)]. Also the repository link isn't helpful in finding the source code for it.

@HeroicKatora
Copy link
Member

HeroicKatora commented Mar 23, 2019

@philipc It seemingly does not and I am highly disappointed in llvm 😄 However, for this loop variant I don't even see an alloc appearing, only realloc and dealloc?(Edit: Somehow pasted the wrong link, and now I can't find it anymore. Any reproduction produces an alloc, maybe I was mistaken). Am I interpreting this correctly? Anyways, the loop is much nicer, so let's not go functional.

@HeroicKatora
Copy link
Member

Or rather, I'll simply extract this to its own method so that we can adjust and reuse as we need. And quickly replace should a transmuted route ever become safe.

@joshlf
Copy link
Author

joshlf commented Mar 23, 2019

@joshlf Nice crate, but it currently requires nightly: #![feature(refcell_map_split)].

That feature was stabilized recently, so it should support stable as of the next stable release.

Also the repository link isn't helpful in finding the source code for it.

Sorry, I recently published a new version. docs.rs has caught up now.

@HeroicKatora
Copy link
Member

@joshlf I was already confused but finally docs.rs contains impl<T: AsBytes> AsBytes for [T] and AsBytes::as_bytes. Hm, could be helpful indeed, although currently we operate on owned Vec<_> instead of references to-be-cloned. #[derive(AsBytes)] on our color structs could be nice nevertheless, mind taking this to a separate PR?

HeroicKatora added a commit that referenced this pull request Mar 23, 2019
@HeroicKatora
Copy link
Member

Closing this as outdated

HeroicKatora added a commit that referenced this pull request Apr 13, 2019
Replaces #885, or rather was discovered during it as it was originally
trying to reason abouts its safety.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants