Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zeroize: Implement Zeroize for CString (#650) #759

Merged
merged 1 commit into from Apr 28, 2022

Conversation

robinhundt
Copy link
Contributor

Commit Msg:
This implements Zeroize for std::ffi::CString. As CString is not in
allocate but in std, a new optional std feature is added to depend
on the std library.

I'm not quite sure if my implementation is correct and secure. In the issue linked in #650, it was mentioned that it likely is not possible to implement Zeroize for CString, due to the requirement that there must be no null bytes in the buffer. One of the comments stated that this problem could maybe be solved in "hacky" way:

We could do something slightly wacky and overwrite internal bytes with 0xFF or something, so as to obliterate the data while preserving the "no internal NUL bytes" guarantee.

Originally posted by @cyberia-ng in iqlusioninc/crates#509 (comment)

However, I think that my implementation of:

impl Zeroize for CString {
    fn zeroize(&mut self) {
        let this = std::mem::take(self);
        this.into_bytes().zeroize();
    }
}

should be correct and secure. std::mem::take replaces replaces self with a CString::default. The original CString is then converted into a Vec<u8> (CString itself contains a Box<u8>), and zeroize is called on that. This should not result in a reallocation as into_bytes uses into_vec on the internal Box<u8> buffer.

I've also added a testcase similar to the one for String, but it doesn't actually test if the memory was zeroed, but only that the length of the internal buffer of the CString is zero after zeroize which is done by the mem::take. Any ideas on how to properly test this code?

If the code looks good, I'll also adapt the documentation to reflect the new impl and std feature :)

@tarcieri
Copy link
Member

tarcieri commented Apr 4, 2022

This looks good at first glance:

  • mem::take uses replace internally to swap the pointer
  • CString::into_bytes calls ::into_vec which takes ownership of the heap pointer as a Vec<u8>
  • Calling .zeroize() on the resulting vector should clear out the bytes

As for writing a test, it seems like you should be able to use CString::as_ptr to obtain the original pointer prior to calling .zeroize(). You can then read bytes from the pointer and ensure they're all zeroes (e.g. looping over the length of the original buffer, derefing a byte-at-a-time and checking that the derefed byte is zero).

That's technically a use-after-free, since the Vec<u8> will be dropped inside of the Zeroize impl. However, if the test succeeds even once it should validate the PoC. If the test turns out to be flaky due to the UAF, we can always remove it.

@robinhundt
Copy link
Contributor Author

As for writing a test, it seems like you should be able to use CString::as_ptr to obtain the original pointer prior to calling .zeroize(). You can then read bytes from the pointer and ensure they're all zeroes (e.g. looping over the length of the original buffer, derefing a byte-at-a-time and checking that the derefed byte is zero).

Unfortunately this didn't work. In my test runs, the memory always contained random values after the zeroize.
I stepped through the test with a debugger and confirmed that the memory is actually zeroed by the zeroize call. It seems that something else is allocated in that memory after it is zeroed (at least on my system). I've adapted the test such that it tests if the memory is different than before. This way we at least can be sure that the original data is not there anymore. At least as long as the compiler doesn't optimize the test to something unexpected due to it being UB 😅

I also adapted the documentation.

Comment on lines 66 to 69
//! With the `std` feature enabled (which it is **not** by default), [`Zeroize`]
//! is also implemented for [`CString`]. Due to its requirement of containing no
//! null bytes, calling `zeroize()` on a `CString` will drop it after zeroing the
//! memory.
Copy link
Member

@tarcieri tarcieri Apr 5, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there's something a bit tricky and nuanced about the current implementation which is good to capture, but this doesn't quite describe it.

Namely it doesn't actually Drop the CString, but the pointer changes after calling zeroize() and the original boxed slice is dropped. This may be surprising to CString users.

I'm wondering if it might actually make sense to change the implementation to convert the taken Vec<u8> back ::into_boxed_slice after zeroization so that the original pointer remains valid for the lifetime of the CString.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea, I wasn't to sure about that part as well. I started with a more detailed explanation, but then wasn't sure if these implementation details should be part of the documentation.

I like the idea of putting the Vec back into the CString. I'll adapt the code to do this.

Comment on lines 901 to 903
// DEFINITELY UB! We reconstruct the original buffer which should be deallocated after
// the .zeroize()
let orig_buf = std::slice::from_raw_parts(orig_ptr, orig_len);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of trying to construct a slice out of the UAF'd memory, you can do pointer reads/derefs instead. That at least avoids instantiating something which violates the Rust memory model.

I'd really like to see a check for all zeroes succeed at least once. This test tells you little: the memory may be different simply because it's UAF and has been overwritten.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: saw your other comment. Might be the other tests running in parallel in other threads causing heap allocations. Unfortunately this code is quite difficult to test.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used read_volatile instead of a deref as otherwise only the first byte was checked when testing in debug mode.

When compiling with --release apparently the whole loop is optimized away, even with read_volatile (I must admit, I'm not a 100% clear on what exactly read_volatile ensures).

@robinhundt robinhundt force-pushed the master branch 2 times, most recently from 73397a1 to ba96cfc Compare April 5, 2022 17:08
@robinhundt
Copy link
Contributor Author

Okay, I think my latest version is working and maintains validity of the pointer to the internal buffer.

I've also added the std feature to the CI test configuration.

Btw. is it better to amend changes and force push my branch so it remains one commit which can be easily merged, or is it better to add many small WIP commits which can then be squashed before merging?
(Writing it out, it is probably option 2 😅)

@robinhundt
Copy link
Contributor Author

robinhundt commented Apr 5, 2022

Ahh, that's unfortunate. I was using from_vec_with_nul_unchecked which has only been stabilized in 1.58.
I'm not sure if using from_vec_unchecked is an option, since it calls v.reserve_exact(1) on the Vec which might result in a reallocation.
Using, CString::from_raw is also not an option, since the CString must have been created from into_raw and its length must not have been modified.

Edit: Oh, I misread the documentation of reserve_exact.

After calling reserve_exact, capacity will be greater than or equal to self.len() + additional. Does nothing if the capacity is already sufficient.

Since the capacity is at least 1 and the length is 0, the capacity is sufficient and no reallocation should happen.

@tarcieri
Copy link
Member

tarcieri commented Apr 5, 2022

Ahh, that's unfortunate. I was using from_vec_with_nul_unchecked which has only been stabilized in 1.58.

You shouldn't need to use any of that. After zeroization, just Vec::truncate it to length 0 actually zeroize should automatically truncate it to length 0, so just zeroize it.

After that, use CString::new. That will push a null byte, but that shouldn't cause reallocation due to its previous capacity, which is at least 1 to hold the previous null byte.

This implements Zeroize for std::ffi::CString. As CString is not in
allocate but in std, a new optional std feature is added to depend
on the std library.
@robinhundt
Copy link
Contributor Author

@tarcieri I think this might be ready to merge now.
The zeroize impl now simply uses CString::new. By changing to new().expect() I also uncovered a subtle bug in the implementation. When only the std feature was enabled but not the alloc one, the buf.zeroize() call resolved to the implementation for a slice, which doesn't clear the Vec. In this case, the buffer contained internal null bytes which is not allowed. I've resolved the issue by adding the alloc feature as a dependency of the std feature, this makes sense anyway.

@tarcieri
Copy link
Member

Nice, looks good now.

Thank you!

@tarcieri tarcieri merged commit a729d1f into RustCrypto:master Apr 28, 2022
@tarcieri tarcieri mentioned this pull request May 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants