Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conditional Security Risk: to_vec() may retain sensitive data in memory on shared systems #1093

Open
nstilt1 opened this issue Dec 27, 2023 · 0 comments

Comments

@nstilt1
Copy link

nstilt1 commented Dec 27, 2023

The issue

This issue stems from to_vec() and any other function that pre-allocates a vector or string with a size that does not always have enough capacity for the data it will hold.

/// Serialize the given data structure as a JSON byte vector.
///
/// # Errors
///
/// Serialization can fail if `T`'s implementation of `Serialize` decides to
/// fail, or if `T` contains a map with non-string keys.
#[inline]
pub fn to_vec<T>(value: &T) -> Result<Vec<u8>>
where
    T: ?Sized + Serialize,
{
    let mut writer = Vec::with_capacity(128);
    tri!(to_writer(&mut writer, value));
    Ok(writer)
}

https://github.com/serde-rs/json/blob/05196caf16456cfe6e738e946f459691c5fbf0c6/src/ser.rs#L2168C1-L2182C2

If the JSON string/vec is larger than 128 bytes, it will reallocate to a larger capacity. When it reallocates, it can leave some data behind in memory. If a user is intending to encrypt the JSON string and then zeroize the original data, it could be ineffective due to there being an unknown amount of partial-copies of the JSON string (plaintext) in memory.

Examples of Use Cases that could result in a vulnerability

If a developer was planning on using serde_json to serialize some PHI or other sensitive data, this could result in a CWE-226 type vulnerability. Given the amount of downloads this crate has, it would not surprise me if there were people using (or mis-using) the library in this manner.

There is also another use case that could cause a vulnerability. Suppose a user wants to serialize a CSPRNG that they are using for cryptography. There is at least one CSPRNG that supports serde_json, which is rand_chacha, and their test case places the seed at the beginning of the string, and the whole string from the test case is around 140 bytes, which could result in the seed being exposed in memory even if rand_chacha were to support zeroize. Keep in mind, rand declares that their library shouldn't be used in scenarios where secrecy is a top priority, but it is probably common for users to misuse it.
https://github.com/rust-random/rand/blob/1db3aa416ce4bc0e4792a0e6aa98f009ef483604/rand_chacha/src/chacha.rs#L390

Alternative Solutions

This issue doesn't need to be fixed if it cuts into the performance too much, or if it would require a tedious fix, but at the very least, a warning could serve 2 beneficial purposes:

  1. inform developers to refrain from using to_vec() in conjunction with zeroize
  2. provide a little hint to newer developers about zeroize, subtly encouraging them to determine whether they should use it or not. I wish someone would have gotten me to look into zeroize sooner.

I have found an alternative Serialization/Deserialization library that appears to allocate more precisely, which would be more suitable for cryptographic use cases. The library is called prost which deals with Protocol Buffers, and their README mentions that they have looked into implementing a Serializer compatible with serde, but haven't had much luck so far.
https://github.com/tokio-rs/prost?tab=readme-ov-file#faq

@nstilt1 nstilt1 changed the title Security Vulnerability: to_vec() and similar functions can leave some partial-copies in memory Conditional Security Risk: to_vec() may retain sensitive data in memory on shared systems Jan 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

1 participant