Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zeroize: potetial redesign ideas #1046

Open
newpavlov opened this issue Jan 22, 2024 · 30 comments
Open

zeroize: potetial redesign ideas #1046

newpavlov opened this issue Jan 22, 2024 · 30 comments

Comments

@newpavlov
Copy link
Member

newpavlov commented Jan 22, 2024

Right now I see the following issues with public API of zeroize:

  • Derived and manually implemented Zeroize can be inefficient since they rely on zeroization of fields one by one or on zeroizing Drop impls.
  • Procedural macros are relatively heavy compile time-wise, so usually we do not derive Zeroize ad rely on manual implementations.
  • ZeroizeOnDrop is a somewhat useless trait, I haven't seen it used in practice.
  • Zeroizing usefulness is limited. It can be used only for "primitive" types (e.g. raw keys), complex types do not implement Zeroize and instead implement zeroizing Drop.
  • Zeroization is enabled for a whole crate using features. It's not possible to use selective zeroization and zeroization is not reflected in any way in user code.

I think most structures should be zeroized using the "flat" zeroization (see #1045) and that it can be useful to have explicit indication in user code of structs being zeroized on drop.

So I would like to suggest roughly this API:

/// Zeroizes memory pointed by `data_ptr`, but not memory
/// potentially reference by `T`.
pub unsafe fn zeroize_flat<T>(data_ptr: *mut T) {
    // ...
}

/// Zeroize-on-drop wrapper.
///
/// Note that this wrapper zeroizes only data owned by `T`
/// and does nothing with data referenced by it.
#[repr(transparent)]
pub struct ZeroizeOnDrop<T, const IS_ENABLED: bool = true>(T);

/// Abbreviated alias for `ZeroizeOnDrop`. 
pub type Zod<T, const IS_ENABLED: bool = true> = ZeroizeOnDrop<T, IS_ENABLED>;

impl<T, const IS_ENABLED: bool> Drop for ZeroizeOnDrop<T, IS_ENABLED> {
    fn drop (&mut self) {
        unsafe {
            std::ptr::drop_in_place(&mut self.0);
            if IS_ENABLED { zeroize_flat(&mut self.0); }
        }
        
    }
}

// Impl Deref and DerefMut for `ZeroizeOnDrop`

UPD: FlatPod and FlatZod are removed.

It can be introduced in a backward-compatible way, but for clarity it's probably worth to release it as v2.0.

Users would write code like this:

pub struct Foo {
    secret_cipher: Zod<Aes128>,
    secret_key: Box<Zod<[u8; 16]>>,
    non_secret_hasher: Sha256,
    // other fields
}

const ZOD: bool = cfg!(feature = "zeroize");

pub struct Bar1 {
    // This field will be zeroized only if `zeroize` feature is enabled
    cipher: Zod<Aes128, ZOD>,
}

// Alternatively:
pub struct Bar2<const ZOD: bool = true> {
    cipher: Zod<Aes128, ZOD>,
}

While it will be a bit less convinient, I think it's useful to have explicit indication in source code that secret types will be zeroized on drop. We may provide aliases like type ZodAes128 = Zod<Aes128> to improve visibility, but I don't think they are worth the trouble and it should be sufficient to simply references zeroize in docs.

@daxpedda
Copy link
Contributor

daxpedda commented Jan 22, 2024

  • Procedural macros are relatively heavy compile time-wise, so usually we do not derive Zeroize ad rely on manual implementations.
  • ZeroizeOnDrop is a somewhat useless trait, I haven't seen it used in practice.

The point of ZeroizeOnDrop is to run zeroize() on Drop, which is a very common thing to do with the current design of zeroize. A quick search on GitHub (or in its old Zeroize(drop) form) shows that its widely used.

As long as Zeroize exists, ZeroizeOnDrop seems to make a lot of sense to me.

@newpavlov
Copy link
Member Author

newpavlov commented Jan 22, 2024

The point of ZeroizeOnDrop is to run zeroize() on Drop

No, it's not. zeroize::ZeroizeOnDrop is nothing more than a marker trait which indicates (but does not guarantee in any way) that type has a zeroizing Drop impl. Most of "uses" which your search query has found are simply implementation of this trait, not uses which influence anything practical. On first 5 pages I found only 2 uses which rely on it for bounds.

@daxpedda
Copy link
Contributor

The point of ZeroizeOnDrop is to run zeroize() on Drop

No, it's not. zeroize::ZeroizeOnDrop is nothing more than a marker trait which indicates (but does not guarantee in any way) that type has a zeroizing Drop impl. Most of "uses" which your search query has found are simply implementation of this trait, not uses which influence anything practical. On first 5 pages I found only 2 uses which rely on it for bounds.

I was talking about the proc-macro, not the trait itself. I guess your original point then was about the trait not the proc-macro?

Unfortunately I don't remember the motivation behind the ZeroizeOnDrop trait anymore, but I'm happy to dig into it again when we get there.

@newpavlov
Copy link
Member Author

I guess your original point then was about the trait not the proc-macro?

Yes. As I wrote in the OP, in RustCrypto we usually do not rely on proc macro and instead manually implement Drop and ZeroizeOnDrop.

@tarcieri
Copy link
Member

I would probably suggest against any breaking changes to zeroize.

Due to the trait-based nature of its API, which is used across crates, it makes breaking changes extremely difficult due to the requirement for coordinated upgrades.

I find names like FlatPod and Zod to be rather confusing and jargony. "Pod" doesn't intuitively imply "plain old data" to me but rather some sort of container like a "cell". At least the existing traits/structs have "Zeroize" in the name to make it clear what they do.

DefaultIsZeros already seems to cover many of the cases you have in mind for this.

I think #1045 is fine, but should be done in an additive manner.

@tarcieri
Copy link
Member

dsa and rsa are some examples of where data isn't flat. Keys have a single secret field, but the type which stores them is a newtype for e.g. Box/Vec like num_bigint_dig::BigUint or crypto_bigint::BoxedUint. We don't want to just blanket add a Drop impl to these types because not all of them are secrets and it would negatively impact performance. The knowledge of which specific fields are secrets exists in the Drop impls in dsa/rsa.

I also think it would be good to be able to silently and transparently promote secret storage for other types to be heap-backed as well when available when the alloc feature is available, so as to better avoid making copies on moves.

@newpavlov
Copy link
Member Author

newpavlov commented Jan 22, 2024

Due to the trait-based nature of its API, which is used across crates, it makes breaking changes extremely difficult due to the requirement for coordinated upgrades.

For most of the RustCrypto crates upgrade should be quite painless since they do not provide Zeroize impls and instead only have zeroizing Drop and ZeroizeOnDrop. As argued above, the latter is not used much in practice for trait boudns, so it can be removed with minimal consequences. The biggest breaking change will be removal of zeroize crate features.

I find names like FlatPod and Zod to be rather confusing and jargony.

I do not insist on this exact names, think of them as of placehoders. FlatPod probably should not be introduced in the first place, since it's only advisory and its usefulness is extremely limited. As for Zod, I used short name only for brevity. We probably should name it fully as ZeroizeOnDrop with additional type Zod<T> = ZeroizeOnDrop<T>; alias.

DefaultIsZeros already seems to cover many of the cases you have in mind for this.

But it should not be implemented for types like Aes128. The main idea is that Zod will perform indiscriminate zeroization without any need for support in crates which introduce "flat" types (so most of our crates).

Keys have a single secret field, but the type which stores them is a newtype for e.g. Box/Vec like num_bigint_dig::BigUint or crypto_bigint::BoxedUint.

Such types can have ZOD constant as in the example above, but with default set to false. And we can define type BoxedSecretUint = BoxedUint<true>; alias which will be used for secret keys.

@tarcieri
Copy link
Member

For most of RustCrypto crates upgrade should be quite painless since they do not provide Zeroize impls and instead only have zeroizing Drop and ZeroizeOnDrop. As argued above, the latter is not used much in practice for trait boudns, so it can be removed with minimal consequences. The biggest breaking change will be removal of zeroize crate features.

You're completely redesigning the API, so this will be a massive breaking change (and I'm not sure I'm particularly pleased with the new design).

Unlike our other trait crates where we generally control all the downstream trait impls, Zeroize and ZeroizeOnDrop impls exist in downstream crates we don't control, so the blast radius of any such change is significantly larger.

Also if we're thinking about a zeroize redesign, I'm not sure this is solving the right problems. There are many highlighted in this paper:

https://eprint.iacr.org/2023/1713.pdf

@newpavlov
Copy link
Member Author

I have updated draft in the OP. FlatPod is removed and Zod is now a type alias.

Unlike our other trait crates where we generally control all the downstream trait impls, Zeroize and ZeroizeOnDrop impls exist in downstream crates we don't control, so the blast radius of any such change is significantly larger.

I think you overestimate potential consequences. Downstream crates which derive Zeroize/ZeroizeOnDrop on their structs will continue to work without any changes. If they decide to move to zeroize v2.0, they would need to decide where Zod should be applied.

Users may have issues if they include our type which implement Zeroize, but there is just several such types in our organization and some of those are private types.

Another potential issue is that users may rely on Drop zeroization, but with the new API they would need to use Zod, so without reading changelogs they may not notice this. I think it will be somewhat addressed by removal of zeroize features from new releases. Users will inevitable notice it during migration to new breaking releases and likely will look into changelogs to learn more about the change.

Also if we're thinking about a zeroize redesign, I'm not sure this is solving the right problems. There are many highlighted in this paper:

I've skimmed the paper and I am not sure it's relevant for deciding how public API of zeroize should look like. IIUC it's mostly concerned about stack bleaching and how to reliably implement zeroization to prevent undesirable compiler optimizations. I will read it more carefully a bit later.

@tarcieri
Copy link
Member

I think you overestimate potential consequences. Downstream crates which derive Zeroize/ZeroizeOnDrop on their structs will continue to work without any changes.

There are a lot of uses which aren't custom derive. I suggest looking through this:

https://github.com/search?q=%22impl+zeroize%22&type=code

@newpavlov
Copy link
Member Author

There are a lot of uses which aren't custom derive. I suggest looking through this:

It does not matter whether it's derive or manual implementations. The uses will continue to work until maintainers will decide to migrate to zeroize v2.0. Even if an upstream crate has migrated to v2.0, downstream may delay full migration to v2.0 by simply wrapping upstream types by Zod in their structs.

@tarcieri
Copy link
Member

The problem is when you have cross-crate interdependencies, especially when there are multiple different vendors of crates at various levels, e.g.:

protocol implementation -> elliptic curve library -> numerical library

Breaking changes on trait-based APIs are incredibly intrusive in these sorts of environments. It's why serde can't bump to 2.0 to fix const generics support. The error messages for traits wit the same name across breaking releases of the same crate can be quite inscrutible.

I guess there's the semver trick to work around this, but with this sort of API, I think the amount of downstream churn and toil needs to be carefully weighed against the benefits that breaking changes might bring.

@tarcieri
Copy link
Member

Yet another approach is to simply make a new library with a different name which provides the same functionality / end goals, which can be vetted independently, and if it seems like a better approach, you can deprecate the old one.

This is what I've explored with e.g. cmov versus subtle, where subtle is also in the position of being the sort of trait-based crate with complex interrelationships across crates with multiple vendors that make it difficult to make breaking changes (and we need breaking changes for e.g. heap-backed big integers).

@newpavlov
Copy link
Member Author

The error messages for traits wit the same name across breaking releases of the same crate can be quite inscrutible.

Note that the proposed re-design does not have Zeroize and ZeroizeOnDrop traits at all, so users should not encounter this issue.

Yet another approach is to simply make a new library with a different name which provides the same functionality / end goals, which can be vetted independently, and if it seems like a better approach, you can deprecate the old one.

Yes, it's a viable option. The only question is whether the RustCrypto crates should keep zeroize support in future breaking releases or not, if the new crate will prove itself a good solution. One of motivation factors for the proposed redesign is desire to remove a lot of zeroize boilerplate from our crates.

@tarcieri
Copy link
Member

One of motivation factors for the proposed redesign is desire to remove a lot of zeroize boilerplate from our crates.

I think we can make progress there without breaking changes. #1045 is a good start.

@tarcieri
Copy link
Member

tarcieri commented Jan 22, 2024

Zeroizing usefulness is limited. It can be used only for "primitive" types (e.g. raw keys), complex types do not implement Zeroize and instead implement zeroizing Drop.

This isn't quite true. BoxedUint impls Zeroize, for example.

Really it comes down to whether a type optionally contains secrets, or always contains secrets.

You are adding a boolean flag to your ZeroizeOnDrop type (which has an obnoxiously long name, which is why I went with Zeroizing, and yours has a confusing acronym Zod to reduce the length of the name) to capture that same difference.

Zeroizing<T> expresses the same thing as Zod<T, true>.

But what is the utility of Zod<T, false>? Why not just use T? What is the point of being generic over the secrecy? It feels like a wrapper type in that context is just getting in the way.

Generally how code is structured with zeroize now, there's an outer newtype which imposes some variants on the inner type which specifically represents secret values and also takes care of zeroizing the inner type, using Zeroizing if so desired.

@newpavlov
Copy link
Member Author

newpavlov commented Jan 22, 2024

I think we can make progress there without breaking changes. #1045 is a good start.

Yes. I do not plan to implement the redesign in the near future, so the linked PR should be merged regardless of this discussion.

Zeroizing expresses the same thing as Zod<T, true>.

Not quite. The difference is that Zod works on any type (e.g. Ctr<Aes128>), while Zeroizing requires T: Zeroize and it's often not correct to implement Zeroize for complex types.

The biggest concern with Zod is that users can not verify "flatness" of T without inspecting source code and thus external owned data may be left unerased.

But what is the utility of Zod<T, false>? Why not just use T?

Because code like this will be much more annoying to write and work with:

struct AeCipher {
    #[cfg(not(feature = "zeroize"))]
    cipher: Ctr<Aes128>,
    #[cfg(not(feature = "zeroize"))]
    hasher: Hmac<Sha256>,
    #[cfg(feature = "zeroize")]
    cipher: Zod<Ctr<Aes128>>,
    #[cfg(feature = "zeroize")]
    hasher: Zod<Hmac<Sha256>>,
}

But I think in most cases users will not use the optionality.

@tarcieri
Copy link
Member

tarcieri commented Jan 22, 2024

No code is written like that today.

Those types all take care of zeroizing themselves, and the drop impls are gated on the zeroize feature.

Zod can't help there, because it's defined in the zeroize crate.

@newpavlov
Copy link
Member Author

newpavlov commented Jan 22, 2024

No code is written like that today.

I thought you asked about the proposed API.

Those types all take care of zeroizing themselves, and the drop impls are gated on the feature.

The main point of the proposal is to move from this model and instead make users to explicitly mark zeroized data in their code using Zod instead of relying on more opaque crate features, which are easier to accidentally disable (you will not even get a compilation error in many cases) and which do not provide any optionality. For example, you may want to zeroize Hmac<Sha256>, but not Sha256 used elsewhere. Today you have no choice but to enable zeroize for the whole sha2 crate.

@tarcieri
Copy link
Member

tarcieri commented Jan 22, 2024

To do that, add a Zeroize impl on Sha256 instead of a Drop impl, then Hmac<Sha256> can call zeroize on Sha256 on drop. That's what I was saying earlier:

Really it comes down to whether a type optionally contains secrets, or always contains secrets.

And if Sha256 has a Zeroize impl, you can use Zeroizing<Sha256> to express it contains secrets you want cleared on drop.

@newpavlov
Copy link
Member Author

newpavlov commented Jan 22, 2024

To do that, add a Zeroize impl on Sha256 instead of a Drop imp

No, in my opinion, you can not. I think it's incorrect to implement Zeroize for complex types like Sha256 and Aes128, since zeroized state may not be a valid value for it (imagine it contains a NonZero* field) and we can not guarantee that zeroize() will not be called outside of Drop. And even if a zeroized state is valid (for many cases it's true), accidental zeroization of a cryptographic struct may be a security hazard. As you can recall, we intentionally do not implement Zeroize for such types.

Really it comes down to whether a type optionally contains secrets, or always contains secrets.

Secrecy is often dependent on how this type is used and this information is available only to downstream users. You can not know it just by type beforehand. For example, the same ChaCha8 type can be used for encrypting network communications and for running user-space PRNG for security insensitive game simulation. And it can happen in one application.

@tarcieri
Copy link
Member

No, in my opinion, you can not. I think it's incorrect to implement Zeroize for complex types like Sha256 and Aes128, since zeroized state may not be a valid value for it

impl DefaultIsZeros for Sha256 {} seems like a reasonable solution to me.

For Aes128, you are definitionally always dealing with secrets, so I don't think it makes sense to have a "non-secret mode" there which skips zeroization even if the zeroize crate feature is enabled.

@newpavlov
Copy link
Member Author

For Aes128, you are definitionally always dealing with secrets, so I don't think it makes sense to have a "non-secret mode" there which skips zeroization even if the zeroize crate feature is enabled.

For example, non secret mode can be useful for ephemeral cipher instances stored on stack which get erased by stack bleaching after necessary computations with the cipher are completed. The PRNG example is also applicable (after all, with AES-NI Ctr<Aes128> can be faster than ChaCha8).

impl DefaultIsZeros for Sha256 {} seems like a reasonable solution to me.

How? Such impl directly contradicts the trait docs:

Marker trait for types whose Default is the desired zeroization result

@tarcieri
Copy link
Member

You explicitly mentioned not wanting to leave Sha256 in a bad state after zeroization. Using DefaultIsZeros makes zeroize() synonymous with reset().

@tarcieri
Copy link
Member

For example, non secret mode can be useful for ephemeral cipher instances stored on stack which get erased by stack bleaching after necessary computations with the cipher are completed.

If you're using stack bleaching, you can just leave the zeroize feature off.

@newpavlov
Copy link
Member Author

Using DefaultIsZeros makes zeroize() synonymous with reset().

Ah, I was confused by the name. I thought the marker trait indicates that Default::default() returns state filled with zeros, not that "zeroization" creates a default value. But there are still issues. Firstly, DefaultIsZeroes is currently bounded by Copy, so it can not be applied to Sha256. Secondly, block-buffer may in future use uninitialized memory for default state and I am not sure it will interact properly with blanket implementation of Zeroize for DefaultIsZeroes.

If you're using stack bleaching, you can just leave the zeroize feature off.

Using stack bleaching in one part of application does not mean I don't have long-lived ciphers which should be zeroized on drop in other part. As I wrote earlier, necessity of zeroization on drop is often dependent on how we use a type.

@tarcieri
Copy link
Member

tarcieri commented Jan 22, 2024

Firstly, DefaultIsZeroes is currently bounded by Copy, so it can not be applied to Sha256. Secondly, block-buffer may in future use uninitialized memory for default state and I am not sure it will interact properly with blanket implementation of Zeroize for DefaultIsZeroes.

Okay fine, it could be written using your zeroize_flat_type function followed by a *self = Self::default().

Using stack bleaching in one part of application does not mean I don't have long-lived ciphers which should be zeroized on drop in other part. As I wrote earlier, necessity of zeroization on drop is often dependent on how we use a type.

Trying to micro-optimize the cases where sometimes you want to use zeroize using the zeroize crate and sometimes you want to use something else external to handle zeroization instead is not something else anyone has ever requested. We've generally received few complaints about performance as is, despite it not being remotely close to optimal.

Perhaps before we go down that road we can explore using zeroize_flat_type?

@nstilt1
Copy link
Contributor

nstilt1 commented Jan 22, 2024

It could be a little detrimental to make zeroize_flat_type() public with default features since the best outcome of using it is quite appealing. By making it public, a downstream user (e.g. someone using ChaCha20Rng) would be able to call that function on the rng even though it would already impl ZeroizeOnDrop.

A note about that in the documentation would help, but I think it might be beneficial to gate it behind a feature to prevent downstream users from misusing it on something that already implements ZeroizeOnDrop or on something that isn't "flat". The proposed redesign seems like it would restrict users from doing this, but a pub unsafe fn with some slightly spooky docs, gated behind a feature like hazmat or library-dev (or something) might be enough of a deterrent.

But it would be a little more ideal if it were possible to make an impl_zeroize_flat_on_drop macro that could be called on a type with generic parameters. I tried to do that, but it was not pretty, and it did not work. It might be more possible with only the drop method being defined in the macro instead of actually implementing impl<...> Drop for X<...> or impl Drop for X.

@tarcieri
Copy link
Member

@nstilt1 it's unsafe which should hopefully make people pause and read the instructions before considering it (famous last words, I know)

@ok-john
Copy link

ok-john commented Apr 23, 2024

Regarding the usage of zeroize, it may be valuable to borrow theory from the fast key erasure of the CSPRNG in the linux kernel.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants