
Why does Zeroize use a fence? #988

Open
oconnor663 opened this issue Dec 1, 2023 · 16 comments

@oconnor663

oconnor663 commented Dec 1, 2023

The docs mention that the fence prevents reordering "as a precaution to help ensure reads are not reordered before memory has been zeroed". My understanding is that the compiler and the hardware are already obligated to make sure that a thread reads its own writes, so reorderings are only possible from the perspective of other threads (or interrupts). But zeroize is using a compiler fence rather than a full atomic fence, so it doesn't seem like we're worried about other threads here. Am I missing some other scenario? What reorderings could bite us here without the fence?
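For context, a rough sketch of the pattern under discussion (not zeroize's exact code; the function name is made up for illustration): volatile writes of zero bytes, followed by a compiler-only fence.

```rust
use std::ptr;
use std::sync::atomic::{compiler_fence, Ordering};

// Sketch of the volatile-write-plus-compiler-fence pattern. The fence
// prevents the *compiler* (but not the CPU) from reordering later memory
// accesses before the zeroing.
fn zero_in_place(buf: &mut [u8]) {
    for b in buf.iter_mut() {
        // Volatile write: the compiler must treat it as observable
        // and may not elide it.
        unsafe { ptr::write_volatile(b, 0) };
    }
    compiler_fence(Ordering::SeqCst);
}

fn main() {
    let mut secret = *b"hunter2";
    zero_in_place(&mut secret);
    assert!(secret.iter().all(|&b| b == 0));
}
```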

@tarcieri
Member

tarcieri commented Dec 1, 2023

I believe when it was originally added the concerns were things like interruption by signal handlers, though now I can't find the previous discussion

@newpavlov
Member

newpavlov commented Jan 22, 2024

I also think that the fences are not necessary. With the current crate MSRV we could even replace the volatile writes with plain writes followed by a "black box" asm! block:

asm!(
    "# {x}", // "use" named argument inside comment
    x = in(reg) x,
    options(nostack, readonly, preserves_flags),
);

As we can see, this is sufficient to prevent the compiler from eliminating the zeroization, and it should be universal across supported targets.
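A fuller, compilable sketch of that approach (the function name is hypothetical, and this assumes a target where `asm!` is stable):

```rust
use std::arch::asm;

// Plain writes followed by an empty asm! block that takes a pointer to the
// buffer. Because the block is not marked "pure", the compiler must assume
// the zeroed bytes may be observed, so the writes cannot be eliminated.
fn zeroize_bytes(buf: &mut [u8]) {
    for b in buf.iter_mut() {
        *b = 0;
    }
    unsafe {
        asm!(
            "# {ptr}", // "use" the pointer inside an assembler comment
            ptr = in(reg) buf.as_ptr(),
            options(nostack, readonly, preserves_flags),
        );
    }
}

fn main() {
    let mut secret = *b"hunter2";
    zeroize_bytes(&mut secret);
    assert!(secret.iter().all(|&b| b == 0));
}
```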

@tarcieri
Member

asm! is only supported on a limited number of platforms, so we still need a portable fallback.

BTW this is more or less what was proposed in #841.

@newpavlov
Member

asm! is only supported on a limited number of platforms

Hm, yes. I thought an empty asm! block would work even where the full syntax is not yet stable, but that's not the case.

We should probably then use the asm! black box on supported targets and fall back to core::hint::black_box on a best-effort basis on the others. I played with the latter on Godbolt, and on simple examples it works as expected, with slightly worse code generation (on x86 it adds two unnecessary instructions).
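A minimal sketch of that fallback (hypothetical function name; as discussed below, `black_box` carries no hard guarantee, so this is strictly best-effort):

```rust
use std::hint::black_box;

// Best-effort fallback for targets without asm! support: plain writes,
// then black_box on the buffer, which asks the optimizer (without any
// guarantee) to treat the contents as used so the writes stay live.
fn zeroize_fallback(buf: &mut [u8]) {
    for b in buf.iter_mut() {
        *b = 0;
    }
    black_box(buf);
}

fn main() {
    let mut secret = *b"hunter2";
    zeroize_fallback(&mut secret);
    assert!(secret.iter().all(|&b| b == 0));
}
```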

As for the fences, I think they are simply not necessary. The lack of explicit memory synchronization may cause other cores to briefly see non-zeroized data through relaxed reads, but IIUC that can only happen if the data was already cached by that core.

@tarcieri
Member

tarcieri commented Jan 22, 2024

We probably then should use the asm! black box on supported targets and use core::hint::black_box as a best-effort fallback on others.

The documentation for core::hint::black_box is pretty adamant that it shouldn't be used for these sorts of applications, and I think it would at least invite questions if we were to use it, even if the codegen spot-checks look fine:

https://doc.rust-lang.org/stable/core/hint/fn.black_box.html

"Note however, that black_box is only (and can only be) provided on a best-effort basis. The extent to which it can block optimisations may vary depending upon the platform and code-gen backend used. Programs cannot rely on black_box for correctness, beyond it behaving as the identity function. As such, it must not be relied upon to control critical program behavior. This immediately precludes any direct use of this function for cryptographic or security purposes."

As for the fences, I think they are simply not necessary.

I can't find a good justification for them, so they can probably be removed.

@newpavlov
Member

This is why I wrote that black_box is "best effort". According to the rust-lang Zulip discussion linked in the PR, the scary documentation is mostly a way for the Rust team to waive responsibility for potential surprises in black_box codegen. IIUC the main concern is about using black_box to enforce constant-time execution, not about elimination of writes. But it's the most suitable tool we currently have in the language outside of empty asm! blocks. Our use of write_volatile could itself be considered a misuse, since the function is intended for IO memory, so in theory the compiler could eliminate it when it's used on stack memory.

@tarcieri
Member

@newpavlov based on my discussions with the Unsafe Code Working Group, zeroize's usage of write_volatile is fine. In fact, it led to changing some of the language in the documentation which suggested otherwise:

rust-lang/rust#60972

cc @RalfJung

@oconnor663 changed the title from "Why does Zeroice use a fence?" to "Why does Zeroize use a fence?" Jan 22, 2024
@RalfJung

RalfJung commented Jan 22, 2024 via email

@newpavlov
Member

@RalfJung
Could you please comment on the difference between core::hint::black_box and an empty asm! block like the one here? I wonder why the former emits additional unnecessary instructions. Is it guaranteed that the asm! block makes the value "observed" and that the compiler cannot eliminate the last writes to it?

@RalfJung

asm! just guarantees that the assembly gets emitted, nothing else. black_box doesn't hard-guarantee anything, as stated in its docs.

Volatile writes seem like the most reliable approach to me, but I'm not an expert in what compilers will actually do when it comes to secret management. I can only comment on the language spec, which is not very useful in this case, since the spec does not account for secrets.

@newpavlov
Member

newpavlov commented Jan 22, 2024

asm! just guarantees that that assembly gets emitted

The question is whether the compiler is allowed to look inside a provided assembly block or not. If during optimization passes it may only analyze the provided input/output arguments and options (let's set aside register allocation for named arguments for now), then I think that would be a sufficient guarantee for us.

With the empty assembly block we rely on the fact that we do not mark the block as "pure" and that we pass a pointer to the zeroized data to it as an input argument. In other words, we make the compiler believe that the value may be observed and that this observation may have side effects, so the compiler cannot remove the zeroization preceding the block. An "unnecessarily smart" compiler might peek inside the assembly block and prove that the pointer is not actually used, or that the block is "pure", and then apply the undesirable optimizations.

Unfortunately, we have already encountered at least one case (honestly, it was a really unpleasant surprise) where the compiler does "peek" inside an assembly block (it does not allow using certain target-feature-dependent instructions without enabling the appropriate target features), so it would be nice to clarify what we can and cannot rely on regarding assembly blocks.

@RalfJung

I'm the wrong person for these questions. I was under the impression that it doesn't "peek" into asm blocks at all.

Cc @Amanieu

@newpavlov
Member

newpavlov commented Jan 22, 2024

I was under the impression that it doesn't "peek" into asm blocks at all.

Here is the relevant issue: rust-lang/rust#112709

aarch64-unknown-linux-gnu does not assume that AES instructions are always available, and attempting to use AES instructions in inline assembly without enabling the aes target feature will result in an error.

@Amanieu

Amanieu commented Jan 22, 2024

The question is whether the compiler is allowed to look inside a provided assembly block or not. If during optimization passes it may only analyze the provided input/output arguments and options (let's set aside register allocation for named arguments for now), then I think that would be a sufficient guarantee for us.

The problem isn't the compiler peeking into the asm. The current implementation as written does guarantee that a copy of the data is zeroed. But nothing in the compiler/optimizer spec guarantees that this is the only copy of the data that exists. You're relying on the optimizer not introducing unnecessary copies, which it is perfectly allowed to do, although it usually won't, because that tends to result in worse performance.
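One visible instance of the extra-copies problem (illustrative only; the function names are made up): a `Copy` array passed by value duplicates the secret bytes into the callee's frame, and zeroing that copy leaves the caller's copy, plus any spill slots the optimizer creates, untouched.

```rust
use std::ptr;

// Zeroes *this* copy of the key after use. The caller's copy, and any
// register spills the optimizer introduces, are not touched.
fn use_and_zero(mut key: [u8; 16]) -> u8 {
    let sum = key.iter().fold(0u8, |a, &b| a.wrapping_add(b));
    for b in key.iter_mut() {
        unsafe { ptr::write_volatile(b, 0) };
    }
    sum
}

fn main() {
    let key = [7u8; 16]; // this copy lives in main's frame
    let s = use_and_zero(key); // [u8; 16] is Copy: a second copy is passed
    assert_eq!(s, 112);
    assert_eq!(key[0], 7); // the original bytes are still present
}
```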

@tarcieri
Member

FWIW, much of what @Amanieu is talking about is highlighted in this paper: https://eprint.iacr.org/2023/1713.pdf

We should definitely add stack spilling to the list of caveats documented here:

utils/zeroize/src/lib.rs

Lines 193 to 195 in 160fa8c

//! - Moves and [`Copy`]
//! - Heap reallocation when using [`Vec`] and [`String`]
//! - Borrowers of a reference making copies of the data
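The heap-reallocation caveat in that list can be sketched as follows (whether the allocator actually moves the buffer is not guaranteed, hence no assertion on the pointer):

```rust
fn main() {
    let mut v: Vec<u8> = Vec::with_capacity(4);
    v.extend_from_slice(b"key!");
    // Pushing past capacity may move the contents to a new allocation;
    // the old buffer is then freed without being zeroed, and zeroizing
    // `v` afterwards only clears the *new* allocation.
    v.push(b'!');
    assert_eq!(v.len(), 5);
    assert!(v.capacity() > 4);
}
```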

@newpavlov
Member

You're relying on the optimizer not introducing unnecessary copies

Yes, that's true, but it's a separate issue. Guaranteeing that the explicit copy is zeroized is a fine start.
