Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Best practice for proving loop termination in firmware? #186

Open
jadephilipoom opened this issue May 6, 2021 · 22 comments
Open

Best practice for proving loop termination in firmware? #186

jadephilipoom opened this issue May 6, 2021 · 22 comments

Comments

@jadephilipoom
Copy link
Contributor

Let's say I'm trying to prove termination for a loop that looks like this:

done = 0
while (!done) {
  status = MMIOREAD (status_register_addr) ;
  done = status & 1;
}

Based on the hardware specification, I know that the loop will not run forever because the "done" bit in the status register will eventually be set to 1 (guaranteed after a certain maximum number of cycles, which translates to a maximum number of MMIO reads). I've gotten the proof working by writing my ext_spec as essentially trace-allowed-by-hardware-spec -> postcondition. However, looking at the examples in the repo (mainly FE310CSemantics.v), the ext_specs don't have any preconditions. I'm wondering if there's a better way to do what I want to do here, or if I'm violating a convention and setting myself up for later problems.

@andres-erbsen
Copy link
Contributor

Our functions always terminate regardless of mmio input, the postconditions have a separare timeout case. If you can rely on the peripheral device completing its job within a fixed bound of mmio interactions, your way is better (or at least so I have believed without trying it out too hard).

@jadephilipoom
Copy link
Contributor Author

Where is the separate timeout case? Looking at the WeakestPrecondition.cmd_body clause for while, it really seems like I have to prove termination:

| cmd.while e c =>
exists measure (lt:measure->measure->Prop) (inv:measure->trace->mem->locals->Prop),
Coq.Init.Wf.well_founded lt /\
(exists v, inv v t m l) /\
(forall v t m l, inv v t m l ->
bind_ex b <- dexpr m l e;
(word.unsigned b <> 0%Z -> rec c t m l (fun t' m l =>
exists v', inv v' t' m l /\ lt v' v)) /\
(word.unsigned b = 0%Z -> post t m l))

Or do you mean something other than WeakestPrecondition?

@andres-erbsen
Copy link
Contributor

@jadephilipoom
Copy link
Contributor Author

Ah, I think I now understand -- by "our programs" you mean the lightbulb functions, not bedrock2 programs in general. Okay, that makes sense to me. I think for my use-case I prefer to experiment with relying on the hardware guarantees to prove termination, but good to know why I differ from the examples here. Thanks!

@jadephilipoom
Copy link
Contributor Author

jadephilipoom commented May 11, 2021

Just a follow-up about this in case someone is curious later on. Using the hardware spec to prove loop termination works just fine for proving the bedrock2 programs correct, but I think it's unworkable with the compiler proofs. I'm loosely following the example from MMIO.v, trying to prove an analog of this lemma:

Lemma compile_ext_call_correct: forall resvars extcall argvars,
FlatToRiscvCommon.compiles_FlatToRiscv_correctly
(@FlatToRiscvDef.compile_ext_call compilation_params)
(FlatImp.SInteract resvars extcall argvars).

I run into problems because at some point I have a proof state that says essentially:

l : locals
t : trace
m : mem
H1 : forall x : word, read_valid t x -> outcome map.empty [x]
H2 : forall mRecv val,
       outcome mRecv [val] ->
       postH (map.empty, READ, [addr], (mRecv, [val]) :: t) m (map.put l name val)
w : word
===================
exists finalTrace finalMem finalLocals,
postH finalTrace finalMem finalLocals
/\ map.extends (map.put initialLocals name w) finalLocals
/\ <other things to prove>

Essentially, I need to use the conclusion of H1 to proceed with the proof, and it needs to be specialized to w (the value I just loaded via an MMIO read) for the map.extends in the conclusion to be provable. But I don't know anything about w, and can't prove it's valid, so the proof is stuck. I tried a few different formulations but couldn't figure out a workaround for this.

In summary: the compiler proofs require that ext_spec place no restrictions on the value being read for the postcondition to hold; ext_spec must imply that for all values read, the postcondition holds, so it's not actually doable to use ext_spec for control-flow reasoning (at least without changes to compiler proofs).

@andres-erbsen
Copy link
Contributor

I'm surprised by this issue, I fully expect that the compiler should be able to pass through extspec assumptions from hardware to software.

@andres-erbsen andres-erbsen reopened this May 11, 2021
@jadephilipoom
Copy link
Contributor Author

It's definitely possible that I'm missing something or doing an unsafe rewrite somewhere. To provide a little more detail, then: I'm trying to prove a compiles_FlatToRiscV_correctly statement. As best I can tell, the map.extends comes from the goodMachine clause in compiles_FlatToRiscV_correctly:

Definition compiles_FlatToRiscv_correctly
(f: funname_env Z -> Z -> Z -> stmt -> list Instruction)
(s: stmt): Prop :=
forall e_impl_full initialTrace initialMH initialRegsH initialMetricsH postH,
exec e_impl_full s initialTrace (initialMH: mem) initialRegsH initialMetricsH postH ->
forall (g: GhostConsts) (initialL: MetricRiscvMachine) (pos: Z),
map.extends e_impl_full g.(e_impl) ->
good_e_impl g.(e_impl) g.(e_pos) ->
fits_stack g.(rem_framewords) g.(rem_stackwords) g.(e_impl) s ->
f g.(e_pos) pos (bytes_per_word * g.(rem_framewords)) s = g.(insts) ->
stmt_not_too_big s ->
valid_FlatImp_vars s ->
pos mod 4 = 0 ->
(word.unsigned g.(program_base)) mod 4 = 0 ->
initialL.(getPc) = word.add g.(program_base) (word.of_Z pos) ->
g.(p_insts) = word.add g.(program_base) (word.of_Z pos) ->
goodMachine initialTrace initialMH initialRegsH g initialL ->
runsTo initialL (fun finalL => exists finalTrace finalMH finalRegsH finalMetricsH,
postH finalTrace finalMH finalRegsH finalMetricsH /\
finalL.(getPc) = word.add initialL.(getPc)
(word.of_Z (4 * Z.of_nat (List.length g.(insts)))) /\
map.only_differ initialL.(getRegs)
(union (of_list (modVars_as_list Z.eqb s)) (singleton_set RegisterNames.ra))
finalL.(getRegs) /\
goodMachine finalTrace finalMH finalRegsH g finalL).

I'll try to produce a minimal example.

@jadephilipoom
Copy link
Contributor Author

jadephilipoom commented May 11, 2021

Minimal example pushed to https://github.com/jadephilipoom/bedrock2/tree/hardware-guarantees

Diff from master (3084575):

diff --git a/bedrock2/src/bedrock2/FE310CSemantics.v b/bedrock2/src/bedrock2/FE310CSemantics.v
index 46935987..3513ca66 100644
--- a/bedrock2/src/bedrock2/FE310CSemantics.v
+++ b/bedrock2/src/bedrock2/FE310CSemantics.v
@@ -39,6 +39,9 @@ Section WithParameters.
   Definition isMMIOAligned (n : nat) (addr : parameters.word) :=
     n = 4%nat /\ word.unsigned addr mod 4 = 0.
 
+  (* hardware spec : MMIO reads are always 0 *)
+  Definition read_valid (val : parameters.word) : Prop := val = word.of_Z 0.
+
   Definition ext_spec (t : bedrock2_trace) (mGive : parameters.mem) a (args: list parameters.word) (post:parameters.mem -> list parameters.word -> Prop) :=
     if String.eqb "MMIOWRITE" a then
       exists addr val,
@@ -49,7 +52,7 @@ Section WithParameters.
       exists addr,
         args = [addr] /\
         (mGive = Interface.map.empty /\ isMMIOAddr addr /\ word.unsigned addr mod 4 = 0) /\
-        forall val, post Interface.map.empty [val]
+        forall val, read_valid val -> post Interface.map.empty [val]
     else False.
 
   Global Instance semantics_parameters  : Semantics.parameters :=
@@ -75,7 +78,7 @@ Section WithParameters.
       | H: exists _, _ |- _ => destruct H
       | H: _ /\ _ |- _ => destruct H
       | H: False |- _ => destruct H
-    end; subst; eauto 8 using Properties.map.same_domain_refl.
+    end; subst; eauto 10 using Properties.map.same_domain_refl.
   Qed.
 
   Global Instance ok : Semantics.parameters_ok semantics_parameters.
diff --git a/compiler/src/compilerExamples/MMIO.v b/compiler/src/compilerExamples/MMIO.v
index e929eab4..c0c78026 100644
--- a/compiler/src/compilerExamples/MMIO.v
+++ b/compiler/src/compilerExamples/MMIO.v
@@ -470,6 +470,7 @@ Section MMIO1.
       end.
       cbv [ext_spec FlatToRiscvCommon.Semantics_params FlatToRiscvCommon.ext_spec FE310CSemantics.ext_spec] in Ex.
       simpl in *|-.
+      assert (FE310CSemantics.read_valid (word.of_Z 0)) by reflexivity.
 
       rewrite E in *.
       destruct ("MMIOREAD" =? action)%string eqn:EE in Ex; try contradiction.

The assertion in MMIO.v helps some earlier conditions go through by proving there exists some possible valid value; this is something I can prove with my real example as well if needed.

@andres-erbsen
Copy link
Contributor

I think you want to add a similar assumption to https://github.com/mit-plv/riscv-coq/blob/master/src/riscv/Platform/MinimalMMIO.v#L63

@samuelgruetter
Copy link
Contributor

I'd expect that you need to change the semantics of the RiscvMachine to state that read_valid t x holds for all MMIO reads, probably a change in riscv.Platform.MinimalMMIO.nonmem_load is needed

@samuelgruetter
Copy link
Contributor

... and as I post this, I see that Andres thinks so too, but 2min before me 😅

@jadephilipoom
Copy link
Contributor Author

jadephilipoom commented May 11, 2021

Thanks both! I'm testing it out now and will report back 🙂
[edit: adding the precondition breaks the proof of interpret_action_total in MinimaMMIO.v, so my quick test might not be quite so quick]

@jadephilipoom
Copy link
Contributor Author

It works! 🎉 Thanks for the help!

If I can make it compatible with existing examples, would you all be willing to consider a PR to riscv-coq that adds a field to MMIOSpec for hardware guarantees? Essentially it lets you insert a Prop that, given the trace, the current read address, and the value that has been read, states whether the read is valid. For existing examples it would just be True.

@jadephilipoom
Copy link
Contributor Author

jadephilipoom commented May 11, 2021

Concretely, here's the diff: https://gist.github.com/jadephilipoom/1e8e432d27a6e41162b0ae023ffa9994 (have not tested examples yet)

@samuelgruetter
Copy link
Contributor

I think such a PR would make sense (once you've connected everything to confirm that it works).
Regarding the line

    (* there exists at least one valid MMIO read value (set is nonempty) *)
    /\ (exists v : HList.tuple byte n, MMIOReadOK n (getLog mach) a (signedByteTupleToReg v))

I guess this will show up as a proof obligation when proving correctness of the MMIO compiler (in compilerExamples.MMIO). Did you already encounter this obligation, and how did you prove it?
That line was probably needed to prove riscv.Spec.MetricPrimitives.mcomp_sane, and if we need to change that, things could get interesting 😅

@jadephilipoom
Copy link
Contributor Author

jadephilipoom commented May 11, 2021

I think such a PR would make sense (once you've connected everything to confirm that it works).

Cool, I'll make one then.

I guess this will show up as a proof obligation when proving correctness of the MMIO compiler (in compilerExamples.MMIO). Did you already encounter this obligation, and how did you prove it?

I just made my own ext_spec have the same structure -- require that there's at least one possible valid trace after this read (i.e. that the read arguments make sense) and separately that the postconditions holds for all valid reads.

I haven't edited the existing MMIO.v yet, but since MMIOReadOK would be just fun _ _ _ _ => True, the condition should be trivially satisfied by any word.

@samuelgruetter
Copy link
Contributor

I just made my own ext_spec have the same structure -- require that there's at least one possible valid trace after this read (i.e. that the read arguments make sense) and separately that the postconditions holds for all valid reads.

Aah I see, so you just push the problem up to the next layer... 😉
So my question becomes: Have you already proven a non-trivial "there exists at least one valid value to be read" condition in a program logic proof? If you're in a loop like the one you posted at the beginning of this issue, wouldn't this makeit impossible to use your trick of using the assumption that the firmware returns a valid value after a bounded number of pollings?

@jadephilipoom
Copy link
Contributor Author

So my question becomes: Have you already proven a non-trivial "there exists at least one valid value to be read" condition in a program logic proof?

Yes, I've done one for a small firmware program that interfacing with a peripheral block that takes a single-register input via an MMIO write and after an unknown but upper-bounded number of cycles signals it's done via a flag in a separate "status" register which indicates the output register is now safe to read. My example specification for what the circuit actually does is just "add 1 to the input", but you can plug in any word -> word function.The firmware program writes the value, reads the status register until it sees the "done" flag, and then reads the output register and returns the output. The spec of the program is that, assuming the peripheral block behaves as expected, the program will return the correct result according to the circuit spec. I have to reason about the hardware guarantees in program logic to prove that the firmware program's loop terminates.

The code is here (in a branch on my fork, can't be merged to main branch yet because it's on top of the changes I've made to riscv-coq):
https://github.com/jadephilipoom/oak-hardware/blob/7dfee319d10e52cba51e63579ca3a3744e159890/investigations/bedrock2/IncrementWait

Possible files of interest:
IncrementWait.v : implementation of firmware program in bedrock2
IncrementWaitSemantics.v : Semantics.parameters instance and state-machine reasoning for the circuit
IncrementWaitProperties.v : spec + proof of the firmware program
IncrementWaitMMIO.v : analog of MMIO.v, contains compiles_FlatToRiscV_correctly proof

The signalling structure here is designed to be a minimal imitation of the signalling for an AES block I'm trying to do some firmware verification for. I've got some bedrock2 code + proofs for that (which is in the main branch), although I'll need to change some things so it works with the compiler proofs (I'm using the small example circuit to figure out how the core logical structure works without dealing with the many registers and more complicated logic of the full block). That example would be very non-trivial, and I am pretty confident that it will work given that the smaller one does.

If you're in a loop like the one you posted at the beginning of this issue, wouldn't this make it impossible to use your trick of using the assumption that the firmware returns a valid value after a bounded number of pollings?

No, I successfully use the trick on top of the proposed riscv-coq change with the program here: https://github.com/jadephilipoom/oak-hardware/blob/7dfee319d10e52cba51e63579ca3a3744e159890/investigations/bedrock2/IncrementWait/IncrementWaitProperties.v#L291

In the software proof, you end up with two goals: 1) there exists some valid read and 2) given a read value and the fact that it was valid, your postcondition holds. You can see that in the helper lemma for read interactions:
https://github.com/jadephilipoom/oak-hardware/blob/7dfee319d10e52cba51e63579ca3a3744e159890/investigations/bedrock2/IncrementWait/IncrementWaitProperties.v#L74

The "valid read" condition for my state-machine reasoning is, in prose: "given the trace so far t and the read address addr, there exists some state s such that s is a possible final state after t is executed (exists s, execution t s) and also there exists a register r such that the address of r is addr and the combination of s, r, and the value that was read is valid according to the read_step condition". The read_step condition destructs the cases based on the register and state and computes a Prop with restrictions on the value and the new state. So ultimately, proving goal (1) ("exists some valid read") amounts to saying that the address is a valid register address, a read is permitted on this register given the trace so far, and that the restrictions read_step places on the value that is read do not exclude all possible values -- actually things I do want my software proof to guarantee!

In goal (2) ("postcondition holds for all valid reads") you already have the guarantee in context that the read was valid -- this allows you to use things like "I know my circuit will not say it's still busy for more than N cycles" to prove loop termination etc.

Sorry for the essay, just wanted to talk you through the details of how I've set this up!

@samuelgruetter
Copy link
Contributor

Thanks for the details, so I'll try to summarize my confusion and how it got resolved: I was assuming you're doing 1), but in fact you're doing 2):

  1. Whenever you call an MMIO read, you get to assume a "rely" (condition on the trace produced so far, without the new event) saying that the device behaved according to its spec (eg, not more than n consecutive polls returning "busy"). In this setting, you might make MMIO read calls without knowing whether a valid next state exists, but if none exists, the "rely" you get to assume is a contradiction. Using this approach would not work with your PR.
  2. The ext_spec of MMIO reads specifies that it can only return "busy" if there were less than n previous calls returning "busy", so you can make sure to only call an MMIO read if you know that the device has a valid next state

@andres-erbsen
Copy link
Contributor

I think the overall approach here makes sense, and I'm fine with merging this parameter.

I'm not sure whether exposing the "exists and forall" structure to program logic proofs is the most ergonomic approach (but it's not being merged to bedrock2, so idc). Intead I imagine we could prove a lemma about your extspec saying that if it holds for some parameters then the postcondition parameter must be non-vacuous, use a context variable representing lemma to prove totality of riscv/bedrock2 semantics, and leave software proofs with more natural goals...

@jadephilipoom
Copy link
Contributor Author

I'm not sure whether exposing the "exists and forall" structure to program logic proofs is the most ergonomic approach (but it's not being merged to bedrock2, so idc). Intead I imagine we could prove a lemma about your extspec saying that if it holds for some parameters then the postcondition parameter must be non-vacuous, use a context variable representing lemma to prove totality of riscv/bedrock2 semantics, and leave software proofs with more natural goals...

This seems like a good idea; I'm not sure I know how to implement it but would be interested in this kind of layer. In my particular use case, the existential goals are pretty easy to dispatch.

@samuelgruetter
Copy link
Contributor

prove a lemma about your extspec saying that if it holds for some parameters then the postcondition parameter must be non-vacuous

That sounds interesting, but how to make sure that the extspec satisfies this lemma? The only way I can think of would be to include in the extspec that existential that we'd like to get rid of...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants