
Merkle tree proof implementation #285

Open
wants to merge 6 commits into base: main
Conversation

vembacher
Contributor

@vembacher vembacher commented Jul 19, 2023

Summary

Related to: #283

This adds implementations for:

  • consistency proofs
  • inclusion proofs
  • checkpoint/STH related functionality

The Merkle proofs are essentially ports of the transparency-dev implementations, including the test suite.
The checkpoint related code is based on the implementation in the rekor Go package.

I also changed the Rekor models to use the new SignedCheckpoint type, which implements the serde traits. For now I think this is the only major breaking change; the rest are mostly new or private APIs.

I have not implemented the logic to verify that Checkpoints and the corresponding consistency/inclusion proof are sound together. I want to discuss how to handle this here.
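For readers following along, the inclusion-proof verification mentioned above boils down to the RFC 6962 root computation: walk from the leaf hash up to the root, combining with the sibling hash from the proof at each level. The sketch below is illustrative only (names are not this PR's API), and the `hash_children` closure stands in for the real SHA-256 node hash `H(0x01 || left || right)` so the sketch stays dependency-free.

```rust
/// Recompute the Merkle root from a leaf hash and an inclusion proof,
/// following the RFC 6962 walk. Illustrative sketch; not this PR's API.
fn root_from_inclusion_proof<H>(
    mut index: u64,       // leaf index in the tree
    tree_size: u64,       // number of leaves
    leaf_hash: Vec<u8>,   // hash of the leaf being proven
    proof: &[Vec<u8>],    // sibling hashes, leaf to root
    hash_children: H,     // stands in for H(0x01 || left || right)
) -> Option<Vec<u8>>
where
    H: Fn(&[u8], &[u8]) -> Vec<u8>,
{
    if index >= tree_size {
        return None;
    }
    let mut hash = leaf_hash;
    let mut last = tree_size - 1; // index of the last node on this level
    let mut siblings = proof.iter();
    while last > 0 {
        if index % 2 == 1 {
            // Right child: the sibling from the proof goes on the left.
            hash = hash_children(siblings.next()?, &hash);
        } else if index < last {
            // Left child with a right sibling.
            hash = hash_children(&hash, siblings.next()?);
        }
        // Otherwise: rightmost node with no sibling; carry the hash up.
        index /= 2;
        last /= 2;
    }
    // Every proof entry must have been consumed.
    siblings.next().is_none().then(|| hash)
}

fn main() {
    // Toy "hash": parenthesized concatenation, so results are readable.
    let h = |l: &[u8], r: &[u8]| {
        let mut v = vec![b'('];
        v.extend_from_slice(l);
        v.push(b',');
        v.extend_from_slice(r);
        v.push(b')');
        v
    };
    // 3-leaf tree [a, b, c]: root is H(H(a, b), c), so the proof for
    // leaf c (index 2) is just [H(a, b)].
    let root = root_from_inclusion_proof(2, 3, b"c".to_vec(), &[b"(a,b)".to_vec()], h);
    assert_eq!(root, Some(b"((a,b),c)".to_vec()));
    println!("root: {}", String::from_utf8(root.unwrap()).unwrap());
}
```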

Release Note

  • added functions to relevant Rekor models to support the verification of Merkle inclusion and consistency proofs
  • added additional variants to the SigstoreError enum
  • added a Checkpoint type to handle verification of checkpoints/STHs
  • changed the type of the signed_tree_head field from String to Checkpoint
  • added a checkpoint: Option<Checkpoint> field to the InclusionProof struct

Documentation

  • relevant new features are documented via RustDoc or examples
  • breaking changes primarily affect LogInfo and LogEntry due to changes to struct fields; this should only require minor changes for users

Todos:

  • add examples
  • add tests for Rekor models verification functions
  • write change-log
  • discuss how to verify that checkpoint + proof are valid together

@lukehinds
Member

One for the TODO list would be something for examples/

Nice work btw!

@vembacher
Contributor Author

I added examples for consistency and inclusion proofs.

I will also add some tests in the future for the methods I added to the Rekor models. What test data should I use for this, as these require log entries + keys?

Also, I'm unsure whether my checks that proofs and checkpoints are valid together are correct. What is the correct way to do this?

@flavio
Member

flavio commented Sep 7, 2023

Sorry, this has been sitting on my review list for a long time. I need some time to get familiar with the feature being implemented 😓

@vembacher
Contributor Author

> Sorry, this has been sitting on my review list for a long time. I need some time to get familiar with the feature being implemented 😓

No worries! I'm not in a rush to complete this.

@gaetanww

I received an error from the current implementation when I tested it on the entry with index 25579.

Problem

I received the following error although the verification should have succeeded.

InclusionProofError(MismatchedRoot { expected: "db0822d0508b54bfb770ba5d3394d331bd9a22c91d9a8c0b906e7ea2208903be", got: "4d006aa46efcb607dd51d900b1213754c50cc9251c3405c6c2561d9d6a2f3239" })

Underlying problem

The underlying error is, I believe, related to the assumption that the body is formatted as canonical JSON when it is appended to the log. Indeed, to verify that the body is included in the Merkle tree, the implementation currently reserializes the body using a canonical formatter. However, I think the original entry wasn't formatted using a canonical formatter (see files below).

To fix

We can't make the assumption that the body was formatted canonically, so I think the solution should be to keep the "original" base64-decoded body as bytes in the LogEntry struct and use this to verify the inclusion proof instead of reserializing it.

If you agree with the fix, I'm happy to work on it.
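The proposed fix could be sketched roughly like this (names here are illustrative assumptions, not the crate's actual API): keep the body byte-for-byte as the log appended it, and feed exactly those bytes into the RFC 6962 leaf hash `H(0x00 || data)`. The `digest` parameter stands in for SHA-256 so the sketch stays std-only.

```rust
/// Hypothetical shape for the fix: the entry keeps the base64-decoded body
/// as raw bytes, never re-parsed/re-serialized. Field and type names are
/// illustrative only, not this crate's API.
struct RawLogEntry {
    raw_body: Vec<u8>,
}

/// RFC 6962 leaf hashing: H(0x00 || data). `digest` stands in for SHA-256
/// here so the sketch has no dependencies.
fn leaf_hash(raw_body: &[u8], digest: impl Fn(&[u8]) -> Vec<u8>) -> Vec<u8> {
    let mut input = Vec::with_capacity(raw_body.len() + 1);
    input.push(0x00); // leaf domain-separation prefix
    input.extend_from_slice(raw_body);
    digest(&input)
}

fn main() {
    let entry = RawLogEntry { raw_body: b"{\"kind\":\"rekord\"}".to_vec() };
    // Identity "digest", just to show the prefixed input; real code hashes it.
    let hashed = leaf_hash(&entry.raw_body, |d| d.to_vec());
    assert_eq!(hashed[0], 0x00);
    assert_eq!(&hashed[1..], &entry.raw_body[..]);
    println!("leaf input is 0x00 || body ({} bytes)", hashed.len());
}
```

The point is that the leaf input is taken from the stored bytes, so whatever (possibly non-canonical) formatting the log originally recorded is preserved.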

To reproduce

The following code:

let rekor_config = Configuration::default();
let rt = tokio::runtime::Runtime::new().unwrap();
let log_entry = rt.block_on(get_log_entry_by_index(&rekor_config, 25579)).unwrap();
let mut encoded = vec![];
let mut ser = serde_json::Serializer::with_formatter(
    &mut encoded,
    olpc_cjson::CanonicalFormatter::new(),
);
log_entry.body.serialize(&mut ser).unwrap();
println!("{}", String::from_utf8(encoded).unwrap());

returns:

{"apiVersion":"0.0.1","kind":"rekord","spec":{"data":{"hash":{"algorithm":"sha256","value":"ce9a7c82f32194995888758cf107ef0cc52e0b8cdce73b4240658ee9e73783cb"}},"signature":{"content":"MGUCMD3oKzgsGnPAkJEXegDIsdlh4BFCQbM6jng4Sy3axY/+2tlK97oe/CkxabT1ZXUqCAIxAJDq+zLfRZZEJD5DvaKhFEu+Jm+jD4UXc3CaZp2MSajiralmtalA6fSGCXjwGfUzOw==","format":"x509","publicKey":{"content":"LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUNrakNDQWhpZ0F3SUJBZ0lVQU0rK0dYRFN5bUNPSW82YmxMMG5EZngxb21nd0NnWUlLb1pJemowRUF3TXcKS2pFVk1CTUdBMVVFQ2hNTWMybG5jM1J2Y21VdVpHVjJNUkV3RHdZRFZRUURFd2h6YVdkemRHOXlaVEFlRncweQpNVEEzTWpnd09ETTNOREphRncweU1UQTNNamd3T0RVM05ERmFNQUF3ZGpBUUJnY3Foa2pPUFFJQkJnVXJnUVFBCklnTmlBQVJjMDMrUU4vTHBrOGpqUFQwTmV5a01ucm9mMnpZUkJxNm05ei9TMXhRSzduZnZhU3grRjUrTEtwN3gKR2ExbHY2SWNvRXdwUHA2MUdsYnd5S0VQVWJLdzJrYnJyRVpPMnhKV3kxb0VEUHBYMlJqcTBYS0RZcEF5Zi9mQwoyZzJjSnVtamdnRW5NSUlCSXpBT0JnTlZIUThCQWY4RUJBTUNCNEF3RXdZRFZSMGxCQXd3Q2dZSUt3WUJCUVVICkF3TXdEQVlEVlIwVEFRSC9CQUl3QURBZEJnTlZIUTRFRmdRVTRBUDhtTkI4ejhSZFJyTVVLZ1A2Mm0xUFErd3cKSHdZRFZSMGpCQmd3Rm9BVXlNVWRBRUdhSkNreVVTVHJEYTVLN1VvRzArd3dnWTBHQ0NzR0FRVUZCd0VCQklHQQpNSDR3ZkFZSUt3WUJCUVVITUFLR2NHaDBkSEE2THk5d2NtbDJZWFJsWTJFdFkyOXVkR1Z1ZEMwMk1ETm1aVGRsCk55MHdNREF3TFRJeU1qY3RZbVkzTlMxbU5HWTFaVGd3WkRJNU5UUXVjM1J2Y21GblpTNW5iMjluYkdWaGNHbHoKTG1OdmJTOWpZVE0yWVRGbE9UWXlOREppT1daallqRTBOaTlqWVM1amNuUXdIZ1lEVlIwUkFRSC9CQlF3RW9FUQpZM1JoWkdWMVFHZHRZV2xzTG1OdmJUQUtCZ2dxaGtqT1BRUURBd05vQURCbEFqRUE3TTJwSzhRUFRrSGs1bzZ0CmdnampZdjBLV1BUajRKUTAwU3RjR0xqa1g3SU1iNC9HdXpYRkQ4czZDOEd3NmpwMEFqQW1Xa2JROTVsMzlnUGQKR2pjUjBSQURaT0dYb0NPQURwOE5lSzhBL2dKdWdnR0ZINHZYZ2l1ODJsQm5MOEZSc09jPQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg=="}}}}

whereas a direct call to the API:

curl 'https://rekor.sigstore.dev/api/v1/log/entries?logIndex=25579' | jq -r '.[ (keys_unsorted)[0] ].body' | base64 -d

returns:

{"apiVersion":"0.0.1","spec":{"data":{"hash":{"algorithm":"sha256","value":"ce9a7c82f32194995888758cf107ef0cc52e0b8cdce73b4240658ee9e73783cb"}},"signature":{"content":"MGUCMD3oKzgsGnPAkJEXegDIsdlh4BFCQbM6jng4Sy3axY/+2tlK97oe/CkxabT1ZXUqCAIxAJDq+zLfRZZEJD5DvaKhFEu+Jm+jD4UXc3CaZp2MSajiralmtalA6fSGCXjwGfUzOw==","format":"x509","publicKey":{"content":"LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUNrakNDQWhpZ0F3SUJBZ0lVQU0rK0dYRFN5bUNPSW82YmxMMG5EZngxb21nd0NnWUlLb1pJemowRUF3TXcKS2pFVk1CTUdBMVVFQ2hNTWMybG5jM1J2Y21VdVpHVjJNUkV3RHdZRFZRUURFd2h6YVdkemRHOXlaVEFlRncweQpNVEEzTWpnd09ETTNOREphRncweU1UQTNNamd3T0RVM05ERmFNQUF3ZGpBUUJnY3Foa2pPUFFJQkJnVXJnUVFBCklnTmlBQVJjMDMrUU4vTHBrOGpqUFQwTmV5a01ucm9mMnpZUkJxNm05ei9TMXhRSzduZnZhU3grRjUrTEtwN3gKR2ExbHY2SWNvRXdwUHA2MUdsYnd5S0VQVWJLdzJrYnJyRVpPMnhKV3kxb0VEUHBYMlJqcTBYS0RZcEF5Zi9mQwoyZzJjSnVtamdnRW5NSUlCSXpBT0JnTlZIUThCQWY4RUJBTUNCNEF3RXdZRFZSMGxCQXd3Q2dZSUt3WUJCUVVICkF3TXdEQVlEVlIwVEFRSC9CQUl3QURBZEJnTlZIUTRFRmdRVTRBUDhtTkI4ejhSZFJyTVVLZ1A2Mm0xUFErd3cKSHdZRFZSMGpCQmd3Rm9BVXlNVWRBRUdhSkNreVVTVHJEYTVLN1VvRzArd3dnWTBHQ0NzR0FRVUZCd0VCQklHQQpNSDR3ZkFZSUt3WUJCUVVITUFLR2NHaDBkSEE2THk5d2NtbDJZWFJsWTJFdFkyOXVkR1Z1ZEMwMk1ETm1aVGRsCk55MHdNREF3TFRJeU1qY3RZbVkzTlMxbU5HWTFaVGd3WkRJNU5UUXVjM1J2Y21GblpTNW5iMjluYkdWaGNHbHoKTG1OdmJTOWpZVE0yWVRGbE9UWXlOREppT1daallqRTBOaTlqWVM1amNuUXdIZ1lEVlIwUkFRSC9CQlF3RW9FUQpZM1JoWkdWMVFHZHRZV2xzTG1OdmJUQUtCZ2dxaGtqT1BRUURBd05vQURCbEFqRUE3TTJwSzhRUFRrSGs1bzZ0CmdnampZdjBLV1BUajRKUTAwU3RjR0xqa1g3SU1iNC9HdXpYRkQ4czZDOEd3NmpwMEFqQW1Xa2JROTVsMzlnUGQKR2pjUjBSQURaT0dYb0NPQURwOE5lSzhBL2dKdWdnR0ZINHZYZ2l1ODJsQm5MOEZSc09jPQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg=="}}},"kind":"rekord"}

Note that the "kind" field is at the end of the JSON in the API response but in second place in the canonicalized form.

@tnytown
Contributor

tnytown commented Feb 16, 2024

Thanks for the thorough triage @gaetanww!

I won't speak too much on this as I haven't tried this out myself, but I suspect that the issue you're seeing regarding canonicalization may be related to the canonicalizer being used. This implementation currently uses olpc_cjson, while Rekor uses RFC 8785 canonicalization. (OLPC canonical JSON is regrettably distinct from RFC 8785 canonical JSON.) In the sign module, we've used json-syntax to canonicalize the LogEntry body in accordance with RFC 8785 rules.

@gaetanww

gaetanww commented Feb 16, 2024

np! Thanks for the pointer, yes that could be it. I think the following code should work (using the json_syntax crate with features ["canonicalize", "serde_json"]). I won't have time to try it out today though.

pub fn verify_inclusion(&self, rekor_key: &CosignVerificationKey) -> Result<(), SigstoreError> {
    self.verification
        .inclusion_proof
        .as_ref()
        .ok_or(UnexpectedError("missing inclusion proof".to_string()))
        .and_then(|proof| {
            // Encode the body as RFC 8785 canonical JSON.
            let mut json_value =
                json_syntax::Value::from_serde_json(serde_json::to_value(&self.body)?);
            json_value.canonicalize();
            let encoded_entry = json_value.compact_print().to_string().into_bytes();
            proof.verify(&encoded_entry, rekor_key)
        })
}

@vembacher
Contributor Author

I hopefully should be able to take a closer look at this some time next week, and apply the fix/update the PR.

@gaetanww

gaetanww commented Feb 16, 2024

One more data point on the issue above: it works perfectly without modification for index 71844338; however, my hand-rolled Merkle tree verification now fails. My hand-rolled implementation just uses the base64-decoded body from the JSON, so the underlying issue might not be canonicalization.

@vembacher
Contributor Author

I haven't looked at it in a few months, but if I remember correctly there are two types of indices: one is the index in the log and the other is the index in the tree. So you might be using the incorrect index.
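To make the two indices concrete (a hypothetical sketch; the names and numbers below are illustrative assumptions, not Rekor's API): the top-level logIndex Rekor returns is global across log shards, while the index used by the Merkle proof is local to the active shard's tree, so one may need to subtract the shard's starting offset.

```rust
/// Hypothetical helper: convert a global log index to an index local to one
/// shard's tree, given where that shard starts. Illustrative names only.
fn shard_local_index(global_log_index: u64, shard_start: u64) -> Option<u64> {
    // None means the entry predates this shard and lives in an older tree.
    global_log_index.checked_sub(shard_start)
}

fn main() {
    assert_eq!(shard_local_index(1_500, 1_000), Some(500));
    assert_eq!(shard_local_index(500, 1_000), None);
    println!("ok");
}
```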

@vembacher
Contributor Author

@gaetanww is your issue resolved?

@gaetanww

Sorry for the slow response. I made it work for the latest rekor records (e.g., index 71844338), the early one (25579) still fails AFAICT.

@vembacher
Contributor Author

vembacher commented Mar 18, 2024

I did some investigating; I think your original analysis was correct, @gaetanww, and the formatting of the log entry is the issue, as both serializers produce identical outputs for me.

The Go client also uses the base64-encoded entry provided by the log; is this the correct way to handle it? I feel like this might conflict with the description of how SETs are supposed to be verified.

@gaetanww

Yes, the documentation does conflict with the actual definition. There should at least be a comment in the Go code to explain why it's implemented that way.
I think we should raise the point with the maintainers of the cosign project. I'll open an issue.

Relates to: sigstore#283

Signed-off-by: Victor Embacher <victor@embacher.xyz>
…added some functionality.

Relates to: sigstore#283

Signed-off-by: Victor Embacher <victor@embacher.xyz>
Signed-off-by: Victor Embacher <victor@embacher.xyz>
…tions.

Signed-off-by: Victor Embacher <victor@embacher.xyz>
Signed-off-by: Victor Embacher <victor@embacher.xyz>
Signed-off-by: Victor Embacher <victor@embacher.xyz>
@exFalso

exFalso commented Jun 6, 2024

This is failing on 32-bit targets because of the use of usize for bitwise arithmetic. Changing to u64 fixes it.
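For context, a sketch of why the width matters, assuming the proof-size arithmetic mirrors the transparency-dev formula (this may not be the exact code in the PR): the bit length of `index ^ (size - 1)` gives the number of "inner" proof levels, and with usize that math is only 32 bits wide on 32-bit targets, so indices or tree sizes past 2^32 overflow, while u64 behaves identically everywhere.

```rust
/// Number of "inner" levels in an RFC 6962 inclusion proof, following the
/// transparency-dev formula bits.Len64(index ^ (size - 1)). Sketch only,
/// assuming this PR's arithmetic mirrors the Go original; with u64 the
/// result is target-independent, whereas usize is 32 bits on 32-bit targets.
fn inner_proof_size(index: u64, size: u64) -> u32 {
    u64::BITS - (index ^ (size - 1)).leading_zeros()
}

fn main() {
    // Balanced 8-leaf tree: leaf 0's proof has 3 inner siblings.
    assert_eq!(inner_proof_size(0, 8), 3);
    // The rightmost leaf of a balanced tree has no inner part.
    assert_eq!(inner_proof_size(7, 8), 0);
    // Indices past 2^32 are fine with u64 arithmetic.
    assert_eq!(inner_proof_size(0, 1u64 << 40), 40);
    println!("ok");
}
```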
