fix: fix fetching updated targets from TUF root #1921

Merged
merged 11 commits into main from fix-tuf-update on May 31, 2022

@asraa asraa commented May 25, 2022

Signed-off-by: Asra Ali asraa@google.com

Summary

This change refactors the cosign TUF client to simplify the logic behind the embedded TUF repository and targets and the writeable on-disk/in-memory repository and targets. Roughly, the cosign TUF client now contains (1) a client.LocalStore to hold TUF repository metadata updates and (2) a targetImpl to hold downloaded and cached target files.

Previously, the two were "out of sync" -- when starting with an embedded workflow, we would create an embedded root repository and an embedded targetImpl. However, the embedded targetImpl ONLY retrieved embedded targets, not the updated ones!

b, err := embeddedRootRepo.ReadFile(path.Join("repository", "targets", p))

Embedded workflows never used their updated targets, despite writing them to the underlying storage, and failed to retrieve new targets referenced by updated metadata.

Conceptually now, there are 4 main objects in the TUF client:

  1. A memoryCache targetImpl: A map that stores target files by name.
  2. A diskCache targetImpl: Stores targets on disk.
  3. An embeddedWrapper targetImpl: Wraps the underlying (memory or disk) cache. By default, Get returns embedded targets. Once any target has been downloaded and Set, Get switches to retrieving from the underlying cache.
  4. An embeddedLocalStore that wraps either a MemoryLocalStore or a FileLocalStore. Similarly, by default it returns the embedded repo metadata until new metadata has been downloaded and set; after that, GetMeta operations return the cached metadata.
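
To make the wrapper behaviour in item 3 concrete, here is a minimal sketch. It is not the code from this PR: the targetStore interface, the written flag, and the fulcio.crt.pem target name are assumptions made for illustration, and a fake in-memory FS stands in for the real go:embed repository. It only shows Get reading embedded targets until the first Set, then delegating to the cache.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"path"
	"testing/fstest"
)

// targetStore is a hypothetical stand-in for the PR's targetImpl interface.
type targetStore interface {
	Get(name string) ([]byte, error)
	Set(name string, b []byte) error
}

// memoryCache keeps targets in a map keyed by target name.
type memoryCache struct {
	targets map[string][]byte
}

func (m *memoryCache) Get(name string) ([]byte, error) {
	b, ok := m.targets[name]
	if !ok {
		return nil, errors.New("target not found in cache: " + name)
	}
	return b, nil
}

func (m *memoryCache) Set(name string, b []byte) error {
	if m.targets == nil {
		m.targets = map[string][]byte{}
	}
	m.targets[name] = b
	return nil
}

// embeddedWrapper serves embedded targets until the first Set; after that,
// every Get is answered by the writeable cache underneath.
type embeddedWrapper struct {
	embedded fs.FS
	cache    targetStore
	written  bool // assumption: a simple flag marks the switch-over
}

func (e *embeddedWrapper) Get(name string) ([]byte, error) {
	if e.written {
		return e.cache.Get(name)
	}
	return fs.ReadFile(e.embedded, path.Join("repository", "targets", name))
}

func (e *embeddedWrapper) Set(name string, b []byte) error {
	e.written = true
	return e.cache.Set(name, b)
}

// demo reads a target before and after an update has been written.
func demo() (string, string) {
	embedded := fstest.MapFS{
		"repository/targets/fulcio.crt.pem": {Data: []byte("embedded")},
	}
	w := &embeddedWrapper{embedded: embedded, cache: &memoryCache{}}
	before, _ := w.Get("fulcio.crt.pem")
	_ = w.Set("fulcio.crt.pem", []byte("updated"))
	after, _ := w.Get("fulcio.crt.pem")
	return string(before), string(after)
}

func main() {
	before, after := demo()
	fmt.Println(before, after)
}
```

Note that this sketch does not copy untouched embedded targets into the cache on the first Set, so a real implementation would need to handle targets that were never updated.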

Other:

  • To make testing easier, interfaced the default embedded repo and default remote mirror.
  • Tested the addition of a new target in an update. The same test at HEAD fails to find the target.
  • I tested this with a cosign binary whose default mirror pointed at brokenv3's GCS bucket, and it successfully found the new target fulcio_interemediate_v1.crt.pem.

If this design looks good, I will continue to add tests and clean up the test code. It's a pain to manually write out TUF updates, so I'd like to unify those functions and also add testing for consistent snapshotting (note that brokenv3 enabled that, so I'm 90% sure it works with this fix).

Ticket Link

Fixes #1899

Release Note

* bug fix: Fixes bug in retrieving newly added verification material from TUF root.

cc @haydentherapper @dlorenc @znewman01

codecov-commenter commented May 25, 2022

Codecov Report

Merging #1921 (e7bcb69) into main (5f09c42) will decrease coverage by 0.01%.
The diff coverage is 59.09%.

@@            Coverage Diff             @@
##             main    #1921      +/-   ##
==========================================
- Coverage   34.01%   34.00%   -0.02%     
==========================================
  Files         153      153              
  Lines        9977     9981       +4     
==========================================
  Hits         3394     3394              
- Misses       6202     6208       +6     
+ Partials      381      379       -2     
Impacted Files                                      Coverage Δ
cmd/cosign/cli/fulcio/fulcioroots/fulcioroots.go    36.36% <0.00%> (ø)
pkg/cosign/tlog.go                                  29.12% <18.18%> (-0.59%) ⬇️
pkg/cosign/tuf/client.go                            62.80% <67.88%> (+1.12%) ⬆️
pkg/cosign/tuf/store.go                             73.33% <0.00%> (-6.67%) ⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@asraa asraa force-pushed the fix-tuf-update branch 3 times, most recently from 447e794 to 39cf2af Compare May 25, 2022 18:15
Signed-off-by: Asra Ali <asraa@google.com>

add comment

Signed-off-by: Asra Ali <asraa@google.com>

update

Signed-off-by: Asra Ali <asraa@google.com>

update

Signed-off-by: Asra Ali <asraa@google.com>

possible fix windows

Signed-off-by: Asra Ali <asraa@google.com>

lint

Signed-off-by: Asra Ali <asraa@google.com>

fix windows maybe

Signed-off-by: Asra Ali <asraa@google.com>

fix close

Signed-off-by: Asra Ali <asraa@google.com>

@znewman01 znewman01 left a comment

I'm sure you saw this, but test failure/lint failure looks related.

Also, your PR description is great---can you copy a bunch of it inline?

Release note is a little inappropriate for end users: I think "TUF client" is an implementation detail.

Set(string, []byte) error
type localInitFunc func() (client.LocalStore, error)

type embeddedLocalStore struct {
Contributor

Naming nitpick: embeddedLocalStore makes me think that it's only an embedded local store.

Above all applies to embeddedWrapper too.

}

type memoryCache struct {
targets map[string][]byte
var GetEmbedded = func() fs.FS {
Contributor

Why do we call GetEmbedded each time, rather than locking it in at init time?

Contributor

And then if we're going to do that, we may as well make it a parameter of the init method rather than something that needs to be mocked.

Contributor Author

Hmmmmm I don't want to lock it in because tests need to modify that on an individual basis

Otherwise I would need to pre-hook test runs, I think. But correct me if I'm wrong about that.

Contributor

Err, doesn't each test instantiate its own TUF client (via NewFromEnv)? You should be able to move the GetEmbedded call into initializeTUF or wrapEmbedded/wrapEmbeddedLocal just the first time. I also don't think it's that bad to have a separate constructor that doesn't access any environment variables that you could explicitly pass the fake FS into for tests.

Contributor Author

@asraa asraa May 25, 2022

Oh! I misunderstood what you meant by "init time" -- I thought that meant Golang "init" which happens once globally.

I like the initialize idea.

Contributor

I think I'd rather pass this in to the constructor explicitly
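
For reference, the constructor-injection idea discussed in this thread could look roughly like the sketch below. The tufClient type, newClient, rootJSON, and the repository/root.json path are all hypothetical names chosen for illustration, not the API of this package. The point is only that the embedded fs.FS becomes an explicit argument instead of a mockable package-level function.

```go
package main

import (
	"fmt"
	"io/fs"
	"testing/fstest"
)

// tufClient is a hypothetical, heavily reduced stand-in for the cosign TUF client.
type tufClient struct {
	embedded fs.FS
}

// newClient takes the embedded repository as an explicit argument, so
// production code can pass the real go:embed FS and tests can pass a fake one.
func newClient(embedded fs.FS) *tufClient {
	return &tufClient{embedded: embedded}
}

// rootJSON reads the trusted root from whatever FS was injected.
// The path is an assumption about the embedded layout.
func (c *tufClient) rootJSON() ([]byte, error) {
	return fs.ReadFile(c.embedded, "repository/root.json")
}

// readInjectedRoot shows what a test would do: build a fake FS and inject it.
func readInjectedRoot() (string, error) {
	fake := fstest.MapFS{
		"repository/root.json": {Data: []byte(`{"signed":{}}`)},
	}
	b, err := newClient(fake).rootJSON()
	return string(b), err
}

func main() {
	root, err := readInjectedRoot()
	fmt.Println(root, err)
}
```

With this shape, each test owns its fake FS, so no global needs to be swapped and restored between tests.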

return nil, errors.New("fs.ReadFileFS unimplemented for embedded repo")
}
for _, entry := range entries {
if entry.Type().IsRegular() {
Contributor

Why does IsRegular matter? Is this just skipping

I might prefer

for _, entry := range entries {
  if !entry.Type().IsRegular() {
    continue  // skip bad files <-- but be more descriptive
  } 
  // ...
}

as it emphasizes the skipping.

Contributor Author

Because this is the metadata store, I need to avoid copying over the target directory -- that would have ModeDir set. Also just in case there's executables or something funky. I like your skipping paradigm!
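
A small runnable sketch of that skipping pattern, under the assumption that metadata files sit directly under a repository directory next to a targets subdirectory (the metadataFiles helper and the file names are made up for the example):

```go
package main

import (
	"fmt"
	"io/fs"
	"testing/fstest"
)

// metadataFiles returns the names of the regular files directly under dir,
// skipping directories (such as targets/) and any other non-regular entries.
func metadataFiles(fsys fs.FS, dir string) ([]string, error) {
	entries, err := fs.ReadDir(fsys, dir)
	if err != nil {
		return nil, err
	}
	var names []string
	for _, entry := range entries {
		if !entry.Type().IsRegular() {
			continue // not a plain file, e.g. the targets directory
		}
		names = append(names, entry.Name())
	}
	return names, nil
}

// demoMetadataFiles lists a fake repository that contains two metadata
// files and one target nested inside a targets directory.
func demoMetadataFiles() []string {
	repo := fstest.MapFS{
		"repository/root.json":         {Data: []byte("{}")},
		"repository/targets.json":      {Data: []byte("{}")},
		"repository/targets/rekor.pub": {Data: []byte("key")},
	}
	names, err := metadataFiles(repo, "repository")
	if err != nil {
		return nil
	}
	return names
}

func main() {
	fmt.Println(demoMetadataFiles())
}
```

Here the targets directory is reported with ModeDir set, so only root.json and targets.json are returned.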


asraa commented May 25, 2022

Thanks @znewman01 for the really helpful comments! Addressed most of them I think, closing out trivial ones -- still trying to figure out the failures with the policy...

@asraa asraa force-pushed the fix-tuf-update branch 3 times, most recently from 6794e51 to dccbdeb Compare May 26, 2022 14:34
Signed-off-by: Asra Ali <asraa@google.com>

update fix

Signed-off-by: Asra Ali <asraa@google.com>

update and add some debug

Signed-off-by: Asra Ali <asraa@google.com>

add debug

Signed-off-by: Asra Ali <asraa@google.com>

 no cache

Signed-off-by: Asra Ali <asraa@google.com>

remove debug

Signed-off-by: Asra Ali <asraa@google.com>
@asraa asraa force-pushed the fix-tuf-update branch 2 times, most recently from b16f38d to 9b337a5 Compare May 26, 2022 16:46
@@ -44,6 +46,10 @@ const (
SigstoreNoCache = "SIGSTORE_NO_CACHE"
)

var GetRemoteRoot = func() string {
Contributor

Does this need to be publicly exported, or can it be a private variable in the package?

Contributor Author

I kind of want to make this a public global flag eventually! That would make it easy to point to preprod/staging environment. I can make it a follow up issue if you think so

As long as it's a func for now I'm fine, because I need to configure it in the tests.

Contributor Author

adding this to the tracking issue explicitly #1935

func (m *memoryCache) Set(p string, b []byte) error {
if m.targets == nil {
m.targets = map[string][]byte{}
func (e embeddedLocalStore) GetMeta() (map[string]json.RawMessage, error) {
Contributor

Question: How do we use the embedded metadata (as in, what's shipped with Cosign)? Isn't it only to start the chain of TUF roots?

Would it be reasonable to only store root.json, and remove the rest? Or is go-tuf expecting a complete repository? (I thought there was support now for initializing only from a root)

Contributor

I'm just wondering if we can simplify the logic around embedded wrapping. We have this tri-state: either it's in the embedded repo, or it's in memory or on disk. If we only access a file or two from embedded, just for initialization, we could potentially separate embedded from LocalStore.

Contributor Author

go-tuf is not expecting a complete repository, but we don't want to touch the remote unless we need an update so we want the complete repository.

it's to act as a starting local state, i guess.

Contributor

The timestamp will almost always be expired, so won't we always need to hit the remote?

Contributor Author

almost always yeah, fwiw i'm willing to reconsider this in general and simplify our embedded repo state to only contain the root.json because that would align a lot more closely with normal go-tuf flows and avoid all this complex overlaid logic on whether or not we need to hit the remote...

but yeah the original idea was that if the state of the embedded repo was good we wouldn't fetch. at that point the timestamp had a fairly long expiration -- 3 weeks or so

Contributor

As we're likely going to be moving to more frequent updates, I'm not sure the embedded non-root metadata adds much value. Do you think it would simplify the logic here? Would it remove the embeddedWrappers completely?

If you want to save a bit of remote fetching, is it possible to initialize just the LocalStore containing the targets? Then what's embedded would just be the root and targets. When we fetch the new metadata and compare target file hashes, go-tuf would find that all are locally downloaded. But this is only saving a single initial fetch, so that might not be worth it.

Contributor Author

I tried this solution! Only the root is in the embedded, but we also embed targets and use them ONLY IF their hashes are equal to what we see in the targets.json.

Signed-off-by: Asra Ali <asraa@google.com>

znewman01 previously approved these changes May 31, 2022

@znewman01 znewman01 left a comment

I think once we backport the test fixes from #1932 this is pretty close

}

type memoryCache struct {
targets map[string][]byte
var GetEmbedded = func() fs.FS {
Contributor

I think I'd rather pass this in to the constructor explicitly

// Unfortunately go:embed appears to somehow replace our line endings on windows, we need to switch them back.
// It should theoretically be safe to do this everywhere - but the files only seem to get mutated on Windows so
// let's only change them back there.
if runtime.GOOS == "windows" {
Contributor

What happened to this?

Contributor Author

Added!
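
For readers following along, the Windows workaround described in the code comment above might be sketched as follows. This is an assumption about its shape, not the code that was re-added: normalizeLineEndings is a hypothetical helper, and the OS name is passed as a parameter (instead of reading runtime.GOOS directly) only so the sketch can be exercised on any platform.

```go
package main

import (
	"bytes"
	"fmt"
)

// normalizeLineEndings rewrites CRLF back to LF, but only when goos is
// "windows", mirroring the "only change them back there" note in the comment.
// Presumably this matters because the file bytes must match what was signed.
func normalizeLineEndings(b []byte, goos string) []byte {
	if goos != "windows" {
		return b
	}
	return bytes.ReplaceAll(b, []byte("\r\n"), []byte("\n"))
}

// normalizedString is a small convenience wrapper for the demo below.
func normalizedString(s, goos string) string {
	return string(normalizeLineEndings([]byte(s), goos))
}

func main() {
	fmt.Printf("%q\n", normalizedString("{\r\n}\r\n", "windows"))
	fmt.Printf("%q\n", normalizedString("{\r\n}\r\n", "linux"))
}
```

A caller in real code would pass runtime.GOOS as the second argument.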

Signed-off-by: Ville Aikas <vaikas@chainguard.dev>
Signed-off-by: Ville Aikas <vaikas@chainguard.dev>
Signed-off-by: Ville Aikas <vaikas@chainguard.dev>
Signed-off-by: Ville Aikas <vaikas@chainguard.dev>
Signed-off-by: Ville Aikas <vaikas@chainguard.dev>
Signed-off-by: Ville Aikas <vaikas@chainguard.dev>
asraa added 2 commits May 31, 2022 10:00
Signed-off-by: Asra Ali <asraa@google.com>
Signed-off-by: Asra Ali <asraa@google.com>

asraa commented May 31, 2022

I think I'd rather pass this in to the constructor explicitly

done, but also added this to the tracking issue for cleanup

@haydentherapper haydentherapper left a comment

This looks great!

@@ -44,12 +46,21 @@ const (
SigstoreNoCache = "SIGSTORE_NO_CACHE"
)

// Global in-memory targets to avoid re-downloading when there is no local cache.
// TODO: Consider using this map even when local caching to avoid reading from disk
Contributor

i like this idea as a future improvement! on initialize, just load in all targets into memory

Contributor Author

Explicitly adding in #1935

return nil, err
}
t.targets = newFileImpl()
t.local, err = newLocalStore()
Contributor

To clarify - On the first initialization, newLocalStore() will set up a local TUF DB, but without any metadata or targets in it, correct?

Contributor Author

Correct, the first place where the local is populated is with the InitLocal method, and that sets the trusted root metadata.

if root == nil {
	root, err = getRoot(trustedMeta, t.embedded)
	if err != nil {
		t.Close()
		return nil, fmt.Errorf("getting trusted root: %w", err)
	}
}
if err := t.client.InitLocal(root); err != nil {
	t.Close()
	return nil, fmt.Errorf("unable to initialize client, local cache may be corrupt: %w", err)
}

go-tuf client calls populate the timestamp/snapshot/targets on Update()

}
if err := os.WriteFile(cachedRemote(rootCacheDir()), b, 0600); err != nil {
return fmt.Errorf("storing remote: %w", err)
if err := util.TargetFileMetaEqual(localMeta, validMeta); err != nil {
Contributor

To also double check - This would catch if the embedded targets are modified for some reason, correct? So to verify the e2e flow:

  • Cosign requests the target metadata (targets.json) from the remote
  • The metadata's sig is verified using the target role from the root (which has already been updated and verified)
  • For each target in targets.json, we check if the target needs to be downloaded or is present in the embedded/local cache
  • For each target, we compare its hash to the hash from the metadata (this step)

Contributor Author

@asraa asraa May 31, 2022

Yes! Mostly correct, except slight details in Target retrieval/download from

func maybeDownloadRemoteTarget(name string, meta data.TargetFileMeta, t *TUF) error {

  • Cosign requests the target metadata (targets.json) from the remote
  • The metadata's sig is verified using the target role from the root (which has already been updated and verified)
  • For each target in targets.json, we check: (1) if we already have the target in our local cache and its hash is valid, do nothing; (2) if the target is present in the embedded/local cache, we compare its hash against the metadata and copy it to the cache for ease of retrieval; or (3) otherwise it must be downloaded (the hash is verified in the go-tuf call), and we copy it to the cache
  • On Get, we retrieve the target and compare its hash to the hash from the metadata (this step)
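
The three cases in the third bullet can be sketched as below. This is an illustration only, not maybeDownloadRemoteTarget itself: resolveTarget, hashOK, the plain maps, and the download callback are hypothetical, and a single SHA-256 digest stands in for the full target file metadata. In the real flow the download hash check happens inside go-tuf; here it is written out explicitly.

```go
package main

import (
	"crypto/sha256"
	"errors"
	"fmt"
)

// hashOK reports whether b matches the SHA-256 digest recorded in the metadata.
func hashOK(b []byte, want [sha256.Size]byte) bool {
	return sha256.Sum256(b) == want
}

// resolveTarget makes sure a valid copy of the target is in the cache and
// reports where it came from: "cache", "embedded", or "remote".
func resolveTarget(name string, want [sha256.Size]byte, cache, embedded map[string][]byte,
	download func(string) ([]byte, error)) (string, error) {
	// (1) Already cached with a valid hash: nothing to do.
	if b, ok := cache[name]; ok && hashOK(b, want) {
		return "cache", nil
	}
	// (2) Embedded copy matches the metadata: copy it into the cache.
	if b, ok := embedded[name]; ok && hashOK(b, want) {
		cache[name] = b
		return "embedded", nil
	}
	// (3) Otherwise download, verify, and cache.
	b, err := download(name)
	if err != nil {
		return "", fmt.Errorf("downloading target: %w", err)
	}
	if !hashOK(b, want) {
		return "", errors.New("downloaded target does not match metadata hash")
	}
	cache[name] = b
	return "remote", nil
}

// demoResolve walks one target through the embedded and cache paths and a
// second, stale target through the remote path.
func demoResolve() (string, string, string) {
	cache := map[string][]byte{}
	embedded := map[string][]byte{"a.pem": []byte("v1"), "b.pem": []byte("v1")}
	download := func(string) ([]byte, error) { return []byte("v2"), nil }

	first, _ := resolveTarget("a.pem", sha256.Sum256([]byte("v1")), cache, embedded, download)
	second, _ := resolveTarget("a.pem", sha256.Sum256([]byte("v1")), cache, embedded, download)
	third, _ := resolveTarget("b.pem", sha256.Sum256([]byte("v2")), cache, embedded, download)
	return first, second, third
}

func main() {
	fmt.Println(demoResolve())
}
```

In the demo, a.pem is first served from the embedded copy and then from the cache, while b.pem has a stale embedded copy and falls through to the download path.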

dest := targetDestination{buf: &w}
if err := t.client.Download(name, &dest); err != nil {
return fmt.Errorf("downloading target: %w", err)
}
Contributor

Do we need to also check isValidTarget, or will Download handle checking the target hash? (If Download doesn't, it should!)

Contributor Author

Correct, Download does check the target hash!
https://github.com/theupdateframework/go-tuf/blob/3b26aedfe985198bc88a9dda7525938c575ca046/client/client.go#L928-L934

I also put hash checking on the Get/retrieval side, just for robustness:

validMeta, err := t.client.Target(name)
if err != nil {
	return nil, fmt.Errorf("error verifying local metadata; local cache may be corrupt: %w", err)
}
targetBytes, err := t.targets.Get(name)
if err != nil {
	return nil, err
}
if !isValidTarget(targetBytes, validMeta) {
	return nil, fmt.Errorf("cache contains invalid target; local cache may be corrupt")
}


asraa commented May 31, 2022

cc @dlorenc I think we're all in agreement that there's follow-up work here, tracked in #1935, to be addressed as soon as this is merged. I'll be taking care of the items one by one starting this afternoon and handing off to @vaikas for more testing of the policy controller.


dlorenc commented May 31, 2022

Yay! Thanks everyone!


asraa commented May 31, 2022

Yay! Thanks everyone!

@dlorenc please can I have a write access review :)

Successfully merging this pull request may close these issues.

Bug: Investigate issue in TUF root upgrades
6 participants