Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Create an “incomplete” layer record prior to invoking the graph driver #1139

Closed
mtrmac opened this issue Feb 17, 2022 · 3 comments
Closed

Comments

@mtrmac
Copy link
Collaborator

mtrmac commented Feb 17, 2022

Currently, layerStore.Put first invokes the graph driver to create a layer, and only afterwards it creates a layers.json entry (possibly an “incomplete” one if the layer is created from a tarball).

So, crashing/aborting layerStore.Put before a layers.json entry is committed to disk can result in a (possibly incomplete) graph driver layer object that we don’t have any record of, and we might well tell the graph driver to create a layer with the same ID (which can then result in unexpected situations).

With Kubernetes-colored glasses on, it seems elegant to me to have a “single source of truth”, so that if layers.json and the graph driver state get out of sync, we only ever do corrections in one direction. Considering the existing incompleteFlag logic that treats layers.json as authoritative (and also the fact that layers.json writing is taking more care to be atomic than the overlay graph driver, at least), it seems natural to do the same in layerStore.Put. so:

Proposal: In layerStore.Put, always create a Layer object with incompleteFlag, save it to layers.json, and only afterwards invoke the graph driver to create the layer.

This would be costless for layers created from a tarball (we would only change the timing of the save of the incomplete data), and add one more layerStore.Save (and a fdatasync) for read/write layers.

If creating the layer is aborted for any reason, the incompleteFlag set would cause the layer to be cleaned up in the future (but note #1136 ).

Driver.Remove implementations would still need to be fairly resilient against incomplete/corrupt layers, but that’s, to an extent, already the case today.


Alternatively, we could require Driver.Create* to be idempotent / resilient against pre-existing/partial layers with the same ID.

@mtrmac
Copy link
Collaborator Author

mtrmac commented Feb 17, 2022

Alternatively, we could require Driver.Create* to be idempotent / resilient against pre-existing/partial layers with the same ID.

#1140 is a step in that direction.

@mtrmac
Copy link
Collaborator Author

mtrmac commented Feb 22, 2022

… and #1148 implements the suggestion from this issue.

I tend to think that that #1140 is a short-term backportable workaround, and #1148 is closer to a long-term fix, but I really don’t know what I don’t know, so I’d love the experts to review and decide.

@mtrmac
Copy link
Collaborator Author

mtrmac commented May 2, 2022

#1148 was merged.

@mtrmac mtrmac closed this as completed May 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant