blockstore: OpenReadWrite should use atomic writes #249

Open
mvdan opened this issue Sep 30, 2021 · 1 comment
Labels
help wanted Seeking public contribution on this issue P3 Low: Not priority right now

Comments

@mvdan
Contributor

mvdan commented Sep 30, 2021

Right now, OpenReadWrite writes directly to the destination file. This has multiple problems:

  • If an error occurs partway through Finalize, we may leave a corrupted file behind
  • If a write or flush fails while finalizing, we may leave a partial file behind
  • If we're resuming on an existing file and writing in place, we may corrupt the user's data (e.g. #247, "Opening car blockstore with different header damages car")

It would be much saner overall to instead write to a temporary copy next to the destination file (such as foo.car.tmp for foo.car) and, once Finalize has finished with no error, rename it onto the destination. That rename should basically never fail.

This also means that we can teach OpenReadWrite to always remove foo.car.tmp when it's discarded, to ensure we don't leave unfinished business behind.

The method described above should be doable with just a bit of os and io glue. We could use https://pkg.go.dev/github.com/google/renameio#TempFile, though I think pulling in a library for this use case is perhaps overkill. We want to prevent common errors, and we're not that concerned with atomicity between multiple processes - users should not be writing to the same CAR file concurrently.

cc @masih @willscott

@masih
Member

masih commented Sep 30, 2021

So would foo.car.tmp be a copy of foo.car while it's being written to and not finalised or discarded?

If so, I wonder what the implications would be when working with large CAR files; this would double the storage demand I think.

@BigLep BigLep added P3 Low: Not priority right now help wanted Seeking public contribution on this issue labels Jun 7, 2022
3 participants