
Confusing features, and locking semantics, of Mount/Unmount #1379

Open
mtrmac opened this issue Oct 8, 2022 · 5 comments


mtrmac commented Oct 8, 2022

Currently, both store.Mount and store.Unmount only work with the primary layer store, and lock it for writing:

storage/store.go line 2745 (at 7de6744): rlstore.Lock()
storage/store.go line 2850 (at 7de6744): rlstore.Lock()

as of e84b264.
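
To illustrate, a minimal self-contained sketch of that locking shape (LayerStore, MountOpts, and mountSketch are stand-ins for the relevant parts of the c/storage interfaces; the real store.Mount is more involved):

package sketch

// LayerStore is a stand-in for the parts of the layer store interface
// that matter here.
type LayerStore interface {
	Lock() // exclusive (write) flock on the store's lock file
	Unlock()
	Modified() (bool, error)
	Load() error
	Mount(id string, options MountOpts) (string, error)
}

type MountOpts struct{ Options []string }

// mountSketch mirrors the pattern at the lines quoted above: the primary
// store is locked for writing, even though mounting is conceptually a read.
func mountSketch(rlstore LayerStore, id string, options MountOpts) (string, error) {
	rlstore.Lock()
	defer rlstore.Unlock()
	// Reload in-memory state if another process changed the store.
	if modified, err := rlstore.Modified(); modified || err != nil {
		if err := rlstore.Load(); err != nil {
			return "", err
		}
	}
	return rlstore.Mount(id, options)
}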

Meanwhile, Diff reads all layer stores, and locks for reading:

storage/store.go line 2940 (at 7de6744): store.RLock()

as of 68d6510.

But the two are related: Diff can trigger mounts/unmounts via

storage/layers.go line 1474 (at 7de6744): path, err := r.Mount(id, drivers.MountOpts{})

(although we don’t usually see that, because several graph drivers, notably overlay, implement DiffGetter).
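
A self-contained sketch of that branch (DiffGetterDriver and needsMountForDiff are stand-in names; only the shape of the check is taken from the code above):

package sketch

// DiffGetterDriver is a stand-in for the driver-side interface: drivers
// that implement it (such as overlay) let Diff read files directly,
// without mounting the layer.
type DiffGetterDriver interface {
	DiffGetter(id string) (interface{ Close() error }, error)
}

// needsMountForDiff reports whether producing a diff would have to Mount
// the layer first, which is the only path that can trigger a mount from
// an additional (read-only) store.
func needsMountForDiff(driver interface{}) bool {
	_, hasDiffGetter := driver.(DiffGetterDriver)
	return !hasDiffGetter
}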


Note that Diff is, AFAICS, the only way to possibly trigger a mount/unmount from an additional layer store.

Meanwhile, Mount and Unmount nowadays reject attempts to operate on read-only (= additional) stores:

Mount: if !r.IsReadWrite() && !hasReadOnlyOpt(options.Options) {
Unmount (storage/layers.go line 1000 at 7de6744): if !r.IsReadWrite() {

as of 0183a29.

More curiously, Mount has an exception for read-only stores as of 0ef62ee (sketched below), but:

  • Unmount doesn’t have a corresponding exception
  • The only caller that could possibly benefit, Diff, does not set the “read-only” option, so it doesn’t fit that exception at all.
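
For reference, a sketch of what the exception amounts to; the check itself is quoted from layers.go above, while hasReadOnlyOpt testing for a literal "ro" option is an assumption:

package sketch

import "errors"

// hasReadOnlyOpt is assumed to test for a "ro" mount option; treat the
// details as illustrative, not as the real helper.
func hasReadOnlyOpt(opts []string) bool {
	for _, o := range opts {
		if o == "ro" { // assumed spelling of the read-only mount option
			return true
		}
	}
	return false
}

// mountCheck mirrors the quoted check: a read-only (additional) store is
// rejected unless the caller explicitly asked for a read-only mount.
func mountCheck(isReadWrite bool, options []string) error {
	if !isReadWrite && !hasReadOnlyOpt(options) {
		return errors.New("not allowed to update mount locations for layers in a read-only store")
	}
	return nil
}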

Questions:

  • Does mounting require the layer store to be locked read-only or read-write? Either Diff is incorrect, or Mount/Unmount is locking excessively. At least 0183a29 suggests that the intent was to make read-only layer store locks sufficient.
  • Can the “allow read-only mounts from read-only stores” exception be reached in the current codebase? And if not, is it desirable for it to be reachable?

@nalind @vrothberg @rhatdan ?


mtrmac commented Oct 14, 2022

Looking into this further, any mount/unmount call can modify layerStore-owned Layer.MountCount, which requires (per #1332 ) an in-process write lock against other readers. So at least the fully read-only access of Diff is insufficient (but it might be plausible to only hold an in-process exclusive lock, while keeping the filesystem locked for reading only).
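
To make the race concrete, a minimal sketch; sync.RWMutex stands in for whatever in-process lock the store would use, and MountCount mirrors the field on Layer:

package sketch

import "sync"

// Even a conceptual "read" that mounts a layer increments MountCount,
// i.e. it writes shared in-process state and needs a write lock against
// concurrent readers.
type layer struct {
	MountCount int
}

type layerStoreSketch struct {
	mu     sync.RWMutex // in-process lock; the cross-process flock is separate
	layers map[string]*layer
}

func (s *layerStoreSketch) mount(id string) {
	s.mu.Lock() // write lock required: MountCount is modified
	defer s.mu.Unlock()
	s.layers[id].MountCount++
	// ... perform the actual mount ...
}

func (s *layerStoreSketch) mountCount(id string) int {
	s.mu.RLock() // a read lock suffices only for operations that never mount
	defer s.mu.RUnlock()
	return s.layers[id].MountCount
}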


mtrmac commented Nov 12, 2022

Note, to self and others: Carefully distinguish

  • read-only vs. read-write stores: lockfile.IsReadWrite is true for the primary layer store, false for additional stores. This is constant for the lifetime of the process (the bit is stored in the per-process lock file… so, at least, for the lifetime of the part of the process that uses the same c/storage vendored codebase).
  • read-only vs. read-write cross-process access: Whether the lock file is locked for reading or writing (always read-only for additional stores). Or perhaps whether the mount lock file is locked for reading or writing.
  • read-only vs. read-write in-process access, governing data races when accessing memory.

It’s just way too much to keep track of, especially without explicit documentation and language help (#1389 ).
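
As a memory aid, the three notions map onto three different mechanisms; every name below is a stand-in, not the real c/storage API:

package sketch

import "sync"

type fileLock struct{}

func (l *fileLock) RLock() {} // shared (read) flock, cross-process
func (l *fileLock) Lock()  {} // exclusive (write) flock, cross-process

type storeSketch struct {
	isReadWrite bool         // 1. store kind: constant per process; always false for additional stores
	storeLock   fileLock     // 2. cross-process access to the store's data
	mountsLock  fileLock     // 2. (possibly separate) cross-process access to mount bookkeeping
	mu          sync.RWMutex // 3. in-process access, guarding in-memory state against data races
}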

mtrmac added a commit to mtrmac/storage that referenced this issue Nov 14, 2022
This should be fixed, it just seems too hard to do without
breaking API (and performance).

So, just be clear about that to warn future readers.

It's tracked in containers#1379 .

Signed-off-by: Miloslav Trmač <mitr@redhat.com>

mtrmac commented Nov 14, 2022

Quoting my earlier comment: “any mount/unmount call can modify layerStore-owned Layer.MountCount, which requires (per #1332 ) an in-process write lock against other readers.”

#1346 adds an in-process lock to synchronize these accesses (and eliminates some of the other accesses); but that doesn’t fix the Diff paths, which are insufficiently synchronized.

(Nothing at all changes about the unclear ability to mount layers from additional stores.)


rhatdan commented Nov 15, 2022

Podman mount and unmount attempt to count how many times an image is mounted; hence the count.
If I run

# mnt=$(podman mount alpine)
# podman run -d alpine sleep 200

This should end up with 2 mounts of the image.
When the sleep completes, it will go to 1.
And if I do

# podman unmount alpine

Down to zero. And the overlay mount can actually be unmounted.
This count can get out of whack if apps crash or the system crashes without decrementing the mount count to 0. (Unless the mount count is done on a tmpfs?)

I don't believe this is a solved problem, and I think we have some flakes in the Podman CI system caused by miscounting the mounts.
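
A sketch of that counting scheme under the assumptions above (the count lives under RunDir, a tmpfs, so it resets to zero on reboot; the file layout and the bumpMountCount name are hypothetical):

package sketch

import (
	"os"
	"path/filepath"
	"strconv"
)

// A crash between mount and unmount leaves the count out of sync until
// the next boot wipes the tmpfs, which matches the failure mode
// described above.
func bumpMountCount(runDir, layerID string, delta int) (int, error) {
	p := filepath.Join(runDir, layerID+".mountcount") // hypothetical layout
	count := 0
	if data, err := os.ReadFile(p); err == nil {
		count, _ = strconv.Atoi(string(data))
	}
	count += delta
	if count < 0 {
		count = 0 // e.g. after a crash left a stale count
	}
	if err := os.WriteFile(p, []byte(strconv.Itoa(count)), 0o644); err != nil {
		return 0, err
	}
	return count, nil
}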


mtrmac commented Nov 15, 2022

I’m not at all arguing against the existence of the mount count (and yes, it is stored in the “RunDir”).

This issue is primarily about the ability / inability / correctness of mounting layers from additional stores, and about the correctness of the concurrency implementation.
