Large number of FS mounts left behind (approx 49k) #6851
-
Overlay mounts are managed by the baggageclaim component on the worker: https://github.com/concourse/baggageclaim/ Everything under
-
I checked multiple volumes from the disconnected node (Mar 18, Mar 27, and Apr 6), and none of them exist in the Concourse DB. However, the 49k+ mounts still exist on that node. I will check the volumes on the other live worker nodes once the count increases and see what happens to them.

Also, do "fly -t t1 volumes" and "fly -t t2 volumes" return only the volumes associated with the specified target, or all volumes? I get different results with different targets: about 2.5k volumes for one and about 2.9k for the other. When I diff the two outputs, there are about 1.8k differing lines, so it seems some volumes are reported for both targets and some are not.

Note that about 47 pipelines are configured on this instance by various team members, and when I look at the main Concourse dashboard, none of them seem to be running (no boxes blinking). Perhaps some are doing a lot of periodic checks. I will focus on debugging the ones I created to see if I can find the root cause.
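A line-based diff count does not directly say how many volumes the two targets share. A set comparison on the volume handles is easier to read. The following is a minimal sketch, assuming the output of each "fly volumes" command has been saved to a text file and that the handle is the first whitespace-separated column; the function names are hypothetical and not part of fly.

```python
def volume_handles(fly_output: str) -> set[str]:
    """Collect the first column of each non-empty line of saved `fly volumes` output."""
    handles = set()
    for line in fly_output.splitlines():
        fields = line.split()
        # Assumption: the handle is the first column; skip a header row if one is present.
        if fields and fields[0].lower() != "handle":
            handles.add(fields[0])
    return handles


def compare_targets(output_a: str, output_b: str) -> dict[str, set[str]]:
    """Split the handles into those seen for both targets and those seen for only one."""
    a = volume_handles(output_a)
    b = volume_handles(output_b)
    return {"common": a & b, "only_a": a - b, "only_b": b - a}


# Usage (file names are placeholders):
#   with open("t1-volumes.txt") as fa, open("t2-volumes.txt") as fb:
#       result = compare_targets(fa.read(), fb.read())
#   print({name: len(handles) for name, handles in result.items()})
```

If "only_a" and "only_b" are both large, the two targets are probably reporting different sets of volumes rather than one shared list.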
-
I was looking at one of the worker nodes that is not connected to the Concourse web node. It has about 49k live mounts, and if I try to reconnect it as a worker node, it fails with an "out of disk-space" error message. But when I check the various partitions, they all seem to have plenty of free space. See below:
[ scripts]$ sudo concourse worker 1> logs/run-concourse-worker-sjc-devx-u3021.log 2> logs/run-concourse-worker-sjc-devx-u3021.err
[ scripts]$ cat logs/run-concourse-worker-sjc-devx-u3021.log
Any hints on how to find where it is running out of disk space?
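One possibility, not confirmed anywhere in this thread, is that the filesystem has free blocks but no free inodes; the kernel reports that condition with the same "no space left on device" error. The following is a minimal sketch that reports both figures for whichever filesystem holds a given path, using os.statvfs from the Python standard library. The function name and the example path are hypothetical.

```python
import os


def space_report(path: str) -> dict[str, float]:
    """Report free bytes and free inodes for the filesystem that contains `path`."""
    st = os.statvfs(path)

    def pct_free(avail: int, total: int) -> float:
        # Some filesystems report 0 total inodes; avoid dividing by zero.
        return 100.0 * avail / total if total else float("nan")

    return {
        "bytes_free": st.f_bavail * st.f_frsize,
        "blocks_free_pct": pct_free(st.f_bavail, st.f_blocks),
        "inodes_free": st.f_favail,
        "inodes_free_pct": pct_free(st.f_favail, st.f_files),
    }


# Usage (the path is a placeholder for the worker's work directory):
#   print(space_report("/path/to/worker/work-dir"))
```

Running it against the worker's actual work directory, rather than against each partition's mount point, also confirms which filesystem the worker is really writing to.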
-
I have a similar problem:
Worker does not start anymore:
Running Concourse CI v7.11.0 with
This seems to make Docker commands very slow as they have to go through all these overlays. Also
-
I am using Concourse 7.1.0 on RHEL8. It seems that a large number (~49k) of overlay FS mounts are created on the worker node and NOT cleaned up. Apparently, this causes the Linux box to hang, and a reboot is required to make it usable again. I have enabled use of "containerd".
I am sure I must have misconfigured something to cause this. How can I go about debugging this?
Example:
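One way to put a number on the leftover mounts described in this thread is to count the entries in /proc/mounts by filesystem type and track that count over time. The following is a minimal sketch that is not taken from the thread; it relies only on the standard /proc/mounts layout, where the second field is the mount point and the third is the filesystem type. The function names are hypothetical.

```python
from collections import Counter


def count_mounts(mounts_text: str) -> Counter:
    """Count mount entries by filesystem type (third field of each /proc/mounts line)."""
    counts = Counter()
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            counts[fields[2]] += 1
    return counts


def overlay_mount_points(mounts_text: str) -> list[str]:
    """Return the mount points (second field) of all entries whose type is 'overlay'."""
    points = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[2] == "overlay":
            points.append(fields[1])
    return points


# Usage on a Linux worker node:
#   with open("/proc/mounts") as fh:
#       text = fh.read()
#   print(count_mounts(text).most_common(5))
#   print(overlay_mount_points(text)[:10])
```

Sampling the overlay count periodically on a live worker shows whether it grows steadily or only after particular builds, and the mount points show which directory the mounts accumulate under.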