Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mountPath behavior changed from docker to containerd on Windows #6589

Closed
andyzhangx opened this issue Feb 25, 2022 · 18 comments
Closed

mountPath behavior changed from docker to containerd on Windows #6589

andyzhangx opened this issue Feb 25, 2022 · 18 comments
Labels
area/cri Container Runtime Interface (CRI) kind/bug platform/windows Windows

Comments

@andyzhangx
Copy link

Description

now the following mountPath behavior changed from docker to containerd on Windows:

          volumeMounts:
          - mountPath: 'D:'
            name: azure
  • docker on Windows

mount as a D drive

  • containerd on Windows

mount as c:\D\ folder

question is how can I mount as a new drive with containerd on Windows? is this a regression?

Steps to reproduce the issue

  1. n/a

Describe the results you received and expected

I could still mount hostPath as a new drive with containerd on Windows

What version of containerd are you using?

containerd://1.5.5+azure

Any other relevant information

No response

Show configuration if it is related to CRI plugin.

n/a

@andyzhangx
Copy link
Author

cc @kevpar

@kevpar
Copy link
Member

kevpar commented Feb 25, 2022

@ambarve was looking into this.

@andyzhangx
Copy link
Author

any progress here?

@TBBle
Copy link
Contributor

TBBle commented Mar 17, 2022

Looks like #6651 is the solution, and is progressing through review. The tests at https://github.com/containerd/containerd/pull/6651/files?diff=unified&w=0#diff-908e014f804871a6a2f00b97c0f88f378cfba56538f24cf0b8da4040d0d797e2 appear to cover this use-case.

@mschumacher-syntellis
Copy link

Any status updates on this?

@TBBle
Copy link
Contributor

TBBle commented May 3, 2022

#6651 has merged into containerd and will be part of containerd 1.7. I guess that PR was supposed to close this ticket, but perhaps "Fixes" doesn't work in comments.

@gianlucalucente-trn
Copy link

Hello,
do you know when this bugfix will be included in the official released version?
Thank you

@andyzhangx
Copy link
Author

the fix is also in Windows containerd v1.6.6

@fuweid fuweid added the area/cri Container Runtime Interface (CRI) label Jun 25, 2022
@mloskot
Copy link

mloskot commented Feb 12, 2023

@andyzhangx You say it's fixed in 1.6.6 and @TBBle in #6589 (comment) says it will be fixed in 1.7. Am I getting this right that we should actually wait for 1.7?

AFAICT, containerd v 1.6.14 is still not fixed and can not mount D: or Z: drives for me on AKS. Everything worked well with AKS on Kubernetes 1.22 (now EOL'ed) but it's been failing apart since I upgraded to 1.25.5. Unfortunately, there is no follow-up from @rnemeth90 to his kubernetes-sigs/azurefile-csi-driver#923 if this issue has been resolved for him.

@rnemeth90
Copy link

@mloskot If this was a regression, it was never resolved to my knowledge. We implemented a work-around by adjusting our applications to use the new path ( c:\d, etc. )

@mloskot
Copy link

mloskot commented Feb 12, 2023

@rnemeth90 Well, from the higher level of AKS standpoint, it is a regression to me. I've upgraded my cluster and it's stopped working, and I'm locked as I can not go back, because Docker CRI is no longer supported, regardless if nodes are based on Windows Server 2019 or 2022.
Apologies for wandering off-topic here, but it's astonishing to me how such basic functionality could have slipped through AKS QA sieves.

Mounting volumes as non-C drives is an ubiquitous convention on containers I have to manage, so this issue turns a serious inconvenience.

As for the workaround, do you mean you've patched your containers to do something like subst D: C:\d, e.g. during entrypoint?
I'd be grateful for more details.

@TBBle
Copy link
Contributor

TBBle commented Feb 13, 2023

My comment predated the backport of that fix into containerd 1.6 (#6929), and it is listed in the 1.6.5 changelog: https://github.com/containerd/containerd/releases/tag/v1.6.5.

The backport included the tests I mentioned, and they're still there in the 1.6 release branch, so either I was incorrect in thinking those tests covered this problem, or something else is going wrong at a different level, perhaps in hcsshim or the HCS itself.

Is it possible to capture the hcsshim::CreateComputeSystem request in your setup? That should contain a MappedDirectories key which indicates what mapping request was sent down to the HCS by hcsshim, and will help narrow down the kind of bug happening here.

@mloskot
Copy link

mloskot commented Feb 13, 2023

@TBBle

My comment predated the backport of that fix into containerd 1.6

I see, thanks for the confirmation. Great news!

either I was incorrect in thinking those tests covered this problem, or something else is going wrong at a different level, perhaps in hcsshim or the HCS itself.

In mounting-tests.zip I prepared two kubectl-based using vanilla mcr.microsoft.com/windows/servercore:ltsc2022 images tests which cover mounting in folder on C: and mounting as D:, and as you can see in the README.md there, they both succeeded on my AKS cluster (Kubernetes 1.25.5, Windows Server 2022).

Those results puzzled me as why on earth mounting on D: is failing for my custom images based on mcr.microsoft.com/windows/servercore:ltsc2022? I moved to your suggestion on "something else is going wrong" and I found out that Dockerfile-s of my custom images contain this seemingly benign declaration:

# Document volume mount points typically mounted from Azure file shares.
VOLUME ["D:", "Z:"]

Now, after removing this VOLUME line and rebuilding my images, everything seems to be getting back to order!

Is it possible to capture the hcsshim::CreateComputeSystem request in your setup?

Well, I'm not sure how to do it (too much of a rookie here), but I looked at Kubernetes events generated for my images using the troublesome VOLUME ["D:", "Z:"] and here is the sample which suggests the containerd is tripping over the Dockerfile directive VOLUME:

Error: failed to create containerd container: rootpath on mountPath C:\Windows\TEMP\ctd-volume3950197485\377, volume Z:: CreateFile C:\Windows\TEMP\ctd-volume3950197485\377\Z:: The filename, directory name, or volume label syntax is incorrect.
Error: failed to create containerd container: rootpath on mountPath C:\Windows\TEMP\ctd-volume427796215\376, volume D:: CreateFile C:\Windows\TEMP\ctd-volume427796215\376\D:: The filename, directory name, or volume label syntax is incorrect.
Error: failed to create containerd container: rootpath on mountPath C:\Windows\TEMP\ctd-volume2161075262\375, volume Z:: CreateFile C:\Windows\TEMP\ctd-volume2161075262\375\Z:: The filename, directory name, or volume label syntax is incorrect.
Error: failed to create containerd container: rootpath on mountPath C:\Windows\TEMP\ctd-volume554088254\374, volume Z:: CreateFile C:\Windows\TEMP\ctd-volume554088254\374\Z:: The filename, directory name, or volume label syntax is incorrect.

The VOLUME ["D:", "Z:"] used to work or rather had not caused any trouble on my previous AKS based on Docker CRI.

Was that behaviour on my previous AKS anything Docker-specific and my use of VOLUME directive was erroneous or is this something containerd is lacking about?

However, it looks like it's turned out an embarrassing way as the issue is with my custom images. So, I'd like to sincerely apologies for my earlier rant in #6589 (comment)

@TBBle
Copy link
Contributor

TBBle commented Feb 14, 2023

Interesting. That might be a separate containerd bug to file then, as by my reading of the docs your usage there looks valid.

Tracking down the error message, rootpath on mountPath C:\Windows\TEMP\ctd-volume3950197485\377, volume Z: is coming from thie code block, and I'm pretty sure that the bug lives here. fs.rootPath is going to try to return C:\Windows\TEMP\ctd-volume3950197485\377 + \ + Z:, but that's not a valid path on Windows. The error we're seeing is part of fs.rootPath's symlink-protection checks trying to open that file, but otherwise it would have failed with stat volume in rootfs when that code block tried to check if the target existed in order to copy stuff from the relevant path in the rootfs into the new volume mount, in case no mount request for this path was provided in the container-launch data.

I suspect this code block should simply skip drive-letter volumes other than C:, since I don't believe (but could be wrong) that you can store non-C: volumes in a WCOW container image; the image filesystem (rootfs here) is assumed to be mounted at C:, other drive letters are runtime-only. (I also suspect that VOLUME ["D:", "Z:"] also won't guarantee those drive letters exist, as they would for non-drive-letter paths, but I haven't checked that, don't know if Docker does the same thing, and that would be a different bug (or feature request...) either way. #5671 went more into this area.)

I suspect the repro commands from #5671 or similar, but with a trivial Dockerfile with a VOLUME ["D:", "Z:"] line (which doesn't apply to Linux) would reproduce this issue with only containerd and crictl, if it helps, since that avoids the need for k8s in the repro.

To be clear, that was mostly guess-work based on looking at just this block of code, and it's possible I'm misreading it. We'll see what happens in the relevant bug-report discussion.

@mloskot
Copy link

mloskot commented Feb 14, 2023

@TBBle Thank you for looking into that. I wish I could help more, but the only way I can think of is testing any fixes happening in future. I will watch that #5671 space.

@TBBle
Copy link
Contributor

TBBle commented Feb 15, 2023

#5671 is an already-resolved issue. If you don't create a new issue for what you're seeing with VOLUME ["D:", "Z:"], it won't get addressed. No one else seems to have logged it nor fixed it in the last few years that the bug has existed, so we could be waiting years more for a fix to appear organically, or for someone else to have the same problem, correctly identify the relevant factor (as you did above) and log a bug for it so it's visible to people who aren't watching specific resolved issues. ^_^

@mloskot
Copy link

mloskot commented Feb 27, 2023

If you don't create a new issue for what you're seeing with VOLUME ["D:", "Z:"], it won't get addressed.

@TBBle FYA, I have opened a new one here #8171 and I hope I've provided all the relevant information. Thank you for your support.

@andyzhangx
Copy link
Author

andyzhangx commented May 18, 2023

FYI, I have verified that the original issue has been fixed in containerd 1.6.14, mountPath: 'D:' works now.

# k get no -o wide
NAME                                STATUS   ROLES   AGE     VERSION   INTERNAL-IP    EXTERNAL-IP   OS-IMAGE                         KERNEL-VERSION      CONTAINER-RUNTIME
akswin000000                        Ready    agent   4m42s   v1.26.3   10.224.0.33    <none>        Windows Server 2022 Datacenter   10.0.20348.1726     containerd://1.6.14+azure
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: statefulset-azurefile-win
  labels:
    app: busybox
spec:
  serviceName: statefulset-azurefile-win
  replicas: 1
  template:
    metadata:
      labels:
        app: busybox
    spec:
      nodeSelector:
        "kubernetes.io/os": windows
      containers:
        - name: statefulset-azurefile
          image: mcr.microsoft.com/windows/servercore:ltsc2022
          command:
            - "powershell.exe"
            - "-Command"
            - "while (1) { Add-Content -Encoding Ascii D:\\data.txt $(Get-Date -Format u); sleep 1 }"
          volumeMounts:
            - name: persistent-storage
              mountPath: 'D:'
  updateStrategy:
    type: RollingUpdate
  selector:
    matchLabels:
      app: busybox
  volumeClaimTemplates:
    - metadata:
        name: persistent-storage
      spec:
        storageClassName: azurefile-csi
        accessModes: ["ReadWriteMany"]
        resources:
          requests:
            storage: 100Gi

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/cri Container Runtime Interface (CRI) kind/bug platform/windows Windows
Projects
None yet
Development

No branches or pull requests

9 participants