[BUG] Upgrade to containerd breaks Windows pods mounting volumes as drives #3471

mloskot opened this issue Feb 13, 2023 · 1 comment
mloskot commented Feb 13, 2023

Describe the bug

Recently, after I:

  1. upgraded my cluster from Kubernetes 1.22 to 1.25,
  2. switched from Docker to containerd as the CRI,
  3. switched to Windows Server 2022, and
  4. rebuilt my container images based on mcr.microsoft.com/windows/servercore:ltsc2022 instead of mcr.microsoft.com/windows/servercore:ltsc2019,

I noticed that my Windows pods can no longer mount volumes as container-local D: or Z: drives.
My cluster's PVs are provisioned from Azure File shares, but that should not be relevant here.

After lengthy research, I found that the problem is most likely due to a known bug in containerd, namely "mountPath behavior changed from docker to containerd on Windows", containerd/containerd#6589.

One comment in that bug report suggests the bug has been fixed in containerd 1.6.6; see containerd/containerd#6589 (comment)

My current AKS cluster is running containerd v1.6.14, so I should be fine, but I'm not. Apparently, the bug will not be fixed until containerd 1.7, as per this containerd/containerd#6589 (comment)
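
For anyone verifying this on their own cluster, the runtime version reported by each node appears in the CONTAINER-RUNTIME column of kubectl's wide node listing:

kubectl get nodes -o wide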

To Reproduce

I believe it is not necessary to provide any steps here, as the containerd bug report should be sufficient confirmation, but here is a simple test deployment that is failing for me. It is based on a slightly modified version of https://github.com/kubernetes-sigs/azurefile-csi-driver/tree/master/deploy/example/windows

---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: test-containerd-ltsc2022
  name: test-containerd-ltsc2022
spec:
  replicas: 1
  selector:
    matchLabels:
      app: test-containerd-ltsc2022
  template:
    metadata:
      labels:
        app: test-containerd-ltsc2022
      name: test-containerd-ltsc2022
    spec:
      nodeSelector:
        kubernetes.io/os: windows
      containers:
        - name: test-servercore
          image: mcr.microsoft.com/windows/servercore:ltsc2022
          command:
          - "powershell.exe"
          - "-Command"
          - "while (1) { Write-Host $(Get-Date -Format u); Add-Content -Encoding Ascii D:\\data.txt $(Get-Date -Format u); sleep 5 }"
          volumeMounts:
            - name: pv-d
              mountPath: "D:"
      volumes:
        - name: pv-d
          persistentVolumeClaim:
            claimName: test-containerd-ltsc2022-pvc-d
---
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: test-containerd-ltsc2022-pvc-d
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 2Gi
  storageClassName: azurefile-csi
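
Assuming the manifest above is saved to a file (the name test-containerd-ltsc2022.yaml is just a placeholder), apply it and watch the pod:

kubectl apply -f test-containerd-ltsc2022.yaml
kubectl get pods -l app=test-containerd-ltsc2022 --watch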

Expected behavior

After switching to the containerd CRI, I expect to still be able to mount volumes as non-C: drives in my Windows pods.

Environment (please complete the following information):

  • Kubernetes version: 1.25.5
  • containerd version: 1.6.14

Additional context

There seem to be a number of issues about this problem, but none reported directly here, to the AKS repository. How on earth has this containerd bug slipped through AKS QA? IMHO, this makes containerd not ready to be forced as the default on the latest AKS, doesn't it?

UPDATE: A workaround synthesized from all those issue reports seems to be:

  1. Mount volumes at paths rooted on the C: drive, e.g. C:\D.
  2. Use subst to associate those paths with drive letters, i.e. subst D: C:\D (see the sketch after this list).
  3. Alternatively, reconfigure your apps to rely only on C: locations, not on other drives.
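
As a sketch only, assuming the container entrypoint is an acceptable place to run subst, the test deployment above could be adapted like this (the mountPath and the subst call are the only changes; everything else mirrors the original spec):

      containers:
        - name: test-servercore
          image: mcr.microsoft.com/windows/servercore:ltsc2022
          command:
          - "powershell.exe"
          - "-Command"
          # Map D: onto the C:-rooted mount point before entering the app loop.
          - "subst D: C:\\D; while (1) { Add-Content -Encoding Ascii D:\\data.txt $(Get-Date -Format u); sleep 5 }"
          volumeMounts:
            - name: pv-d
              mountPath: "C:\\D"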
mloskot added the bug label Feb 13, 2023
mloskot commented Feb 13, 2023

There has been an important update in my comment on the containerd issue, containerd/containerd#6589 (comment)

It turns out that containerd trips over the use of the VOLUME directive in the Dockerfile of my custom images.
This did not cause any problems on the AKS cluster with the Docker-based CRI:

# Document volume mount points typically mounted from Azure file shares.
VOLUME ["D:", "Z:"]
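
Assuming the VOLUME directive is indeed the trigger, a minimal sketch of a workaround is to drop it from the Dockerfile and declare the mount points only in the pod spec:

# Hypothetical: the VOLUME line is omitted entirely; the pod spec's
# volumeMounts establish D: and Z: at run time instead.
FROM mcr.microsoft.com/windows/servercore:ltsc2022
# VOLUME ["D:", "Z:"]  (removed; containerd 1.6.x mishandles drive-letter volumes)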

UPDATE: I submitted a bug report about it to containerd at containerd/containerd#8171, and I'm closing this issue. My apologies for the noise, but it took this rookie a while to track the problem down.

mloskot closed this as completed Feb 27, 2023
Azure locked as resolved and limited conversation to collaborators Mar 30, 2023