Windows Pod with RunAsUserName and a Projected Volume does not honor file permissions in the volume #102849

Open
aravindhp opened this issue Jun 14, 2021 · 38 comments
Labels
  • kind/bug: Categorizes issue or PR as related to a bug.
  • lifecycle/frozen: Indicates that an issue or PR should not be auto-closed due to staleness.
  • needs-triage: Indicates an issue or PR lacks a `triage/foo` label and requires one.
  • priority/important-soon: Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
  • sig/windows: Categorizes an issue or PR as relevant to SIG Windows.

Comments

@aravindhp
Contributor

aravindhp commented Jun 14, 2021

What happened:

When a Windows Pod is created with a Projected Volume and RunAsUserName set, file permissions in the projected volume are not honored. In addition, if a Windows Pod is created with a Projected Volume and RunAsUser set,

    securityContext:
      runAsUser: 1000640000
      windowsOptions:
        runAsUserName: ContainerAdministrator

the Pod will be stuck at ContainerCreating, with the kubelet logging:

E0611 17:38:46.666139    2260 atomic_writer.go:404] pod win-webserver/win-webserver-6dd6cd7d9-ng62b volume kube-api-access-62z59: unable to change file c:\var\lib\kubelet\pods\a7bddda2-a355-48cd-8ee2-71776d6de78f\volumes\kubernetes.io~projected\kube-api-access-62z59\..2021_06_11_17_38_46.035861910\token with owner 1000640000: chown c:\var\lib\kubelet\pods\a7bddda2-a355-48cd-8ee2-71776d6de78f\volumes\kubernetes.io~projected\kube-api-access-62z59\..2021_06_11_17_38_46.035861910\token: not supported by windows
E0611 17:38:46.666139    2260 atomic_writer.go:175] pod win-webserver/win-webserver-6dd6cd7d9-ng62b volume kube-api-access-62z59: error writing payload to ts data directory c:\var\lib\kubelet\pods\a7bddda2-a355-48cd-8ee2-71776d6de78f\volumes\kubernetes.io~projected\kube-api-access-62z59\..2021_06_11_17_38_46.035861910: chown c:\var\lib\kubelet\pods\a7bddda2-a355-48cd-8ee2-71776d6de78f\volumes\kubernetes.io~projected\kube-api-access-62z59\..2021_06_11_17_38_46.035861910\token: not supported by windows

What you expected to happen:

The Pod should reach the Running state, with the projected volume files having the correct container user ownership.

How to reproduce it (as minimally and precisely as possible):

Create a Windows Pod with a Projected Volume and runAsUserName set (optionally also set runAsUser to reproduce the Pod getting stuck at ContainerCreating).
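
A manifest along the following lines should match the description above. It is a minimal sketch, not the manifest from the original report: the Pod name, the container name, and the explicit serviceAccountToken projected volume are assumptions made for illustration, while the securityContext values and the base image are the ones quoted elsewhere in this issue.

```yaml
# Hypothetical minimal reproduction. Names marked "hypothetical" are
# placeholders and do not come from the original report.
apiVersion: v1
kind: Pod
metadata:
  name: win-projected-repro        # hypothetical name
spec:
  nodeSelector:
    kubernetes.io/os: windows
  securityContext:
    # Optional: per the report, adding runAsUser is what leaves the Pod
    # stuck at ContainerCreating. Remove it to observe only the
    # file permission problem.
    runAsUser: 1000640000
    windowsOptions:
      runAsUserName: ContainerAdministrator
  containers:
  - name: ping                     # hypothetical name
    image: mcr.microsoft.com/windows/servercore:ltsc2019
    command: ["ping", "-t", "localhost"]
    volumeMounts:
    - name: token
      mountPath: /projected-volume
  volumes:
  - name: token
    projected:
      sources:
      # Assumption: an explicit service account token source is used here
      # to make the projected volume visible in the manifest.
      - serviceAccountToken:
          path: token
          expirationSeconds: 3600
```

After applying it, `kubectl describe pod win-projected-repro` and the kubelet log on the Windows node are the places to look for the chown error shown above.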

Anything else we need to know?:

The KEP, the proposal for file permission handling in projected service account volumes, and its implementation did not take Windows into account. The implementation introduced an os.Chown() call, which is not supported on Windows. In addition, the case of RunAsUserName being set on Windows Pods was not considered.

Environment:

  • Kubernetes version (use kubectl version): 1.21.1
  • Cloud provider or hardware configuration: AWS / Azure
  • OS: Windows Server 2019 1809
  • Kernel:
@aravindhp aravindhp added the kind/bug label Jun 14, 2021
@k8s-ci-robot k8s-ci-robot added the needs-sig and needs-triage labels Jun 14, 2021
@aravindhp
Contributor Author

/sig windows

@k8s-ci-robot k8s-ci-robot added the sig/windows label and removed the needs-sig label Jun 14, 2021
@aravindhp
Contributor Author

/assign

@aravindhp aravindhp changed the title from "Windows Pods with a Projected Volume is stuck at ContainerCreating" to "Windows Pod with a Projected Volume is stuck at ContainerCreating" Jun 14, 2021
@jsturtevant
Contributor

/triage accepted
/priority important-soon

@k8s-ci-robot k8s-ci-robot added the triage/accepted and priority/important-soon labels and removed the needs-triage label Jun 14, 2021
@k8s-ci-robot k8s-ci-robot assigned wzshiming and unassigned wzshiming Jun 15, 2021
@aravindhp
Contributor Author

/retitle Windows Pod with RunAsUserName and a Projected Volume does not honor file permissions in the volume

@k8s-ci-robot k8s-ci-robot changed the title from "Windows Pod with a Projected Volume is stuck at ContainerCreating" to "Windows Pod with RunAsUserName and a Projected Volume does not honor file permissions in the volume" Jun 15, 2021
@jsturtevant
Contributor

/milestone v1.22

@k8s-ci-robot k8s-ci-robot added this to the v1.22 milestone Jun 30, 2021
@aravindhp
Contributor Author

Please follow the sig-windows Slack thread for the current state.

@aravindhp
Contributor Author

Currently investigating the state of file ownership across users in different pods and containers. I created a simple ping container image with two users:

FROM mcr.microsoft.com/windows/servercore:ltsc2019
RUN net user TestUser1 /add
RUN net user TestUser2 /add
CMD ["ping", "-t", "localhost"]

Created the following pod:

apiVersion: v1
kind: Pod
metadata:
  name: win-ping
spec:
  tolerations:
  - key: "os"
    value: "Windows"
    effect: "NoSchedule"
  containers:
  - name: ping1
    image: quay.io/aravindh/win-ping
    imagePullPolicy: Always
    securityContext:
      windowsOptions:
        runAsUserName: TestUser1
    volumeMounts:
    - name: for-ping1
      mountPath: /projected-volume
    - name: c-volume
      mountPath: /from-host
  - name: ping2
    image: quay.io/aravindh/win-ping
    imagePullPolicy: Always
    securityContext:
      windowsOptions:
        runAsUserName: TestUser2
    volumeMounts:
    - name: for-ping2
      mountPath: /projected-volume
    - name: c-volume
      mountPath: /from-host
  nodeSelector:
    kubernetes.io/os: windows
  volumes:
  - name: for-ping1
    projected:
      sources:
      - secret:
          name: user
  - name: for-ping2
    projected:
      sources:
      - secret:
          name: pass
  - name: c-volume
    hostPath:
      path: /

I then execed into container ping1:

 kubectl exec -it win-ping -c ping1 -- powershell.exe

and ran the following:

PS C:\> whoami
win-ping\testuser1
PS C:\> cat C:\from-host\var\lib\kubelet\pods\d10783ef-919d-46bf-ab38-2b951f063632\volumes\kubernetes.io~projected\for-ping2\password.txt 
1f2d1e2e67df 

So testuser1 is able to read, through the hostPath mount, a file projected for testuser2.

@aravindhp
Contributor Author

I repeated the experiment with two pods running on the same host:

apiVersion: v1
kind: Pod
metadata:
  name: win-ping1
spec:
  tolerations:
  - key: "os"
    value: "Windows"
    effect: "NoSchedule"
  containers:
  - name: ping1
    image: quay.io/aravindh/win-ping
    imagePullPolicy: Always
    securityContext:
      windowsOptions:
        runAsUserName: TestUser1
    volumeMounts:
    - name: for-ping1
      mountPath: /projected-volume
    - name: c-volume
      mountPath: /from-host
  nodeSelector:
    kubernetes.io/os: windows
  volumes:
  - name: for-ping1
    projected:
      sources:
      - secret:
          name: user
  - name: c-volume
    hostPath:
      path: /
---
apiVersion: v1
kind: Pod
metadata:
  name: win-ping2
spec:
  tolerations:
  - key: "os"
    value: "Windows"
    effect: "NoSchedule"
  containers:
  - name: ping2
    image: quay.io/aravindh/win-ping
    imagePullPolicy: Always
    securityContext:
      windowsOptions:
        runAsUserName: TestUser2
    volumeMounts:
    - name: for-ping2
      mountPath: /projected-volume
    - name: c-volume
      mountPath: /from-host
  nodeSelector:
    kubernetes.io/os: windows
  volumes:
  - name: for-ping2
    projected:
      sources:
      - secret:
          name: pass
  - name: c-volume
    hostPath:
      path: /

I then execed into win-ping1 pod:

 kubectl exec -it win-ping1 -- powershell.exe

and ran the following:

PS C:\> whoami
win-ping1\testuser1
PS C:\> cat C:\from-host\var\lib\kubelet\pods\b701defa-6138-4375-a6ec-f0c155973348\volumes\kubernetes.io~projected\for-ping2\password.txt 
1f2d1e2e67df

Again, testuser1 is able to read, through the hostPath mount, a file projected for testuser2 in a different pod.

@voigt

voigt commented Jul 12, 2021

Hey there (@immuzz, @jsturtevant, @aravindhp), Bug-Triage here 👋 ,
Given that the code freeze for 1.22 has already started, I'd suggest moving this issue to 1.23. wdyt?

@marosset
Contributor

@voigt - We'll move it to 1.23 thanks!

@marosset
Contributor

/milestone v1.23

@k8s-ci-robot k8s-ci-robot removed this from the v1.22 milestone Jul 12, 2021
@jyotimahapatra
Contributor

/milestone v1.25

@k8s-ci-robot k8s-ci-robot modified the milestones: v1.24, v1.25 Mar 21, 2022
@marosset marosset moved this from Backlog (v1.24) to Backlog (issues) in SIG-Windows Mar 25, 2022
@marosset marosset moved this from Backlog (issues) to Backlog (v1.25) in SIG-Windows May 5, 2022
@markjacksonfishing

Hi @aravindhp. My name is Marky Jackson and I am one of the k8s 1.25 bug triage shadows assigned to track this body of work. It looks like all PRs have been merged. Just checking in to see if this is still on track for k8s 1.25?

@jsturtevant
Contributor

We don't have a clear long-term solution to this issue. There have been some discussions about fixing this at the OS layer, but there are no timelines.
/milestone clear

@k8s-ci-robot k8s-ci-robot removed this from the v1.25 milestone Jun 13, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale label Sep 11, 2022
@marosset
Contributor

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale label Sep 12, 2022
@marosset marosset removed this from Backlog (v1.25) in SIG-Windows Sep 21, 2022
@k8s-triage-robot

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale label Dec 11, 2022
@marosset
Contributor

/lifecycle frozen

This is a valid issue but requires OS level changes (which have been discussed in the past) to address.

@k8s-ci-robot k8s-ci-robot added the lifecycle/frozen label and removed the lifecycle/stale label Dec 12, 2022
@k8s-triage-robot

This issue is labeled with priority/important-soon but has not been updated in over 90 days, and should be re-triaged.
Important-soon issues must be staffed and worked on either currently, or very soon, ideally in time for the next release.

You can:

  • Confirm that this issue is still relevant with /triage accepted (org members only)
  • Deprioritize it with /priority important-longterm or /priority backlog
  • Close this issue with /close

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/

/remove-triage accepted

@k8s-ci-robot k8s-ci-robot added the needs-triage label and removed the triage/accepted label Mar 12, 2023
@MageshSrinivasulu

MageshSrinivasulu commented Apr 25, 2023

Any Updates? Is this issue fixed?

Facing a similar issue: the Consul agent DaemonSet, which is supposed to run on every node, is failing on the Windows 2022 node because its security context sets runAsGroup, runAsUser, and fsGroup.

@jsturtevant
Contributor

> Any Updates? Is this issue fixed?

No updates. It requires a significant change in the way Windows containers handle permissions. That change is under consideration, but it isn't something we will be able to address in the short term.

> Facing a similar issue: the Consul agent DaemonSet, which is supposed to run on every node, is failing on the Windows 2022 node because its security context sets runAsGroup, runAsUser, and fsGroup.

You can check out the PodOS field, which was designed to help with fields that aren't relevant to a given OS. Running the same DaemonSet configuration across all nodes sounds good because it reduces duplication, but in practice we've found that it's best to have two separate DaemonSets, one specific to Windows and one specific to Linux, since the configuration usually ends up being significantly different.
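
As a rough illustration of the two-DaemonSet approach, the Windows half might look like the sketch below. Everything in it is an assumption for illustration: the names, labels, and image are placeholders, it is not the Consul chart's actual manifest, and it presumes a cluster version in which the `spec.os` (PodOS) field is available.

```yaml
# Hypothetical Windows-only DaemonSet; names, labels, and image are placeholders.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: agent-windows
spec:
  selector:
    matchLabels:
      app: agent-windows
  template:
    metadata:
      labels:
        app: agent-windows
    spec:
      os:
        name: windows                  # the PodOS field
      nodeSelector:
        kubernetes.io/os: windows      # schedule only onto Windows nodes
      securityContext:
        # Only Windows-specific options are set. runAsUser, runAsGroup,
        # and fsGroup are deliberately left out of this variant.
        windowsOptions:
          runAsUserName: ContainerUser
      containers:
      - name: agent
        image: example.com/agent:windows   # placeholder image
```

The Linux counterpart would be a second DaemonSet with `os.name: linux`, a `kubernetes.io/os: linux` node selector, and the runAsUser, runAsGroup, and fsGroup settings kept where they apply.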
