rootless podman fails with: {"cause":"operation not permitted","message":"lsetxattr /opt/nomad/data/alloc/5db64440-ad18-56f7-0cc6-608d2a2b3ccf/alloc: operation not permitted","response":500} #232

Closed
erikschul opened this issue Apr 4, 2023 · 3 comments

Comments

erikschul commented Apr 4, 2023

My VM has the following setup:

  • I've worked through about 10 different guides while trying to find the root cause of this issue.
  • AlmaLinux 9
  • Podman 4.2.0
  • nomad client runs as root
  • nomad user owns /opt/nomad/ recursively
  • setenforce 0 has been tested
  • nomad-podman-driver points to nomad user's socket (verified to work correctly) and uses selinuxlabel = "z"
  • podman run as user nomad works fine

When scheduling a basic demo job, it fails with the message:

client.alloc_runner.task_runner: Task event: alloc_id=5db64440-ad18-56f7-0cc6-608d2a2b3ccf task=bannertask type="Driver Failure"
rpc error: code = Unknown desc = failed to start task, could not start container: cannot start container, status code: 500: {"cause":"operation not permitted","message":"lsetxattr /opt/nomad/data/alloc/5db64440-ad18-56f7-0cc6-608d2a2b3ccf/alloc: operation not permitted","response":500}

Running ls -l /opt/nomad/data/alloc/ shows:

drwx------. 4 nomad nomad 37 Apr  4 23:04 12301492-2b45-7c39-5db6-66909cc72bfc
drwxr-xr-x. 4 nomad nomad 37 Apr  5 00:01 34f03670-e80d-26cc-f7c7-484c26eb5eb3
drwxr-xr-x. 4 root  root  37 Apr  5 00:36 5db64440-ad18-56f7-0cc6-608d2a2b3ccf
drwxr-xr-x. 4 nomad nomad 37 Apr  4 23:58 91cc6d6b-3569-b133-bc4b-1fd007d46ce3
drwx------. 4 nomad nomad 37 Apr  4 23:47 aa09251e-0650-1bf7-cb56-4af805e327ce
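
The ownership mismatch in the listing above can also be checked with a script. This is a minimal sketch, not part of Nomad or nomad-driver-podman. The function name `foreign_owned_dirs` is hypothetical, and it assumes the rootless podman socket belongs to a local user whose UID you pass in.

```python
import os
from pathlib import Path


def foreign_owned_dirs(alloc_root, expected_uid):
    """Return the names of directories directly under alloc_root
    whose owner UID differs from expected_uid (sorted by name)."""
    mismatched = []
    for entry in sorted(Path(alloc_root).iterdir()):
        # Skip symlinks and plain files; only real directories matter here.
        if entry.is_symlink() or not entry.is_dir():
            continue
        # lstat() reports the owner of the entry itself.
        if entry.lstat().st_uid != expected_uid:
            mismatched.append(entry.name)
    return mismatched
```

For the setup described here, one would call it as `foreign_owned_dirs("/opt/nomad/data/alloc", pwd.getpwnam("nomad").pw_uid)` (with `import pwd`). Against the listing above, that should report only 5db64440-ad18-56f7-0cc6-608d2a2b3ccf, the one directory owned by root.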

Perhaps the problem is that the Nomad client runs as root and creates the allocation folder under alloc as root, so the nomad user doesn't have the privileges it needs on that folder?

I haven't explicitly configured fuse-overlayfs or crun or container_manage_cgroup. Could that be the cause?

Possibly related issues:

@erikschul

If I remove selinuxlabel, I get this error instead:

rpc error: code = Unknown desc = failed to start task, could not start container: cannot start container, status code: 500: {"cause":"broken pipe","message":"write child: broken pipe","response":500}

erikschul commented Apr 5, 2023

Is it possible that the problem is that nomad-driver-podman creates the /opt/nomad/data/alloc/c130a67b-4ff5-4ef7-9317-d57ecb5d37f8 directory as root:root (with mode drwxr-xr-x) when it should be created as nomad:nomad, and that this prevents podman from running lsetxattr on it? (The owning user should obviously be configurable.)
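
One way to test this hypothesis outside of Nomad is to attempt the same kind of call as the nomad user. The sketch below is an illustration only: `probe_relabel` is a hypothetical helper, not what podman actually executes. It assumes a Linux host with SELinux labels present, and it rests on the understanding (stated here as an assumption) that writing `security.selinux` requires owning the file or holding CAP_FOWNER. It reads the current label and writes the same value back, so a successful run changes nothing.

```python
import errno
import os


def probe_relabel(path):
    """Re-apply the existing security.selinux label of path via lsetxattr.

    Returns None if the write is permitted, otherwise the symbolic errno
    name (for example "EPERM", or "ENODATA" when no label is present).
    """
    attr = "security.selinux"
    try:
        # follow_symlinks=False makes these lgetxattr/lsetxattr calls.
        current = os.getxattr(path, attr, follow_symlinks=False)
        os.setxattr(path, attr, current, follow_symlinks=False)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    return None
```

If the hypothesis holds, running this as nomad against a root:root allocation directory would be expected to return "EPERM", while a nomad:nomad directory would return None.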

I guess this isn't supported?
#84

If that's the case (since 2021?), perhaps the README could state more clearly that rootless mode requires the nomad client to run as the same user? (That in turn causes other problems relating to volume mounts and network configuration.)

@erikschul

It works when the nomad client service is run as nomad, and as expected, the folders in /opt/nomad/data/alloc/ have nomad:nomad ownership.

But is the bug with nomad or nomad-driver-podman? I assume nomad is responsible for creating the folder in alloc?
