Experiencing permission denied after v3.1.0 on jobContainers when container user is not 'root' #956

Open
fhammerl opened this issue Oct 11, 2022 · 10 comments

Comments

@fhammerl
Contributor

After the rollout of actions/checkout v3.1.0, some users reported permission denied errors when using the action in a jobContainer running as a container user other than 'root'. With v3.1.0, the action uses commands that write to files in place of the deprecated stdout commands.

These files cause the permission issues, because the UID of the user running the runner process does not match the UID of the user in the container it starts.

Running a container as a container user other than 'root' is an unsupported path; we've listed some workarounds below.

Summary

In short: the jobContainers in these workflows seem to follow an unsupported path. Namely,

Do not use the USER instruction in your Dockerfile, because you won't be able to access the GITHUB_WORKSPACE

I believe the jobContainer operates as the user 'circleci' in the workflow you linked.

We will roll out the changes again as we haven't seen issues with the officially supported paths.

The recommendation is to 'not use the USER instruction in your Dockerfile'.
In most cases, this means 'run the container as root'.

I understand this introduces some friction, so see below for workarounds and a technical summary.

Detailed Summary

  1. In actions/checkout@v3.1.0, we began replacing some workflow commands that write to stdout with functions that write to temp files instead.
  2. If you're running in a jobContainer, these temp files are created by the runner app on your host OS and are owned by the user who started the runner (in case of ubuntu-latest, this user is 'runner' with UID-1001). These files are then mounted to your jobContainer.
  3. actions/checkout is a .js file executed using node on your jobContainer. As per the changes in 1., actions/checkout@v3.1.0 attempts to write to these temp files on your jobContainer.
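To make step 3 concrete, here is a rough Python paraphrase of what a file-based command amounts to: an append to a file that the runner created. The helper name `append_file_command`, the key `example_key`, and the `name=value` record layout are illustrative assumptions, not the toolkit's actual code; the real action does this from node, as the `appendFileSync` frame in the stack traces in this thread suggests.

```python
import os
import tempfile


def append_file_command(command_file: str, name: str, value: str) -> None:
    """Append one record to a runner-provided command file (hypothetical helper)."""
    # Append mode needs write permission on the existing file. If the file is
    # owned by another UID and is not writable by this user, open() raises
    # PermissionError, which is how Python reports EACCES.
    with open(command_file, "a", encoding="utf-8") as handle:
        handle.write(f"{name}={value}\n")


if __name__ == "__main__":
    # Stand-in for the temp file the runner would mount into the jobContainer.
    fd, path = tempfile.mkstemp(prefix="save_state_")
    os.close(fd)
    try:
        append_file_command(path, "example_key", "true")
        with open(path, encoding="utf-8") as handle:
            print(handle.read(), end="")
    finally:
        os.remove(path)
```

When the process owns the file, as in this demo, the append succeeds. The failure reported here corresponds to the same call being made by a user who neither owns the file nor is root.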

This can have a couple of outcomes:

  1. If the jobContainer user is 'root': writes successfully.
  2. If the jobContainer user is NOT 'root':
    a) If the UID of the jobContainer user == UID of host runner user (UID-1001 on ubuntu-latest): write is successful
    b) If the UID of the jobContainer user != UID of host runner user (UID-1001 on ubuntu-latest): permission denied

The jobContainers in the above workflow seem to run into 2/b.
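The outcomes above boil down to a small decision rule. A minimal sketch, assuming the temp files are writable only by their owner (which is what the reports here suggest); `can_write_command_file` is a made-up name for illustration and is not part of the runner:

```python
def can_write_command_file(container_uid: int, host_runner_uid: int) -> bool:
    """Return True if the jobContainer user is expected to write the temp file."""
    # Outcome 1: root (UID 0) bypasses ordinary file permission checks.
    if container_uid == 0:
        return True
    # Outcome 2a / 2b: a non-root user succeeds only when its UID matches the
    # UID of the host user that created the file.
    return container_uid == host_runner_uid


if __name__ == "__main__":
    for uid in (0, 1001, 1000):
        print(uid, can_write_command_file(uid, host_runner_uid=1001))
```

With a host runner UID of 1001, container UIDs 0 and 1001 map to a successful write and any other UID maps to permission denied.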

Fix

As per the docs, don't modify the default user in the Dockerfile. If that's not feasible for you: 👇

Workarounds (unsupported)

  • Use actions/checkout@v3.0.2 in your workflow (will be deprecated)
  • Match the UID of the host runner user (e.g. UID-1001 on ubuntu-latest today) to the UID of the user on the jobContainer
  • Override the default container user and use 'root':
container: 
    image: alpine:latest
    options: --user root

Error logs

##[debug]Running JavaScript Action with default external tool: node16
node:internal/fs/utils:344
    throw err;
    ^

Error: EACCES: permission denied, open '/__w/_temp/_runner_file_commands/save_state_2[7]
@rGaillard

We're experiencing the same issue with v2.5.0. Our CI is broken and we had to roll back :(

@priv-kweihmann

The same happens with the latest release of actions/cache. The common denominator seems to be the use of actions/core 1.10.0.
All previous versions that do not use this particular core version are fine.

@chenrui333

Anyone trying to fix this?

dham added a commit to firedrakeproject/firedrake that referenced this issue Oct 23, 2022
Work around the issue raised in: actions/checkout#956 by setting the container user to root. Note that we don't want to do this more generally due to the configuration we use on the self-hosted builders.
@Bo98

Bo98 commented Dec 3, 2022

Just brainstorming a bit here: would there be security implications if the runner were to set o+w for file commands? I guess it maybe could for some specific multi-user self-hosted scenarios, but for GitHub-hosted runners and self-hosted runners that opt-in/out: it could make sense and seems like something that could ease concerns here a bit.
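For illustration only, this is roughly what "set o+w" would mean for a single file. It is a sketch of the permission change itself, not of anything the runner does today; `add_other_write` is a hypothetical helper.

```python
import os
import stat
import tempfile


def add_other_write(path: str) -> int:
    """Add the 'others may write' bit (o+w) to a file and return the new mode."""
    current = stat.S_IMODE(os.stat(path).st_mode)
    updated = current | stat.S_IWOTH
    os.chmod(path, updated)
    return updated


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        os.chmod(path, 0o644)  # owner read/write, everyone else read-only
        print(oct(add_other_write(path)))
    finally:
        os.remove(path)
```

Once the bit is set, any UID can append to the file, which is why the multi-user self-hosted case raises the security question above.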

Match the UID of the host runner user (e.g. UID-1001 on ubuntu-latest today) to the UID of the user on the jobContainer

I guess the core problem is Docker doesn't really support UID remaps at runtime itself.

Nevertheless, this is a solution I was initially going to chase, but the problem is that any mechanism to do so dynamically is disabled in GitHub Actions. While you could rely on a fixed 1001 for GitHub-hosted, it gets trickier with varying self-hosted setups. The initial idea I had was to create an entrypoint that would get the UID of RUNNER_WORKSPACE and use that to change the container user, but unfortunately entrypoints are force-disabled unless container hooks are enabled (not entirely sure why), which obviously rules out it working on GitHub-hosted.
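A sketch of the lookup part of that entrypoint idea, assuming the workspace directory is mounted into the container and still carries the host owner's UID and GID. `docker_user_option_for` is a hypothetical helper; it only formats a value for Docker's `--user` option and does not switch users itself.

```python
import os
import tempfile


def docker_user_option_for(path: str) -> str:
    """Build a '--user UID:GID' option from the owner of the given path."""
    info = os.stat(path)
    return f"--user {info.st_uid}:{info.st_gid}"


if __name__ == "__main__":
    # A temp directory stands in for the mounted workspace directory.
    workspace = tempfile.mkdtemp()
    try:
        print(docker_user_option_for(workspace))
    finally:
        os.rmdir(workspace)
```

This only covers discovering the UID; as noted above, there is currently no supported place to run such a step before the job starts on GitHub-hosted runners.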

@romulus-ai

Does anyone else share my security concerns about being forced to run all pipeline steps and actions as root in a container? It is a much higher security risk than using an unprivileged user that can only execute the handful of commands it needs.

@paneq

paneq commented Jun 20, 2023

Just brainstorming a bit here: would there be security implications if the runner were to set o+w for file commands

I like this idea from @Bo98

Match the UID of the host runner user (e.g. UID-1001 on ubuntu-latest today) to the UID of the user on the jobContainer

@fhammerl Is this guaranteed to be a long-term contract that we can rely on through subsequent actions upgrades? Or is it just a one-time workaround that is still an unsupported path?

Can you comment on the technical limitations for supporting container users?

@AngryMaciek

Is this issue progressing?
I have the same problem as mentioned in #1014: my test suite requires some tests to be run by a non-root user. On a self-hosted runner, the workaround of setting the container user to root means those tests will not work...

@yeikel

yeikel commented Aug 30, 2023

The user id changed between 2.294.0 and 2.303.0 for some reason

In 2.294.0 the user id is 1000 but in 2.303.0 it is 1001

417-72KI added a commit to 417-72KI/MockUserDefaults that referenced this issue Sep 16, 2023
pexcn added a commit to pexcn/docker-images that referenced this issue Oct 12, 2023
yorickpeterse added a commit to inko-lang/aur that referenced this issue Oct 30, 2023
Per actions/checkout#956 it seems GitHub
Actions straight up doesn't support running containers as non-root, so
maybe this will get things going.
sethrj added a commit to sethrj/celeritas that referenced this issue Nov 18, 2023
see actions/checkout#956

```
Post job cleanup.
/usr/bin/docker exec  8a93ca44215e080743610d80ccea33c624f5d3fce1aa4038cd018c06b13a75a6 sh -c "cat /etc/*release | grep ^ID"
node:internal/fs/utils:347
    throw err;
    ^

Error: EACCES: permission denied, open '/__w/_temp/_runner_file_commands/save_state_56877b46-0efb-448b-8122-6d4d35217f1e'
    at Object.openSync (node:fs:590:3)
    at Object.writeFileSync (node:fs:2202:35)
    at Object.appendFileSync (node:fs:2264:6)
    at Object.issueFileCommand (/__w/_actions/actions/checkout/v3/dist/index.js:2950:8)
    at Object.saveState (/__w/_actions/actions/checkout/v3/dist/index.js:2867:31)
    at Object.8647 (/__w/_actions/actions/checkout/v3/dist/index.js:2326:10)
    at __nccwpck_require__ (/__w/_actions/actions/checkout/v3/dist/index.js:18256:43)
    at Object.2565 (/__w/_actions/actions/checkout/v3/dist/index.js:146:34)
    at __nccwpck_require__ (/__w/_actions/actions/checkout/v3/dist/index.js:18256:43)
    at Object.9210 (/__w/_actions/actions/checkout/v3/dist/index.js:1141:36) {
  errno: -13,
  syscall: 'open',
  code: 'EACCES',
  path: '/__w/_temp/_runner_file_commands/save_state_56877b46-0efb-448b-8122-6d4d35217f1e'
}
```
github-merge-queue bot pushed a commit to celeritas-project/celeritas that referenced this issue Nov 20, 2023
* Don't default CMake presets to "debug"

* Rename CI presets

* Use CMake BUILD_TESTING to decide to add testing tree

This allows more granularity for building unit tests or just testing the apps

* First iteration of GHA CI

* Integrate style workflow

* Try again

* Fix things, thanks vscode

* Fix exclusion

* Disable clang-format, try different docker url

* REVERTME: single job

* Run as root

see actions/checkout#956


* Introspect

* Try a different shell

* Source the fucking profile perhaps since it might ignore entrypoint

* No source I guess

* use github env, run tests

* Try to fix environment

* Try again

* Fix (I think?) openmp variables for celer-sim app

* Disable GPU tests at configure time if CELER_DISABLE_DEVICE

* Automatically skip device tests if CELER_DISABLE_DEVICE

* Update versions

* Use std allocateor rather than pinned when no device is available

* Use MPI max numprocs when setting NP default

github runner only has 2

* Disable more tests when GPU unavailable

* fixup! Use std allocateor rather than pinned when no device is available

* Enable all jobs

* Support root disabling (and initialization) from error handler

* Disable device tests if device is disabled at runtime

* Fix thread count for celer-sim with ROOT

* REVERTME: disable all but one ROCM image

* Downgrade to checkout v3

* Fix rocm json and jenkins build names

* Fix root/shared options

* fixup! Disable device tests if device is disabled at runtime

* Fix json

* Dubious ownership

* Fix CELERITAS_TEST_VERBOSE

* Fix matrix and tag fetch

* Try fetch depth

* Try single fetch depth, and fix use of accel example

* Try more fetch depth

* Add ld flags

* Disable example builds that don't currently work

* Fix syntax errors

* Add vecgeom-reldb and fix asan diabling

* Disable reldeb example too

* Remove to-dos

* Exclude changes to rst/md in check

* REVERTME: don't disable device

* Revert "REVERTME: don't disable device"

This reverts commit fbdea2d.

* Update 'special' annotations

* Add conflict between hip+assertions and update CI matrix

* Remove profile source and update working dirs

* Update documentation

* Fix image selection and parallelism

* fixup! Fix image selection and parallelism

* Reverse ordering so fine-grained is first
@adminy

adminy commented Dec 20, 2023

##[debug]Running JavaScript Action with default external tool: node16
OCI runtime exec failed: exec failed: container_linux.go:380: starting container process caused: no such file or directory: unknown
##[debug]Node Action run completed with exit code 126

kenorb added a commit to EA31337/EA-Tester that referenced this issue Feb 10, 2024
thierrymarianne added a commit to atp-lptp/automated-theorem-proving-for-prolog-verification that referenced this issue Feb 15, 2024
thierrymarianne added a commit to atp-lptp/automated-theorem-proving-for-prolog-verification that referenced this issue Feb 15, 2024
Revised paths
Fixed reference format
Try fixing env var injection
Injecting GitHub secret into container
Added visual help
Injecting workspace path
Fixed path to workspace
Tried fixing permission issue
Added continuous integration status badge
Run tests continuously
Run as root user
Downgrading [checkout](actions/checkout#956)
Trying 1001 instead of root first
Upgraded checkout action
Added job names
Fixed test target
Removed user option
thierrymarianne added a commit to atp-lptp/automated-theorem-proving-for-prolog-verification that referenced this issue Feb 15, 2024
pre-release check
Commenting prover application temporarily for pre-release validation
Fixed LaTeX results making target in continuous integration workflow
Revised container image names
First attempt to produce results archive on pull-request
Declare dependency between release creation and Push docker image
Installing dependencies without interaction
Removed [environment variables setting from job container](https://docs.github.com/en/actions/using-workflows/workflow-commands-for-github-actions\#setting-an-environment-variable)
Write results to ./out directory
Applied custom sort after parsing results
Revised release workflow
Revised working directory
Revised archiving script
Removed left workspace occurrences
Added results to archive
Injecting missing release name into job container
Revise step name
Revise intermediate success rate filename display
Injecting missing release name into job container
Try to avoid [untrusted state](https://github.blog/changelog/2022-10-11-github-actions-deprecating-save-state-and-set-output-commands)
Replace slashes with dashes
Trying to replace `marvinpinto/action-automatic-releases@latest` with official action from GitHub
Revise package updates
Removed bash as default shell
Install dependencies on image build
Removed requirements details
Trying to restore container initialization
Revise step title
Revise paths
Commenting out prover application
Revised paths
Fixed reference format
Try fixing env var injection
Injecting GitHub secret into container
Added visual help
Injecting workspace path
Fixed path to workspace
Tried fixing permission issue
Added continuous integration status badge
Run tests continuously
Run as root user
Downgrading [checkout](actions/checkout#956)
Trying 1001 instead of root first
Upgraded checkout action
Added job names
Fixed test target
Removed user option
Align file owner uid / gid
Removed test detecting missing FR locale from container
thierrymarianne added a commit to atp-lptp/automated-theorem-proving-for-prolog-verification that referenced this issue Feb 15, 2024
Revised paths
Fixed reference format
Try fixing env var injection
Injecting GitHub secret into container
Added visual help
Injecting workspace path
Fixed path to workspace
Tried fixing permission issue
Added continuous integration status badge
Run tests continuously
Run as root user
Downgrading [checkout](actions/checkout#956)
Trying 1001 instead of root first
Upgraded checkout action
Added job names
Fixed test target
Removed user option
Align file owner uid / gid
Removed test detecting missing FR locale from container
geneerik added a commit to xcape-inc/s0ck3t that referenced this issue Feb 29, 2024
vstakhov added a commit to rspamd/rspamd that referenced this issue Mar 15, 2024
niklas-uhl added a commit to kamping-site/kamping that referenced this issue Mar 18, 2024
hegza added a commit to hegza/headsail-vp that referenced this issue Mar 18, 2024
hegza added a commit to hegza/headsail-vp that referenced this issue Mar 18, 2024
hegza added a commit to hegza/headsail-vp that referenced this issue Mar 18, 2024
hegza added a commit to hegza/headsail-vp that referenced this issue Mar 18, 2024
hegza added a commit to hegza/headsail-vp that referenced this issue Mar 18, 2024
@theory

theory commented Apr 16, 2024

The requirement that workflow Docker containers run as root forces contortions to work around it (e.g., pgcentralfoundation/pgrx#1652), and, to @romulus-ai's point, it seems like a potentially significant security issue. If there is some way to work around it, such as using a specific UID and/or a username like runner, it would be great to know.
