Using service containers #152

jenstroeger · 2022-07-07T11:13:12Z

I tried out the harden-runner action (based on this repo) with

with:
  egress-policy: audit

and it worked until I added a PostgreSQL service container to run a few tests. It looks like traffic to that container is blocked? I tried adding

allowed-endpoints: >
  localhost:5432  # `postgres:5432` doesn’t work either

but neither of these two worked. It’s a private organization and I don’t have the privileges to install the app to check the egress audit log.

I disabled the step, and all tests pass just fine. How do you recommend I proceed?

Much thanks!

Fich0Gl · 2022-07-07T11:52:33Z

Other variants of this are 127.0.0.1:5432 and 0.0.0.0:5432, though I don't know if they are worth trying. The best way to debug this would be to add the Harden Runner App and check whether the port is open or not.
I'll investigate this further.

varunsh-coder · 2022-07-07T14:00:02Z

Sorry to hear that the traffic to the service container is blocked. That is not expected. Both in audit and block mode, localhost traffic is not supposed to be blocked.

I looked at the documentation and see that it has examples of using a service container with a container element, e.g. container: node:10.18-jessie, and also without it. Can you please confirm whether you are using the container element or running directly on the runner machine? harden-runner is not supported when used with the container element (though it should not have blocked any traffic in that case).
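For reference, here is a minimal sketch of the two job shapes that question distinguishes, loosely modeled on the GitHub documentation's PostgreSQL service container examples. The job names `container-job` and `runner-job` and the `echo` steps are illustrative assumptions, not taken from the reporter's workflow.

```yaml
jobs:
  # Job runs inside a container (uses the `container` element).
  # The service is reached by its label as hostname; no port mapping is needed.
  container-job:
    runs-on: ubuntu-latest
    container: node:10.18-jessie
    services:
      postgres:
        image: postgres:14
        env:
          POSTGRES_PASSWORD: postgres
    steps:
      - run: echo "tests would connect to postgres:5432"

  # Job runs directly on the runner machine (no `container` element).
  # The service port is mapped to the host and reached via localhost.
  runner-job:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:14
        env:
          POSTGRES_PASSWORD: postgres
        ports:
          - 5432:5432
    steps:
      - run: echo "tests would connect to localhost:5432"
```

Per the comment above, only the second shape is supported by harden-runner.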

Also, when using it in a private repo, you will need to install the App. Else it cannot download the build log and correlate outbound traffic with each step. It only needs actions: read permission.

varunsh-coder · 2022-07-07T16:48:26Z

@h0x0er can you please try to repro this issue on a public repo? You can use the workflow from here: https://docs.github.com/en/actions/using-containerized-services/creating-postgresql-service-containers#running-jobs-directly-on-the-runner-machine

jenstroeger · 2022-07-07T20:51:24Z

@varunsh-coder the build job that fails looks something like this:

  build:
    name: Check Python ${{ matrix.python }} on ${{ matrix.os }}
    runs-on: ${{ matrix.os }}
    strategy:
      fail-fast: false
      matrix:
        os: [ubuntu-latest]  # Enable more later.
        python: ['3.9', '3.10']
    services:
      postgres:
        image: postgres:14
        env:
          POSTGRES_PASSWORD: postgres
          POSTGRES_DB: testdb
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
        ports:
        - 5432:5432
    steps:
    # Disable so tests succeed.
    - name: Harden Runner
      uses: step-security/harden-runner@248ae51c2e8cc9622ecf50685c8bf7150c6e8813
      with:
        egress-policy: audit
    #   allowed-endpoints: >
    #     postgres:5432 # PostgreSQL service container
    - name: Checkout
      uses: actions/checkout@d0651293c4a5a52e711f25b41b05b2212f385d28
    - name: Set up Python
      uses: actions/setup-python@d09bd5e6005b175076f227b13d9730d56e9dcfcb
      with:
        python-version: ${{ matrix.python }}
    - name: Install dependencies
      run: make setup
    - name: Run tests
      run: make test
      # The tests use SQLAlchemy as ORM, and connecting to the db fails.

The Action log shows the following error:

E       sqlalchemy.exc.OperationalError: (psycopg2.OperationalError) connection to server at "localhost" (::1), port 5432 failed: Connection refused
E       	Is the server running on that host and accepting TCP/IP connections?
E       connection to server at "localhost" (127.0.0.1), port 5432 failed: Connection refused
E       	Is the server running on that host and accepting TCP/IP connections?
E       
E       (Background on this error at: https://sqlalche.me/e/14/e3q8)

When I comment out the harden-runner step, all tests pass as expected.

h0x0er · 2022-07-12T09:01:31Z

@varunsh-coder I have completed my investigation. The error does occur because the docker daemon is restarted. To fix it, we just need to add the extra flag --restart always to the service options. Check here.

Check out this workflow.

@jenstroeger after applying the fix below, the workflow will run normally with harden-runner.

        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
          --restart always

jenstroeger · 2022-07-12T18:46:51Z

Thank you, @h0x0er.

One question: why would the PostgreSQL container stop running, thus warranting the automatic restart? I wasn’t able to find details on the health options but perhaps they’re insufficient if subsequent jobs take too long? I mean 5 retries on a 5s timeout is 25 seconds, not sure where exactly the interval comes in here 🤔

varunsh-coder · 2022-07-12T18:59:26Z

@jenstroeger I can answer your question.

The harden-runner GitHub Action installs an agent to monitor the build process. That agent runs a DNS proxy and additional monitoring on the Ubuntu VM. As part of that, the agent needs to restart the docker daemon. You can see the code below:

https://github.com/step-security/agent/blob/main/dnsconfig.go#L169

Normally, by the time all this happens in the pre harden-runner step, no containers have started and no workflow steps have run. But in this scenario, the PostgreSQL service container is started before the pre harden-runner step. So when the docker daemon is restarted, that container stops running and doesn't restart on its own.
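One way to observe this from inside a workflow is a diagnostic step placed after the Harden Runner step. This is only a sketch and not part of harden-runner: the step name is made up, and the `ancestor=postgres:14` filter assumes the service image from the workflow posted earlier in this thread.

```yaml
    - name: Show PostgreSQL container state  # hypothetical diagnostic step
      # Lists running and stopped containers created from the postgres:14 image.
      # A status beginning with "Exited" would suggest the container was
      # stopped and not brought back after the daemon restart.
      run: |
        docker ps --all --filter "ancestor=postgres:14" --format "{{.Names}} {{.Status}}"
```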

I hope this answers your question.

@h0x0er is trying to figure out whether, as part of restarting the docker daemon, we can restart all containers that were already running. If we cannot figure that out, we will need to document adding the --restart always argument in this scenario.

varunsh-coder · 2022-07-14T16:45:33Z

@h0x0er was able to figure out a way to restart already-running containers as part of the docker daemon restart. I will test the changes and release next week. After the new version is released, you will not need to add --restart always. It should just work as expected.

jenstroeger · 2022-07-14T21:16:13Z

@varunsh-coder thanks for the update! I’ll wait for the next release and then update on my end, and I’ll let you know whether it works.

varunsh-coder · 2022-08-12T17:39:33Z

This is fixed in the latest release v1.4.5 with tag dd2c410b088af7c0dc8046f3ac9a8f4148492a95.
You should not need any workaround for this to work.
We have also added an integration test for it. Here is an example workflow run: https://github.com/harden-runner-canary/postgres-testing/runs/7810960312?check_suite_focus=true#step:9:10 and insights URL: https://app.stepsecurity.io/github/harden-runner-canary/postgres-testing/actions/runs/2848365031
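Applied to the workflow reported earlier in this thread, the relevant part might look like the following sketch. It assumes the service definition is otherwise unchanged and only swaps in the commit hash quoted above.

```yaml
    services:
      postgres:
        image: postgres:14
        env:
          POSTGRES_PASSWORD: postgres
          POSTGRES_DB: testdb
        # No `--restart always` here; per the comment above it should no longer be needed.
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
        ports:
        - 5432:5432
    steps:
    - name: Harden Runner
      uses: step-security/harden-runner@dd2c410b088af7c0dc8046f3ac9a8f4148492a95  # v1.4.5
      with:
        egress-policy: audit
```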

jenstroeger · 2022-09-13T10:46:36Z

You should not need any workaround for this to work.

Confirming that the change works.

h0x0er mentioned this issue Jul 14, 2022

Reload docker daemon instead of restart step-security/agent#266

Merged

swissspidy mentioned this issue Jul 20, 2022

CI: GitHub Actions Security Hardening GoogleForCreators/web-stories-wp#11962

Merged

varunsh-coder mentioned this issue Aug 12, 2022

Release v1.4.5 #156

Merged

2 tasks

varunsh-coder closed this as completed in #156 Aug 12, 2022
