New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add option to build&push image on PR via label #604
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice PR with a demo recording and nice that you prepared the alternative approach 👌
Honestly, I'm not a fan of the idea to run the job on label as it requires manual action 🙂 I know we can also automate adding such label, but if we can prevent repushing images without needing reviewer to spot some security issues, without manual action, then why wouldn't we want to go for it? (I mean the alternative approach).
My biggest concern, apart from having the manual action, is that the reviewer is responsible for making sure the PR can be safely built (he/she needs to spot any overrides in Bash scripts, Makefile etc.). With the alternative approach, there's simply no option to mess up with the Packages as pushing is isolated from user.
You wrote the alternative approach is more complex (it saves Docker image and adds another job which pushes the image), but from the projects we worked on in the past, I think this wasn't the problematic part which caused some issues. The most problematic thing was the pull_request_target
event (sometimes unexpected) behavior and that's something we still have here in this approach. Also, this is quite questionable whether "saving Docker image as an artifact and pushing it in second job" is more complex than "automatically adding label for org memebers/trusted external contributors" (as we would need to have it to say the solutions are on par).
Also, you said on the demo recording, that sometimes the image build is not necessary, so with that label we can avoid builds when doing e.g. documentation. That's true, but we can also ignore this job for specific paths (e.g. all MD files) automatically in GH Actions job configuration.
So personally I would pick the alternative approach as it can be even simpler in a long run. I would like to have as many things automated as possible, and currently I don't see the advantages of the building images on label (I would say otherwise, now I see even more benefits of the alternative approach). But of course I'm open for further discussion 🙂
Hi!
yes, not only based on code, but also based on the flow. In the
yes, and that's also a good reminder, because even if we will apply the alternative solution, we still need to pay attention to it. It's really important to do not remember about that, if you will approve a PR that where you didn't "(..)spot any overrides in Bash scripts, Makefile" then we are in the same bad position. Hiding that, and forgetting about it, is not a save option too.
yes, but from our experience, we also know that it's sometimes hard to keep the list of files up to date. I will finish the alternative solution and will test that on my fork. I will keep you posted. |
I tested the alternative solution and here is a working version: #611 Here is GitHub job execution: https://github.com/mszostok/botkube/runs/6714112689?check_suite_focus=true I'm still voting for the first option, as described in the comment above. After testing the alternative solution, I'm also worried about the artifact size. It's 272 MB, see: https://github.com/mszostok/botkube/actions/runs/2429967388 I added - name: Delete Docker image artifact
uses: geekyeggo/delete-artifact@v1
if: always()
with:
name: ${{ matrix.APP }}-${{github.sha}} I'm not able to check what's the limit here, but it looks like 500 MB: https://docs.github.com/en/billing/managing-billing-for-github-actions/about-billing-for-github-actions so it means that no more than 2 concurrent builds cannot be scheduled. It could become problematic in the long run. |
We cannot merge something without review, that's obvious. I was referring to one huge (at least for me) difference: In the alternative approach we can run this job safely before review (which can usually save time spent on review as there might be some build issues which contributor can fix before the review). With the label-based approach we need to review PR every time before running build, which increases time for review. Correct? When it comes to the artifact size, I think we can think about building multiarch image only on I'm sorry, but personally I still don't see any advantages in the label-based approach over the alternative solution. We can discuss it offline though. I would be happy to review the PR with alternative approach. BTW |
Nope, it's only about image. Build is triggered anyway, see: https://github.com/infracloudio/botkube/blob/7735b836e877858e7054caf027055ff421d2c627/.github/workflows/ci.yml#L42-L56
That will be yet another difference in the setup, which personally, I don't like to introduce.
as already mentioned:
yes, and you also don't want to run a docker image on your computer if you don't trust it, right? So the image is not needed before review, it's always a good practice to review the code and later execute it on your computer 🙂 and in both places you have manual action, enabling PR image build on demand via putting a label is a manual action, same as adding approve. In k8s repos, they are also based on manual action If the second approach doesn't require |
Yes, it is just about building image, but as I wrote before, we can detect some image build issues before, which saves time before actual review. I didn't say anything about running such image on my local machine before review. Moreover, at later point it would be great to have Helm chart installed on some local cluster (e.g. k3d) on CI and do some true E2E tests (e.g. with Mattermost/later with RocketChat), to unload maintainers. Which would make the E2E job depending on building image, which would require manual action.
I already responded why your proposed solution is on par when it comes to complexity, or even more complex (if you want to do automatic labeling for org members and external contributors, and also automatic unlabeling once job is ran or canceled). And we have that already working. BTW even if we went with such option, we would need such automation right now, as I can't imagine that every maintainer is forced to label PR and wait for image build and unlabel PR when a contributor pushes changes 😄
This is totally untrue, as an automatically ran job in the alternative solution cannot override any other Docker images. In your labeled-based approach it can, so your labeled-based approach hides a true security risk. People are always the weakest points when it comes to security. That manual action repeated many times over time won't prevent anything - it will end in the opposite. So the alternative approach brings an additional layer of security, which I am voting for. BTW you need to unlabel the PR from external contributor once the image is built, right? To avoid repushing images when a contributor pushes changes to the branch? This is even more error-prone than I thought.
And that's first valid point which resonates with me, however firstly you would need to confirm the limit for open source and if this is a limit per job or not. We need to see why this is so big (even if this is multiarch), and even if we split some components to multiple binaries, I don't think we will split Docker images, maybe we will do something like istiod - let's see. So I wouldn't treat that as a blocker as we need to see how BotKube evolves and how many optimizations we can do first. I'm still voting for a better security and more automation - that is the alternative approach. |
To sum up, I see pros and cons in both approaches, and I'm OK-ish with both of them. As discussed f2f with Paweł, we would like to ask you @PrasadG193 to check both approaches and let us know about your thoughts 👍 Maybe you know some alternative approach? |
Close in favor of #611 |
##### ISSUE TYPE - Feature Pull Request ##### SUMMARY Add an option to push the BotKube image automatically on PR. It's alternative approach for #604. This PR will solve the problem with manual PR builds, e.g. we had that issue here: - #601 - #593 - #582 - #583 Example run: https://github.com/mszostok/botkube/runs/6714112689?check_suite_focus=true Fixes #590 To ensure that secrets won't be available for untrusted code, first we need to build the image and share it with the second job, which doesn't check out the untrusted code and can safely push an artifact to ghcr.io. The flow is as follows: ``` Job1: image build -> image save -> artifact upload Job2: artifact download -> image load -> image push ``` Job1—runs untrusted code but without write repo perms Job2—push built image with package write perms #### Security This article describes it well: https://securitylab.github.com/research/github-actions-preventing-pwn-requests/
ISSUE TYPE
SUMMARY
Add an option to push the BotKube image automatically on PR. Because of the security reasons, it's enabled only if the
build-pr-image
label is added—since external users do not have the permission to assign labels, this effectively requires repository owners to manually review the changes first.Here is recorded demo https://www.loom.com/share/2f0fdc621a144d32861d83946bfff1d7 and repository on which I tested that change: https://github.com/mszostok/botkube/pulls
This PR will solve the problem with manual PR builds, e.g. we had that issue here:
golangci-lint
and resolve all found issues #593Fixes #590
After merge
A new label
build-pr-image
needs to be created.Further improvements
build-pr-image
automatically, for all trusted members (org collaborators). As a result, we don't need to remember about that.Alternative solution
The alternative doesn't require manual action. However, it's more complicated.
See: https://github.com/infracloudio/botkube/compare/develop...mszostok:build-images-on-prs-v2?expand=1
To ensure that secrets won't be available for untrusted code, first we need to build the image and share it with the second job, which doesn't check out the untrusted code and can safely push an artifact to ghcr.io.
Security
This article describes it well: https://securitylab.github.com/research/github-actions-preventing-pwn-requests/