fix: Process webhook refresh in background to not block the request (#14269) #18173

Open
wants to merge 1 commit into
base: master

dhruvang1
Contributor

@dhruvang1 dhruvang1 commented May 12, 2024

This PR moves webhook processing for both Application and ApplicationSet into a goroutine. The server can then return HTTP 200 to the webhook request immediately, in line with the quick-response guidelines set by GitHub, GitLab, etc.

Checklist:

  • Either (a) I've created an enhancement proposal and discussed it with the community, (b) this is a bug fix, or (c) this does not need to be in the release notes.
  • The title of the PR states what changed and the related issues number (used for the release note).
  • The title of the PR conforms to the Toolchain Guide
  • I've included "Closes [ISSUE #]" or "Fixes [ISSUE #]" in the description to automatically close the associated issue.
  • I've updated both the CLI and UI to expose my feature, or I plan to submit a second PR with them.
  • Does this PR require documentation updates?
  • I've updated documentation as required by this PR.
  • I have signed off all my commits as required by DCO
  • I have written unit and/or e2e tests for my change. PRs without these are unlikely to be merged.
  • My build is green (troubleshooting builds).
  • My new feature complies with the feature status guidelines.
  • I have added a brief description of why this PR is necessary and/or what this PR solves.
  • Optional. My organization is added to USERS.md.
  • Optional. For bug fixes, I've indicated what older releases this fix should be cherry-picked into (this may or may not happen depending on risk/complexity).

Fixes #14269

@dhruvang1 dhruvang1 requested a review from a team as a code owner May 12, 2024 01:10
@dhruvang1 dhruvang1 force-pushed the process-webhook-in-background branch from 0db36bd to 81d564b Compare May 12, 2024 17:07
…rgoproj#14269)

Signed-off-by: dhruvang1 <dhruvang1@users.noreply.github.com>
Comment on lines +183 to +187
h.Add(1)
go func() {
	defer h.Done()
	h.HandleEvent(payload)
}()
Member

This endpoint can be spammed, so we should bound the number of goroutines with a user-configurable limit. We should follow a worker pool model here instead of spawning one goroutine per request.

Contributor Author

This is a problem with the existing implementation as well. This change simply lets the goroutine handling the request finish and creates a new one, instead of continuing in the request goroutine. IMO, this change is no worse at handling spam than the existing implementation, since Go can create thousands of goroutines without breaking a sweat. Under spam we are more likely to hit OOM, because each request independently pulls all Apps/AppSets into memory, than to notice any problem with goroutines.

Member

That's a good point. However, the number of simultaneous connections the webhook server can hold is restricted by the number of sockets/fds available to the container, which is bounded and, as far as I know, configurable if required. With this change the active connection is short-lived, but the background job it spawned keeps consuming resources behind the scenes, so we should have an option to limit the number of active goroutines. The limit can be hundreds of thousands, but that depends on the user's actual usage and the resources they are willing to give the controller.

To better handle spam, we could probably dedup requests that arrive in quick succession.

Development

Successfully merging this pull request may close these issues.

Too many applications causing webhook to timeout
2 participants