
✨ webhook: add an option to recover from panics in handler #1900

Merged

Conversation

isitinschi
Contributor

Currently, a panic in a webhook handler is not recovered and crashes the webhook server.

This change adds an option to Webhook to recover from panics, similar to how they are handled in Reconciler. It ensures that panics are converted into a normal error response.
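
The behaviour described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the controller-runtime code: `Request`, `Response`, `Handler`, and `Webhook` are simplified stand-ins for the admission types, and the 500 status code is an assumption made for the example.

```go
package main

import (
	"fmt"
	"net/http"
)

// Request and Response are simplified, hypothetical stand-ins for the
// admission request/response types; they are not the real API.
type Request struct{ Name string }

type Response struct {
	Allowed bool
	Code    int
	Message string
}

// Handler processes a request and may panic.
type Handler func(Request) Response

// Webhook wraps a Handler with optional panic recovery.
type Webhook struct {
	Handler      Handler
	RecoverPanic bool
}

// Handle calls the wrapped handler. When RecoverPanic is set, a panic is
// turned into an ordinary error response instead of propagating.
func (wh *Webhook) Handle(req Request) (resp Response) {
	if wh.RecoverPanic {
		defer func() {
			if r := recover(); r != nil {
				resp = Response{
					Allowed: false,
					Code:    http.StatusInternalServerError, // assumed status code
					Message: fmt.Sprintf("panic: %v [recovered]", r),
				}
			}
		}()
	}
	return wh.Handler(req)
}

func main() {
	wh := &Webhook{
		Handler:      func(Request) Response { panic("boom") },
		RecoverPanic: true,
	}
	resp := wh.Handle(Request{Name: "example"})
	fmt.Println(resp.Code, resp.Message)
}
```

With `RecoverPanic` left at its zero value, no deferred function is installed and the panic propagates to the caller unchanged.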

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label May 13, 2022
@k8s-ci-robot
Contributor

Hi @isitinschi. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label May 13, 2022
@k8s-ci-robot k8s-ci-robot requested a review from droot May 13, 2022 18:26
@k8s-ci-robot k8s-ci-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label May 13, 2022
@k8s-ci-robot k8s-ci-robot requested a review from gerred May 13, 2022 18:26
@isitinschi
Contributor Author

@vincepri WDYT?

@christopherhein
Member

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels May 13, 2022
"net/http"

utilruntime "k8s.io/apimachinery/pkg/util/runtime"
Contributor

Please sort the imports

@@ -121,6 +124,9 @@ type Webhook struct {
	// and potentially patches to apply to the handler.
	Handler Handler

	// RecoverPanic indicates whether the panic caused by webhook should be recovered.
	RecoverPanic bool
Contributor

It would be better if we also supported setting it in pkg/builder/webhook.go.

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jun 30, 2022
@isitinschi
Copy link
Contributor Author

/test pull-controller-runtime-test-master

@isitinschi
Copy link
Contributor Author

@FillZpp thank you for the review. I've addressed all your suggestions. PTAL

@@ -68,6 +69,12 @@ func (blder *WebhookBuilder) WithValidator(validator admission.CustomValidator)
	return blder
}

// WithRecoverPanic takes a bool flag which indicates whether the panic caused by webhook should be recovered.
func (blder *WebhookBuilder) WithRecoverPanic(recoverPanic bool) *WebhookBuilder {
Contributor

How about defining it as func (blder *WebhookBuilder) RecoverPanic() *WebhookBuilder, so that recoverPanic is set to true whenever it is called?

@@ -33,9 +33,10 @@ type Defaulter interface {
 }

 // DefaultingWebhookFor creates a new Webhook for Defaulting the provided type.
-func DefaultingWebhookFor(defaulter Defaulter) *Webhook {
+func DefaultingWebhookFor(defaulter Defaulter, recoverPanic bool) *Webhook {
Contributor

I have a slight concern about these public API changes; I'm not sure whether some people use these functions directly.

So how about a new func (wh *Webhook) WithRecoverPanic(enable bool) *Webhook, which you can then use in builder/webhook like this: return admission.ValidatingWebhookFor(validator).WithRecoverPanic(blder.recoverPanic)
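
To make the suggestion concrete, here is a small sketch of the chained-setter shape. It assumes heavily simplified types: the `Webhook`, `WebhookBuilder`, and `build` definitions below are hypothetical stand-ins, and `ValidatingWebhookFor` takes an empty interface here only so the example compiles on its own.

```go
package main

import "fmt"

// Webhook is a simplified stand-in for the admission webhook type.
type Webhook struct {
	RecoverPanic bool
}

// ValidatingWebhookFor keeps a one-argument signature, so existing callers
// would be unaffected. The validator argument is a placeholder here.
func ValidatingWebhookFor(validator interface{}) *Webhook {
	return &Webhook{}
}

// WithRecoverPanic sets the flag and returns the receiver so that calls
// can be chained.
func (wh *Webhook) WithRecoverPanic(enable bool) *Webhook {
	wh.RecoverPanic = enable
	return wh
}

// WebhookBuilder is a hypothetical, trimmed-down builder.
type WebhookBuilder struct {
	validator    interface{}
	recoverPanic bool
}

// RecoverPanic enables recovery; calling it at all means "true".
func (blder *WebhookBuilder) RecoverPanic() *WebhookBuilder {
	blder.recoverPanic = true
	return blder
}

// build forwards the builder's setting through the chained setter.
func (blder *WebhookBuilder) build() *Webhook {
	return ValidatingWebhookFor(blder.validator).WithRecoverPanic(blder.recoverPanic)
}

func main() {
	wh := (&WebhookBuilder{}).RecoverPanic().build()
	fmt.Println(wh.RecoverPanic)
}
```

The point of this shape is that the constructor signatures stay as they were, so the new option is purely additive for anyone calling them directly.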

@isitinschi isitinschi force-pushed the webhooks-recover-panics branch 2 times, most recently from 85eab91 to d52d0fe Compare July 1, 2022 10:56
@isitinschi
Contributor Author

@FillZpp is it better like this?

@FillZpp
Contributor

FillZpp commented Jul 1, 2022

/lgtm

/cc @alvaroaleman

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 1, 2022
@isitinschi
Contributor Author

@alvaroaleman Does it look good to you to be merged?

func (wh *Webhook) Handle(ctx context.Context, req Request) (response Response) {
	defer func() {
		if r := recover(); r != nil {
			if wh.RecoverPanic {
Member

Shouldn't this if condition be before the recover()?

Contributor Author

@isitinschi isitinschi Jul 8, 2022

We won't be able to process the panic if we don't recover from it first. Panic recovery works the same way in controller.go:

// Reconcile implements reconcile.Reconciler.
func (c *Controller) Reconcile(ctx context.Context, req reconcile.Request) (_ reconcile.Result, err error) {
	defer func() {
		if r := recover(); r != nil {
			if c.RecoverPanic {
				for _, fn := range utilruntime.PanicHandlers {
					fn(r)
				}
				err = fmt.Errorf("panic: %v [recovered]", r)
				return
			}
			log := logf.FromContext(ctx)
			log.Info(fmt.Sprintf("Observed a panic in reconciler: %v", r))
			panic(r)
		}
	}()
	return c.Do.Reconcile(ctx, req)
}

Member

But we don't want to process it when RecoverPanic is false? What is done in controller.go seems quite pointless to me and IIRC the r won't contain the stacktrace?
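
On the stack trace point: the value returned by `recover()` is only whatever was passed to `panic`, so it carries no stack information by itself. A deferred function can still record the stack with `runtime/debug.Stack()`, because deferred calls run before the panicking goroutine's frames are unwound. A minimal sketch, where `capture` and `explode` are hypothetical helper names used only for this illustration:

```go
package main

import (
	"fmt"
	"runtime/debug"
	"strings"
)

// explode is a hypothetical function that always panics.
func explode() {
	panic("boom")
}

// capture runs fn and, if it panics, returns the panic value together with
// the stack recorded inside the deferred function.
func capture(fn func()) (value interface{}, stack string) {
	defer func() {
		if r := recover(); r != nil {
			value = r
			stack = string(debug.Stack())
		}
	}()
	fn()
	return nil, ""
}

func main() {
	value, stack := capture(explode)
	fmt.Println("panic value:", value)
	fmt.Println("stack mentions explode:", strings.Contains(stack, "explode"))
}
```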

Contributor Author

Makes sense. Let's make the RecoverPanic flag stricter, so that nothing is processed at all when it is set to false. I've adjusted my PR accordingly 👍

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 13, 2022
@isitinschi
Contributor Author

@vincepri @alvaroaleman does it look good to you?

Comment on lines 157 to 158
defer func() {
if wh.RecoverPanic {
Member

Suggested change
-	defer func() {
-		if wh.RecoverPanic {
+	if wh.RecoverPanic {
+		defer func() {

Contributor Author

Done ✅

@alvaroaleman
Member

Thank you!
/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 15, 2022
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alvaroaleman, isitinschi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 15, 2022
@k8s-ci-robot k8s-ci-robot merged commit 88234a8 into kubernetes-sigs:master Jul 15, 2022
@k8s-ci-robot k8s-ci-robot added this to the v0.10.x milestone Jul 15, 2022
@isitinschi isitinschi deleted the webhooks-recover-panics branch July 18, 2022 08:28
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

6 participants