Controller does not start due to webhook race condition #156

Closed
jmendesky opened this issue Aug 31, 2022 · 0 comments · Fixed by #157

jmendesky commented Aug 31, 2022

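controller-runtime version 0.8.3 has a race condition: watchers are set up before the webhooks are registered. This was reported in kubernetes-sigs/controller-runtime#1685 and fixed in kubernetes-sigs/controller-runtime#1690 for 0.10.3 and in kubernetes-sigs/controller-runtime@612e9b2 for 0.11.0. In our case the watch on RunConfiguration needs the conversion webhook, which is not yet accepting connections, so the list call fails with "connection refused".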
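
To illustrate the ordering problem in isolation, here is a minimal sketch that uses only the Go standard library. It is not the controller-runtime fix and does not use controller-runtime APIs; `startListener`, `waitForListener`, `readyTimeout` and `shortTimeout` are hypothetical names chosen for this example. The listener stands in for the webhook server, and the dial stands in for a watch whose list call has to go through the conversion webhook.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// Hypothetical timeouts for this sketch only.
const (
	readyTimeout = 2 * time.Second
	shortTimeout = 200 * time.Millisecond
)

// startListener stands in for the webhook server: it listens on a free
// local port and accepts (and immediately closes) connections until stopped.
func startListener() (string, func(), error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", nil, err
	}
	go func() {
		for {
			conn, err := ln.Accept()
			if err != nil {
				return // listener was closed
			}
			_ = conn.Close()
		}
	}()
	return ln.Addr().String(), func() { _ = ln.Close() }, nil
}

// waitForListener polls addr until a TCP connection succeeds or the
// timeout elapses. Dialing a port that nobody listens on yet fails,
// which is the same class of error as the "connection refused" in the logs.
func waitForListener(addr string, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for {
		conn, err := net.DialTimeout("tcp", addr, 100*time.Millisecond)
		if err == nil {
			_ = conn.Close()
			return nil
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("listener at %s not ready after %s: %w", addr, timeout, err)
		}
		time.Sleep(20 * time.Millisecond)
	}
}

func main() {
	addr, stop, err := startListener()
	if err != nil {
		panic(err)
	}
	defer stop()

	// Only start the dependent work once the listener accepts connections.
	if err := waitForListener(addr, readyTimeout); err != nil {
		panic(err)
	}
	fmt.Println("listener ready; dependent watches can start")
}
```

The point of the sketch is the ordering: the component that serves requests must be accepting connections before anything that depends on it starts. Upgrading controller-runtime to a release that contains the linked fix is the actual remedy.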

Application Logs:

{"level":"info","ts":1661943570.7194216,"logger":"controller-runtime.metrics","msg":"metrics server is starting to listen","addr":":8080"}
{"level":"info","ts":1661943570.7199278,"logger":"controller-runtime.builder","msg":"skip registering a mutating webhook, admission.Defaulter interface is not implemented","GVK":"pipelines.kubeflow.org/v1alpha3, Kind=Pipeline"}
{"level":"info","ts":1661943570.7199678,"logger":"controller-runtime.builder","msg":"skip registering a validating webhook, admission.Validator interface is not implemented","GVK":"pipelines.kubeflow.org/v1alpha3, Kind=Pipeline"}
{"level":"info","ts":1661943570.7201693,"logger":"controller-runtime.webhook","msg":"registering webhook","path":"/convert"}
{"level":"info","ts":1661943570.7202249,"logger":"controller-runtime.builder","msg":"conversion webhook enabled","object":{"name":""}}
{"level":"info","ts":1661943570.720253,"logger":"controller-runtime.builder","msg":"skip registering a mutating webhook, admission.Defaulter interface is not implemented","GVK":"pipelines.kubeflow.org/v1alpha3, Kind=RunConfiguration"}
{"level":"info","ts":1661943570.7202594,"logger":"controller-runtime.builder","msg":"skip registering a validating webhook, admission.Validator interface is not implemented","GVK":"pipelines.kubeflow.org/v1alpha3, Kind=RunConfiguration"}
{"level":"info","ts":1661943570.720288,"logger":"controller-runtime.builder","msg":"conversion webhook enabled","object":{"name":""}}
{"level":"info","ts":1661943570.7203038,"logger":"controller-runtime.builder","msg":"skip registering a mutating webhook, admission.Defaulter interface is not implemented","GVK":"pipelines.kubeflow.org/v1alpha3, Kind=Experiment"}
{"level":"info","ts":1661943570.7203088,"logger":"controller-runtime.builder","msg":"skip registering a validating webhook, admission.Validator interface is not implemented","GVK":"pipelines.kubeflow.org/v1alpha3, Kind=Experiment"}
{"level":"info","ts":1661943570.720331,"logger":"controller-runtime.builder","msg":"conversion webhook enabled","object":{"name":""}}
{"level":"info","ts":1661943570.720806,"logger":"setup","msg":"starting manager"}
I0831 10:59:30.721184       1 leaderelection.go:243] attempting to acquire leader lease kfp-operator-system/kfp-operator-lock...
{"level":"info","ts":1661943570.721256,"logger":"controller-runtime.manager","msg":"starting metrics server","path":"/metrics"}
E0831 10:59:30.729849       1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1alpha3.RunConfiguration: failed to list *v1alpha3.RunConfiguration: conversion webhook for pipelines.kubeflow.org/v1alpha2, Kind=RunConfiguration failed: Post "https://kfp-operator-webhook-service.kfp-operator-system.svc:443/convert?timeout=30s": dial tcp 100.84.0.245:9443: connect: connection refused
...

After killing the pod, the following logs are produced, suggesting that a lock has been released:

{"level":"info","ts":1661944143.2915382,"logger":"controller-runtime.webhook.webhooks","msg":"starting webhook server"}
{"level":"info","ts":1661944143.2915914,"logger":"controller-runtime.manager.controller.experiment","msg":"Starting EventSource","reconciler group":"pipelines.kubeflow.org","reconciler kind":"Experiment","source":"kind source: /, Kind="}
{"level":"info","ts":1661944143.291691,"logger":"controller-runtime.manager.controller.runconfiguration","msg":"Starting EventSource","reconciler group":"pipelines.kubeflow.org","reconciler kind":"RunConfiguration","source":"kind source: /, Kind="}
{"level":"info","ts":1661944143.2918708,"logger":"controller-runtime.manager.controller.pipeline","msg":"Starting EventSource","reconciler group":"pipelines.kubeflow.org","reconciler kind":"Pipeline","source":"kind source: /, Kind="}
{"level":"info","ts":1661944143.2920043,"logger":"controller-runtime.certwatcher","msg":"Updated current TLS certificate"}
{"level":"error","ts":1661944143.2918658,"logger":"controller-runtime.manager","msg":"error received after stop sequence was engaged","error":"Timeout: failed waiting for *v1alpha3.Experiment Informer to sync","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/pkg/mod/github.com/go-logr/zapr@v0.2.0/zapr.go:132\nsigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).engageStopProcedure.func1\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.8.3/pkg/manager/internal.go:530"}
{"level":"error","ts":1661944143.2920704,"logger":"controller-runtime.manager","msg":"error received after stop sequence was engaged","error":"Timeout: failed waiting for *v1alpha3.RunConfiguration Informer to sync","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/pkg/mod/github.com/go-logr/zapr@v0.2.0/zapr.go:132\nsigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).engageStopProcedure.func1\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.8.3/pkg/manager/internal.go:530"}
{"level":"error","ts":1661944143.292142,"logger":"controller-runtime.manager","msg":"error received after stop sequence was engaged","error":"Timeout: failed waiting for *v1alpha3.Pipeline Informer to sync","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/pkg/mod/github.com/go-logr/zapr@v0.2.0/zapr.go:132\nsigs.k8s.io/controller-runtime/pkg/manager.(*controllerManager).engageStopProcedure.func1\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.8.3/pkg/manager/internal.go:530"}
{"level":"info","ts":1661944143.2921882,"logger":"controller-runtime.webhook","msg":"serving webhook server","host":"","port":9443}
{"level":"info","ts":1661944143.2922742,"logger":"controller-runtime.certwatcher","msg":"Starting certificate watcher"}
{"level":"info","ts":1661944143.2922952,"logger":"controller-runtime.webhook","msg":"shutting down webhook server"}