Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RevisionFailed when ServiceAccount references a non-existing imagePullSecret #10575

Closed
maschmid opened this issue Jan 18, 2021 · 15 comments · Fixed by #13701
Closed

RevisionFailed when ServiceAccount references a non-existing imagePullSecret #10575

maschmid opened this issue Jan 18, 2021 · 15 comments · Fixed by #13701
Assignees
Labels
area/API API objects and controllers good first issue Denotes an issue ready for a new contributor, according to the "help wanted" guidelines. help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/bug Categorizes issue or PR as related to a bug. kind/enhancement lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. triage/accepted Issues which should be fixed (post-triage)
Milestone

Comments

@maschmid
Copy link
Contributor

Having a default ServiceAccount in a namespace referencing an imagePullSecrets that doesn't exist anymore in the namespace, creating a ksvc gets its Revision failed:

Revision "helloworld-go-00001" failed with message: Unable to fetch image "gcr.io/knative-samples/helloworld-go": failed to resolve image to digest: failed to initialize authentication: secrets "non-existing-pull-secret" not found.

Deploying the same image as an ordinary pod works fine with the same ServiceAccount.

What version of Knative?

0.20.0

Expected Behavior

Creating a Ksvc should work even if some of the image pull secrets don't exist, if they are not actually needed to pull the image.

Actual Behavior

A Revision fails.

Steps to Reproduce the Problem

Modify the default SA in a namespace to include a non-existing imagePullSecret:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: default
  namespace: default
secrets:
- name: default-token-6trqg
imagePullSecrets:
- name: non-existing-pull-secret

Create a Ksvc:

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: helloworld-go
spec:
  template:
    spec:
      containers:
      - image: gcr.io/knative-samples/helloworld-go

Notice the ksvc revision fails:

Revision "helloworld-go-00001" failed with message: Unable to fetch image "gcr.io/knative-samples/helloworld-go": failed to resolve image to digest: failed to initialize authentication: secrets "non-existing-pull-secret" not found.

Such behaviour seems inconsistent with how k8s works, as the same image with the same ServiceAccount works fine as an ordinary Pod:

apiVersion: v1
kind: Pod
metadata:
  name: hello
spec:
  containers:
  - image: gcr.io/knative-samples/helloworld-go
    name: hello
@maschmid maschmid added the kind/bug Categorizes issue or PR as related to a bug. label Jan 18, 2021
@dprotaso dprotaso added this to the Needs Triage milestone Jan 18, 2021
@dprotaso
Copy link
Member

dprotaso commented Jan 18, 2021

/area API

Such behaviour seems inconsistent with how k8s works, as the same image with the same ServiceAccount works fine as an ordinary Pod:

I'm sort of in-different to this statement. Although we differ with K8s, their behaviour may not be intentional/defined. You could argue Knative's fail-fast helps more than in hinders.

@knative-prow-robot knative-prow-robot added the area/API API objects and controllers label Jan 18, 2021
@markusthoemmes
Copy link
Contributor

Hmm, I tend to agree with @dprotaso here. Things might go breaking in unexpected ways using this I think, as some images will work and some won't 🤔. Not a strong opinion either though.

@evankanderson
Copy link
Member

Would it make sense to handle "missing imagePullSecret" as equivalent to "imagePullSecret not specified"?

/kind enhancement
/good-first-issue
/triage needs-user-input

@knative-prow-robot
Copy link
Contributor

@evankanderson:
This request has been marked as suitable for new contributors.

Please ensure the request meets the requirements listed here.

If this request no longer meets these requirements, the label can be removed
by commenting with the /remove-good-first-issue command.

In response to this:

Would it make sense to handle "missing imagePullSecret" as equivalent to "imagePullSecret not specified"?

/kind enhancement
/good-first-issue
/triage needs-user-input

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@knative-prow-robot knative-prow-robot added kind/enhancement triage/needs-user-input Issues which are waiting on a response from the reporter good first issue Denotes an issue ready for a new contributor, according to the "help wanted" guidelines. help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. labels Mar 16, 2021
@dprotaso
Copy link
Member

Going to close this out as I haven't seen a compelling enough reason to change the behaviour

@SeanEmac
Copy link

We encountered this problem recently in production. We had a service account with 2 pull secrets, one for the dev registry the other for the prod registry. Both exist in our dev clusters but only the prod one exists in production... you can guess what happened next.

Our fault for not testing properly, but it would be nice to have the flexibility of defining secrets that might/might not exist. Other teams using the same clusters don't see an issue because as mentioned an ordinary pod does not behave this way.

@dprotaso dprotaso removed this from the Needs Triage milestone Jun 16, 2022
@dprotaso
Copy link
Member

This surfaced again with a user using the following tooling: http://external-secrets.io/v0.5.6/

From slack: https://knative.slack.com/archives/C0186KU7STW/p1655389675950159

I have an image that uses imagePullSecrets and I have an ExternalSecret that pulls it from an (azure) keyvault entry, like this:

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: frontend
spec:
  template:
    spec:
      containers:
      - name: frontend
        image: frontend:v1
      imagePullSecrets:
      - name: regcred
--- 
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: secretstore-keyvault-secret-regcred
  namespace: rental
spec:
  refreshInterval: 1h
  secretStoreRef:
    kind: SecretStore
    name: secretstore-keyvault
  target:
    name: regcred
    template:
      type: kubernetes.io/dockerconfigjson
    creationPolicy: Owner
  data:
  - secretKey: .dockerconfigjson
    remoteRef:
      key: DockerRegCred

Problem is, when first loading this, the secret has not been loaded yet at the time it's trying to pull the revision so it doesn't exist. Is there a way to say "retry after secret is available" for images? (This is only a problem for the "initial setup", when pushing new revisions it's perfectly fine).

@dprotaso dprotaso reopened this Jun 16, 2022
@dprotaso
Copy link
Member

/lifecycle frozen
/triage accepted

I believe this will require changes to k8schain https://github.com/google/go-containerregistry/tree/main/pkg/authn/k8schain

And we don't want to tight loop reconciliation if the secret isn't present - so we probably want an exponential fallback until the revision progress deadline passes

@knative-prow knative-prow bot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. triage/accepted Issues which should be fixed (post-triage) labels Jun 16, 2022
@dprotaso dprotaso added this to the Icebox milestone Jun 16, 2022
@dprotaso dprotaso removed the triage/needs-user-input Issues which are waiting on a response from the reporter label Jun 16, 2022
@evankanderson
Copy link
Member

I think there may be two different issues here:

  1. If an imagePullSecret is missing from the service account, but the remaining secrets are sufficient to resolve the digest, a polite warning / info condition about missing secrets seems sufficient.
  2. If an imagePullSecret is missing from the service account and we're unable to resolve the tag to a digest, we may want to enter a long-term "awaiting" condition rather than timing out more aggressively.

@carlokok
Copy link

Note that in my use case (#10575 (comment)) I don't use a service account, ExternalSercret pulls azure keyvault secrets and places them directly into a secret:


apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: secretstore-keyvault-secret-regcred
  namespace: rental
spec:
  refreshInterval: 1h
  secretStoreRef:
    kind: SecretStore
    name: secretstore-keyvault
  target:
    name: regcred
    template:
      type: kubernetes.io/dockerconfigjson
    creationPolicy: Owner
  data:
  - secretKey: .dockerconfigjson
    remoteRef:
      key: DockerRegCred

From what I can see, the secret just doesn't exist at all when the revision gets pulled and gets updated a while after that (I have a fair amount of secrets, can take several seconds). Not sure if that's important for the case.

@tydra-wang
Copy link

see google/go-containerregistry#1472

k8schain will ignore non-existing imagepullsecrets now. Upgrade github.com/google/go-containerregistry to latest version in go.mod could fix this.

@vishal-chdhry
Copy link
Member

@dprotaso Hi! I am new to knative and I would like to contribute to this issue, is this still open?

@dprotaso
Copy link
Member

We haven't bumped the dependency so this issue is up for grabs.

@dprotaso dprotaso modified the milestones: Icebox, v1.10.0 Feb 13, 2023
@Bisht13
Copy link
Contributor

Bisht13 commented Feb 14, 2023

/assign @Bisht13

@Bisht13
Copy link
Contributor

Bisht13 commented Feb 14, 2023

I have made a PR #13701, could you please review it? @dprotaso

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/API API objects and controllers good first issue Denotes an issue ready for a new contributor, according to the "help wanted" guidelines. help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/bug Categorizes issue or PR as related to a bug. kind/enhancement lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. triage/accepted Issues which should be fixed (post-triage)
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

10 participants