Skip checking resources when --wait=false is specified #577

Open
firgavin opened this issue Aug 12, 2022 · 7 comments · May be fixed by #594
Labels
  • bug: This issue describes a defect or unexpected behavior
  • carvel accepted: This issue should be considered for future work and that the triage process has been completed
  • priority/important-soon: Must be staffed and worked on currently or soon.

Comments

@firgavin

What steps did you take:
I currently use kapp as a CI tool to manage lots of YAML files. I passed --wait=false when deleting the app because deleting custom resources can sometimes take a long time.

What happened:
kapp exits with a non-zero code, which makes CI fail.

$ kapp delete -a app1 --wait=false -y
Target cluster 'https://127.0.0.1:6443' (nodes: firgavin)

Changes

Namespace  Name        Kind        Age  Op      Op st.  Wait to  Rs  Ri  
default    simple-app  Deployment  22s  delete  -       -        ok  -  
^          simple-app  Service     22s  delete  -       -        ok  -  

Op:      0 create, 2 delete, 0 update, 0 noop, 0 exists
Wait to: 0 reconcile, 0 delete, 2 noop

11:18:59AM: ---- applying 2 changes [0/2 done] ----
11:18:59AM: delete deployment/simple-app (apps/v1) namespace: default
11:18:59AM: delete service/simple-app (v1) namespace: default
11:18:59AM: ---- waiting on 2 changes [0/2 done] ----
11:18:59AM: ok: noop service/simple-app (v1) namespace: default
11:18:59AM: ok: noop deployment/simple-app (apps/v1) namespace: default
11:18:59AM: ---- applying complete [2/2 done] ----
11:18:59AM: ---- waiting complete [2/2 done] ----

kapp: Error: Expected all resources to be gone, but found: endpointslice/simple-app-vp2dw (discovery.k8s.io/v1) namespace: default, pod/simple-app-64c66864f5-g9sb8 (v1) namespace: default, replicaset/simple-app-64c66864f5 (apps/v1) namespace: default

What did you expect:
kapp should skip checking that resources are gone when --wait=false is specified.

Anything else you would like to add:
I did some research and I found that kapp checks the existence of related resources after applying changes. But resources will be deleted eventually. See https://github.com/vmware-tanzu/carvel-kapp/blob/v0.52.0/pkg/kapp/cmd/app/delete.go#L159.
It would be great if kapp could default to skipping checking resources when --wait=false is specified or add a flag to control this logic. And if that makes sense, I'd like to help implement this ;)
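
To make the request concrete, here is a minimal Go sketch of the kind of post-delete check described above. It is not kapp's actual code: the function name checkAppGone, its parameters, and the skipCheck switch are all hypothetical, and the error text only mirrors the message from the log in this report.

```go
package main

import (
	"fmt"
	"strings"
)

// checkAppGone is a hypothetical stand-in for the post-delete check the
// issue links to; it is NOT kapp's real implementation. remaining lists
// the resources that are still found for the app after the delete changes
// have been applied.
func checkAppGone(remaining []string, skipCheck bool) error {
	if skipCheck || len(remaining) == 0 {
		return nil
	}
	// Error text mirrors the message shown in the log above.
	return fmt.Errorf("Expected all resources to be gone, but found: %s",
		strings.Join(remaining, ", "))
}

func main() {
	// With --wait=false, related resources can still exist at this point
	// even though they will be deleted eventually.
	remaining := []string{
		"pod/simple-app-64c66864f5-g9sb8 (v1) namespace: default",
		"replicaset/simple-app-64c66864f5 (apps/v1) namespace: default",
	}
	fmt.Println("check enabled:", checkAppGone(remaining, false))
	fmt.Println("check skipped:", checkAppGone(remaining, true))
}
```

With the check enabled the sketch returns the same kind of error that fails CI here; with it skipped, the command would exit cleanly even though resources are still terminating.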

Environment:

  • kapp version (use kapp --version): v0.52.0
  • OS (e.g. from /etc/os-release): Ubuntu 20.04.4 LTS
  • Kubernetes version (use kubectl version): v1.23.6+k3s1

Vote on this request

This is an invitation to the community to vote on issues, to help us prioritize our backlog. Use the "smiley face" up to the right of this comment to vote.

👍 "I would like to see this addressed as soon as possible"
👎 "There are other more important things to focus on right now"

We are also happy to receive and review Pull Requests if you want to help working on this issue.

@firgavin added the labels "bug" (This issue describes a defect or unexpected behavior) and "carvel triage" (This issue has not yet been reviewed for validity) on Aug 12, 2022
@firgavin changed the title from "Skip checking resources after delete app when --wait=false is specified" to "Skip checking resources when --wait=false is specified" on Aug 12, 2022
@praveenrewar
Member

Yeah, it seems like setting the wait flag to false would currently lead to an error while deleting recorded apps. So definitely it's a bug.

> It would be great if kapp could default to skipping checking resources when --wait=false is specified or add a flag to control this logic.

It does make sense to allow that behaviour; I am just trying to think of any side effects it could have. One obvious risk is that one or more resources are not deleted but the app itself (its metadata ConfigMap) is deleted.
@cppforlife Any thoughts?

> And if that makes sense, I'd like to help implement this ;)

That would be great. We will prioritize reviewing it once we finalize the approach :)

@renuy added the label "carvel accepted" (This issue should be considered for future work and that the triage process has been completed) and removed the label "carvel triage" (This issue has not yet been reviewed for validity) on Aug 16, 2022
@renuy
Contributor

renuy commented Aug 16, 2022

Hey @firgavin, good to see you here. Looking forward to your PR for this issue.

@100mik
Contributor

100mik commented Aug 16, 2022

> One obvious thing that could happen is that one or more resources are not deleted but the app itself

This would be a "known risk" I guess?

We might also lose out on some "retryable cases", where kapp would retry in case of a failed delete due to a retryable error.

@cppforlife
Contributor

> I did some research and I found that kapp checks the existence of related resources after applying changes. But resources will be deleted eventually.

I think an additional flag to disable this check would be reasonable. Maybe under dangerous?

@100mik
Contributor

100mik commented Aug 18, 2022

> I think an additional flag to disable this check would be reasonable. Maybe under dangerous?

This approach makes sense to me

@firgavin
Author

Hi @cppforlife, @100mik, @praveenrewar - Thanks for your insights! Here's my proposal:

We can add a flag --dangerous-disable-checking-app-deletion to enable or disable the check:

  • The value is set to false by default, which is compatible with the current behavior.
  • Once the flag is specified, kapp skips this check, and users might need to manually delete related resources.

Before I work on it, I'd like to discuss the interaction between the two flags. When --dangerous-disable-checking-app-deletion=false, should we make sure that the value of --wait is overwritten to True? If not, users can still hit the same issue. Of course, we can explain the usage in the docs if we think they should be "orthogonal". Any suggestions?

@praveenrewar
Member

> When --dangerous-disable-checking-app-deletion=false, should we make sure that the value of --wait is overwritten to True?

I think we should keep these two flags independent of each other, because a user should be able to use --dangerous-disable-checking-app-deletion regardless of whether --wait is enabled or disabled.
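
As an illustration of what "independent" could look like, here is a small Go sketch that parses the two flags separately, so that neither one changes the other's value. It uses Go's standard flag package purely to stay self-contained, which is an assumption and not a statement about kapp's CLI wiring. The deleteOpts struct, parseDeleteFlags, and the help strings are hypothetical, and the default of true for --wait is also an assumption.

```go
package main

import (
	"flag"
	"fmt"
)

// deleteOpts is a hypothetical options struct, not kapp's real one.
type deleteOpts struct {
	Wait                 bool
	DisableDeletionCheck bool
}

// parseDeleteFlags parses the two flags independently: setting one never
// overrides the other.
func parseDeleteFlags(args []string) (deleteOpts, error) {
	var o deleteOpts
	fs := flag.NewFlagSet("delete", flag.ContinueOnError)
	// Assumed default: waiting is on unless --wait=false is passed.
	fs.BoolVar(&o.Wait, "wait", true, "wait for changes to be applied")
	// Flag name as proposed in this thread; off by default to keep the
	// current behavior.
	fs.BoolVar(&o.DisableDeletionCheck, "dangerous-disable-checking-app-deletion",
		false, "skip verifying that all app resources are gone")
	err := fs.Parse(args)
	return o, err
}

func main() {
	cases := [][]string{
		{},
		{"--wait=false"},
		{"--wait=false", "--dangerous-disable-checking-app-deletion"},
	}
	for _, args := range cases {
		o, err := parseDeleteFlags(args)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%v -> wait=%t skipCheck=%t\n", args, o.Wait, o.DisableDeletionCheck)
	}
}
```

Under this design, --wait=false on its own would still hit the check, which is why the error message (or the docs) would need to point users at the second flag.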

> If not, users can still hit the same issue. Of course, we can explain the usage in the docs if we think they should be "orthogonal". Any suggestions?

Maybe we can add a hint in the error message?

Projects
Status: To Triage

5 participants