Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ConfigMap in use is being deleted #39

Open
bittner opened this issue Sep 22, 2020 · 9 comments
Open

ConfigMap in use is being deleted #39

bittner opened this issue Sep 22, 2020 · 9 comments
Labels
bug Something isn't working

Comments

@bittner
Copy link
Contributor

bittner commented Sep 22, 2020

ConfigMaps that are still in use should not be deleted (as confirmed by the README).

Unfortunately, the current implementation seems to identify configmaps that are still in use. This results in ConfigMap manifests being deleted for deployments that are running at that very moment, and failing volume mounts for configmaps when pods are restarted.

This should be fixed by constructing a test case that manages to reproduce the behavior.

Related source locations

Example

A somewhat complete example of a Kustomize-based deployment, which is followed by a Seiso run:

...
$ pushd deployment/base
/builds/customer/waf-foo-bar/deployment/base /builds/customer/waf-foo-bar
$ kustomize edit add configmap rules-before-crs --from-file='rules/before-crs/*.conf'
$ kustomize edit add configmap rules-after-crs --from-file='rules/after-crs/*.conf'
$ kustomize edit set namesuffix -- "-${APP_NAME}"
$ kustomize edit add label "environment:${ENVIRONMENT}"
$ popd
/builds/customer/waf-foo-bar
$ oc project "${PROJECT_NAME}-${TARGET}"
Now using project "waf-foo-bar" on server "https://console.cluster.customeron.ch".
$ kustomize build deployment/overlays/${TARGET} | kubeval --strict
PASS - stdin contains a valid ConfigMap (modsecurity-config-foo-bar-hkc68d6ck7)
PASS - stdin contains a valid ConfigMap (netdata-config-foo-bar-ghtm9bddh6)
PASS - stdin contains a valid ConfigMap (rules-after-crs-foo-bar-7f7h4gc58t)
PASS - stdin contains a valid ConfigMap (rules-before-crs-foo-bar-7tc6fdbf65)
PASS - stdin contains a valid Service (waf-foo-bar)
PASS - stdin contains a valid Deployment (waf-foo-bar)
$ kustomize build deployment/overlays/${TARGET} | oc apply -f -
configmap/modsecurity-config-foo-bar-hkc68d6ck7 unchanged
configmap/netdata-config-foo-bar-ghtm9bddh6 unchanged
configmap/rules-after-crs-foo-bar-7f7h4gc58t unchanged
configmap/rules-before-crs-foo-bar-7tc6fdbf65 unchanged
service/waf-foo-bar unchanged
deployment.apps/waf-foo-bar configured
$ seiso configmap --delete --older-than 3w -l app=waf-${APP_NAME} -l component=waf -l environment=${ENVIRONMENT}
level=info msg="Deleting ConfigMap kof-jminvest-development/rules-before-crs-foo-bar-7tc6fdbf65"
level=info msg="Deleting ConfigMap kof-jminvest-development/rules-after-crs-foo-bar-h9f582fccb"
Job succeeded

Note the ConfigMap with the "7tc6fdbf65" hash that is being deleted, even though it's obviously included in the deployment.

@bittner bittner added the bug Something isn't working label Sep 22, 2020
@ccremer
Copy link
Contributor

ccremer commented Sep 22, 2020

AFAIK the tests should cover this...
We should also check the possibility for race conditions. Maybe a simple sleep 5 fixes it, or cleanup before applying (could also be dangerous)

@ccremer
Copy link
Contributor

ccremer commented Sep 23, 2020

I noticed that the current oc image is using seiso 0.6.0, which is very outdated. There have been lots of changes since, but no new release tag (maybe it got forgotten). I have now pushed 0.7.0 and I'm going to update the oc image as well.

After that, please try again with Seiso 0.7.0

@ccremer
Copy link
Contributor

ccremer commented Sep 24, 2020

There's now a new oc image available with Seiso 0.7.0. Please try again and see if the issue still persists.

@ccremer
Copy link
Contributor

ccremer commented Oct 23, 2020

Using the latest version of Seiso in the pipeline has resolved the issue for the customer.

@ccremer ccremer closed this as completed Oct 23, 2020
@bittner
Copy link
Contributor Author

bittner commented Jan 29, 2021

This issue is still present in Seiso 0.7.1. We are using the latest appuio/oc image on GitLab. Since seiso is slow computing objects in use we've started to run it after oc apply to avoid delaying the deployment process.

As a result, a deployment on APPUiO half an hour ago deleted the ConfigMap that is being referred to by the active Deployment object:

...
$ seiso configmaps -l app=lawbrary3 --delete
level=info msg="Seiso 0.7.1, commit fddf46ce9081c232cb98084f22abe254fd889165, date 2020-11-17T09:34:43Z"
level=info msg="Deleted ConfigMap lawb-production/grafana-datasources-86t8876df9"
...

Please, reopen the issue!

@ccremer ccremer reopened this Feb 1, 2021
@bittner
Copy link
Contributor Author

bittner commented Feb 17, 2021

Seiso again deleted a ConfigMap in a pipeline run today, even though there was only one ConfigMap left and keep is --keep 3 by default:

...
$ seiso configmaps -l app=lawbrary3 --delete
level=info msg="Seiso 0.7.1, commit fddf46ce9081c232cb98084f22abe254fd889165, date 2020-11-17T09:34:43Z"
level=info msg="Deleted ConfigMap lawb-production/grafana-datasources-86t8876df9"
...

Please try to fix this problem and warn your customers about this bug. It would also help if more information were printed about (default) settings being used and why a candidate was selected for deletion.

The use of seiso configmap is dangerous in its current state.

@ccremer
Copy link
Contributor

ccremer commented Feb 18, 2021

Hi Peter

We understand the situation is annoying for you. Seiso is clearly not production ready.
Would you mind opening a support ticket at control.vshn.net if this is urgent for you? Our engineering focus is currently not with Seiso.

@ghost
Copy link

ghost commented Apr 7, 2021

Hello,

I think it was certainly linked to #49 which fixed the ResourceContains in kubernetes/utils.

Indeed, I did this PR because seiso were deleting currently running images.

@ghost
Copy link

ghost commented Apr 13, 2021

I think it was certainly linked to #49 which fixed the ResourceContains in kubernetes/utils.

To explain why I'm almost sure the current issue is fixed:

The method GetUnused from pkg/configmap/configmap.go, lines 64-89 is using ResourceContains to check if the config map name is used by any resource defined in the namespace resources.

The configmap is only checking resources of types defined in openshift.PredefinedResource. I think it should be right, because this list contains Deployment and ReplicaSet.

Before the fix of #49, ResourceContains was doing this check only for the first resource of the namespace. Now, it checks for all resources if the config id is used.

That's explain why it was difficult to reproduce the configmap deletion bug in real environment.

Furthermore, the config map unit test has mocked the ResourceContains, that's why it always passed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Development

No branches or pull requests

2 participants