Operator reconciliation fails when setting fractional milli resource requests #5435

Open
bcvanmeurs opened this issue Mar 14, 2024 · 0 comments

Describe the bug

When a CPU resource request is set to 1.5m, or any other fractional millicpu value, the operator manager fails to reconcile the deployment.

To reproduce

  1. Create a deployment with a custom model container image
  2. Set the CPU resource request to a fractional millicpu value (for example, 1.5m)
  3. Deploy

The operator manager cannot reconcile and generates millions of log lines per day.
I know we are not supposed to set fractional millicpu values, but k8s does not reject them. It rounds the value up to the nearest integer.
I suspect this is the source of the reconciliation error: the operator sees a difference between the desired state (1.5m) and the actual state (2m) and keeps trying to reconcile, without success, generating many logs.
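
To illustrate the suspected mismatch, here is a minimal Python sketch. It is not the operator's code: `to_millicpu`, `naive_differs`, and `normalized_differs` are hypothetical helpers, and the parsing only handles plain decimal and `m`-suffixed CPU quantities. It shows how a raw comparison of 1.5m against the stored 2m always reports a difference, while a comparison after rounding up does not.

```python
from decimal import Decimal, ROUND_CEILING


def to_millicpu(quantity: str) -> int:
    """Convert a CPU quantity such as '1.5m' or '0.25' to whole millicpu.

    Fractions of a millicpu are rounded up, mirroring the rounding
    described above. Hypothetical helper, simplified parsing only.
    """
    if quantity.endswith("m"):
        milli = Decimal(quantity[:-1])
    else:
        milli = Decimal(quantity) * 1000
    return int(milli.to_integral_value(rounding=ROUND_CEILING))


def naive_differs(desired: str, actual: str) -> bool:
    """Compare the raw strings: '1.5m' never equals the stored '2m'."""
    return desired != actual


def normalized_differs(desired: str, actual: str) -> bool:
    """Compare after rounding both sides to whole millicpu."""
    return to_millicpu(desired) != to_millicpu(actual)


if __name__ == "__main__":
    print(naive_differs("1.5m", "2m"))       # True: would trigger another reconcile
    print(normalized_differs("1.5m", "2m"))  # False: states are equivalent
```

If the operator compares the unrounded spec against the rounded value stored in the cluster, every reconcile pass would see a difference, which would match the endless logging.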

Expected behaviour

Although we were not supposed to set fractional millicpu values, it took a while to trace the flood of logs back to this root cause.
I expect the operator to either return an error that disallows fractional millicpu values, or silently round the value up to the nearest integer, as k8s does by default.
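
As a rough sketch of the first option (rejecting the value up front), something like the following would have surfaced the problem immediately. `validate_cpu_request` is a hypothetical name, not an existing operator function, and it assumes only plain decimal or `m`-suffixed CPU quantities.

```python
from decimal import Decimal


def validate_cpu_request(quantity: str) -> None:
    """Raise ValueError if the CPU quantity is not a whole number of millicpu.

    Hypothetical validation helper; simplified parsing only.
    """
    if quantity.endswith("m"):
        milli = Decimal(quantity[:-1])
    else:
        milli = Decimal(quantity) * 1000
    if milli != milli.to_integral_value():
        raise ValueError(
            f"CPU request {quantity!r} is a fractional millicpu value; "
            "use a whole number of millicpu"
        )


if __name__ == "__main__":
    validate_cpu_request("2m")   # accepted
    validate_cpu_request("0.5")  # accepted, equals 500m
    try:
        validate_cpu_request("1.5m")
    except ValueError as err:
        print(err)
```

An explicit error like this points straight at the offending field instead of leaving the cause to be inferred from reconcile logs.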

Environment

AWS

@bcvanmeurs added the bug label Mar 14, 2024
@bcvanmeurs changed the title from "Operator reconciliation fails when setting resource requests" to "Operator reconciliation fails when setting fractional milli resource requests" Mar 14, 2024