Unexpected Node Pool scaling on unrelated value change #2249
@eduardOrthopy Hi, could you share more detail, ideally with the tf configs? I cannot reproduce this issue; my nodes are not scaled down.
@anton-sidelnikov Thanks for looking into this. I can get something together, but it will be the week after next. Did you wait for the autoscaler to change the node count before running the update? Regarding renaming: ok, I understand that this would be a breaking change. Still, it might be worth considering should you ever plan a major version release.
@eduardOrthopy Yes, I did a lot of different checks, but no luck. Of course we can do it together, just ping me when you're ready.
@anton-sidelnikov I was just about to start working on a minimal reproduction template, and I re-read your reply.
Do you mean that the autoscaler did not work for you? That would be a different issue. I am talking about an issue with `initial_node_count`.

I have set my node pool to 7 nodes initially with a minimum of 3 and the following autoscaler config (please adapt the redacted and region values to your test region):

```hcl
resource "opentelekomcloud_cce_addon_v3" "autoscaler" {
  template_name    = "autoscaler"
  template_version = "1.23.17"
  cluster_id       = opentelekomcloud_cce_cluster_v3.cluster.id
  values {
    basic = {
      "cceEndpoint" = "YourRegionEndpoint"
      "ecsEndpoint" = "YourRegionEndpoint"
      "region"      = "YourRegion"
      "swr_addr"    = "Redacted"
      "swr_user"    = "Redacted"
    }
    custom = {
      "cluster_id"                     = opentelekomcloud_cce_cluster_v3.cluster.id
      "tenant_id"                      = data.opentelekomcloud_identity_project_v3.current.id
      "coresTotal"                     = 16000
      "expander"                       = "priority"
      "logLevel"                       = 4
      "maxEmptyBulkDeleteFlag"         = 11
      "maxNodesTotal"                  = 100
      "memoryTotal"                    = 64000
      "scaleDownDelayAfterAdd"         = 15
      "scaleDownDelayAfterDelete"      = 15
      "scaleDownDelayAfterFailure"     = 3
      "scaleDownEnabled"               = true
      "scaleDownUnneededTime"          = 7
      "scaleDownUtilizationThreshold"  = "0.2"
      "scaleUpCpuUtilizationThreshold" = "0.60"
      "scaleUpMemUtilizationThreshold" = "0.75"
      "scaleUpUnscheduledPodEnabled"   = true
      "scaleUpUtilizationEnabled"      = true
      "unremovableNodeRecheckTimeout"  = 7
    }
  }
}
```

From a node-size perspective, I was using s3.large.4 for testing. Here is an excerpt of the node pool config I used for testing (again, please make changes as per your testing region and setup):

```hcl
resource "random_id" "id" {
  byte_length = 4
}

resource "opentelekomcloud_cce_node_pool_v3" "node_pool" {
  cluster_id               = var.cluster_id
  name                     = var.name != "" ? var.name : "node-pool-${random_id.id.hex}"
  flavor                   = var.node_flavor
  initial_node_count       = 7
  availability_zone        = var.availability_zone
  key_pair                 = var.keypair_name
  os                       = var.os
  scale_enable             = true
  min_node_count           = 3
  max_node_count           = 10
  scale_down_cooldown_time = 15
  priority                 = 1
  user_tags                = var.tags
  k8s_tags                 = var.k8s_tags
  docker_base_size         = 20

  root_volume {
    size       = 50
    volumetype = "SSD"
  }

  data_volumes {
    size       = 50
    volumetype = "SSD"
  }

  lifecycle {
    ignore_changes = [
      initial_node_count,
    ]
    create_before_destroy = true
  }

  timeouts {
    create = "60m"
    update = "60m"
    delete = "60m"
  }
}
```

Now, as stated: deploy a cluster with the node pool and the autoscaler, wait until the autoscaler has removed some nodes, then deploy some changes to the node pool. As a result, the node pool will scale back up to the `initial_node_count`. So in our example above: a pool that the autoscaler had shrunk towards 3 nodes jumps back to 7.
As I said, scaling up is not terrible, as it will only cost money. But if there are currently more nodes than `initial_node_count`, the pool is scaled down, removing nodes that may still be running workloads.
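To make "unrelated value change" concrete, here is a minimal sketch of the kind of in-place update that triggers the rescale, assuming `user_tags` is a plain map as the config above suggests (the tag key and value are made up for illustration):

```hcl
# Sketch: a trivial, seemingly unrelated in-place change. Per the report,
# applying this rescales the pool to initial_node_count, and the scaling
# does not appear in the plan output.
resource "opentelekomcloud_cce_node_pool_v3" "node_pool" {
  # ... all other arguments unchanged from the config above ...
  user_tags = merge(var.tags, { "last-touched" = "manual-test" })
}
```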
@eduardOrthopy Hello, yes, I found this issue, thanks for the details. It is also reproducible in the UI; internal bug report created: https://jira.tsi-dev.otc-service.com/browse/BM-2993
Terraform provider version
provider registry.terraform.io/opentelekomcloud/opentelekomcloud v1.35.4
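For completeness, a provider pin matching the reported version (a sketch using standard Terraform syntax; this block is not part of the original report):

```hcl
terraform {
  required_providers {
    opentelekomcloud = {
      source  = "opentelekomcloud/opentelekomcloud"
      version = "1.35.4" # version from the report above
    }
  }
}
```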
Affected Resource(s)
opentelekomcloud_cce_node_pool_v3
Terraform Configuration Files
upon request
Debug Output/Panic Output
None
Steps to Reproduce
Minimal reproduction (see the sketch below):
1. Set `initial_node_count` to something bigger than the minimal node count
2. `terraform apply`
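A minimal sketch of the attributes this reproduction depends on, with values mirroring the example config from the comments (everything else omitted):

```hcl
resource "opentelekomcloud_cce_node_pool_v3" "repro" {
  # ... cluster_id, flavor, availability_zone, etc. omitted ...
  scale_enable       = true # autoscaling on, so the node count can drift
  initial_node_count = 7    # deliberately larger than min_node_count
  min_node_count     = 3
  max_node_count     = 10
}
```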
Expected Behavior
Either: only the modifications made are applied.
Or at least: all modifications are shown in the output of `apply` or `plan`.
Actual Behavior
The node pool is scaled to `initial_node_count`. This is not shown in the output of `plan`, and it happens regardless of whether the `initial_node_count` property is part of the ignored properties or not.
Important Factoids
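For reference, the ignore rule mentioned above — which, per the report, does not prevent the rescale — is the standard lifecycle block already shown in the node pool config:

```hcl
# Sketch of the attempted workaround: even with initial_node_count ignored,
# the provider still rescales the pool to that value on unrelated updates.
lifecycle {
  ignore_changes = [
    initial_node_count,
  ]
}
```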
References
GH-1961 already references this issue. It was closed with the comment that this is expected, plus a proposed workaround: to basically "be careful" when touching those resources.
Remarks
`initial` is misleading in my mind. Maybe `desired` would be more indicative of the fact that this value is more than a 'set once' thing.
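Purely as an illustration of that naming suggestion, a hypothetical schema sketch; `desired_node_count` does not exist in the provider today:

```hcl
# Hypothetical attribute, not part of the current provider schema:
# "desired" signals a continuously reconciled target, which matches the
# observed behavior better than the set-once semantics "initial" implies.
resource "opentelekomcloud_cce_node_pool_v3" "node_pool" {
  # ...
  desired_node_count = 7
}
```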