
Error: Get "http://localhost/api/v1/namespaces/kube-system/configmaps/aws-auth" #143

Closed
nitrocode opened this issue Jan 27, 2022 · 7 comments · Fixed by #150
Closed

Error: Get "http://localhost/api/v1/namespaces/kube-system/configmaps/aws-auth" #143

nitrocode opened this issue Jan 27, 2022 · 7 comments · Fixed by #150
Labels
bug 🐛 An issue with the system

Comments


nitrocode commented Jan 27, 2022


Describe the Bug

I noticed that if the EKS cluster is switching subnets, particularly from public + private to only private, the EKS cluster will return an endpoint of localhost (for aws_eks_cluster.default[0].endpoint).

╷
│ Error: Get "http://localhost/api/v1/namespaces/kube-system/configmaps/aws-auth": dial tcp 127.0.0.1:80: connect: connection refused
│
│   with module.eks_cluster.kubernetes_config_map.aws_auth[0],
│   on .terraform-mdev/modules/eks_cluster/auth.tf line 135, in resource "kubernetes_config_map" "aws_auth":
│  135: resource "kubernetes_config_map" "aws_auth" {
│
╵

Related code

provider "kubernetes" {
# Without a dummy API server configured, the provider will throw an error and prevent a "plan" from succeeding
# in situations where Terraform does not provide it with the cluster endpoint before triggering an API call.
# Since those situations are limited to ones where we do not care about the failure, such as fetching the
# ConfigMap before the cluster has been created or in preparation for deleting it, and the worst that will
# happen is that the aws-auth ConfigMap will be unnecessarily updated, it is just better to ignore the error
# so we can proceed with the task of creating or destroying the cluster.
#
# If this solution bothers you, you can disable it by setting var.dummy_kubeapi_server = null
host = local.enabled ? coalesce(aws_eks_cluster.default[0].endpoint, var.dummy_kubeapi_server) : var.dummy_kubeapi_server

Workaround 1

To get around this issue, I have to delete the kubernetes config map resource from the state, and then the module can be tricked into redeploying the EKS cluster (due to the change in subnets).

cd components/terraform/eks/eks

terraform state rm 'module.eks_cluster.kubernetes_config_map.aws_auth[0]'

or in atmos

atmos terraform state eks/eks --stack dev-use2-qa rm 'module.eks_cluster.kubernetes_config_map.aws_auth[0]'

If this workaround was done by mistake, you can re-import the deleted config map:

atmos terraform import eks/eks --stack dev-use2-qa 'module.eks_cluster.kubernetes_config_map.aws_auth[0]' kube-system/aws-auth

Workaround 2

I've also run into this issue when importing an existing cluster into the Terraform module. My workaround for the import is to do a terraform init and modify the downloaded module eks_cluster's auth.tf to set the host arg of the kubernetes provider to the dummy URL.

vim .terraform/modules/eks_cluster/auth.tf
--- a/auth.tf
+++ b/auth.tf
@@ -94,7 +94,7 @@ provider "kubernetes" {
   # so we can proceed with the task of creating or destroying the cluster.
   #
   # If this solution bothers you, you can disable it by setting var.dummy_kubeapi_server = null
-  host                   = local.enabled ? coalesce(aws_eks_cluster.default[0].endpoint, var.dummy_kubeapi_server) : var.dummy_kubeapi_server
+  host                   = var.dummy_kubeapi_server

Proposal

Instead, it would be nice if we could either detect that the endpoint returns localhost and substitute something that won't break the kubernetes provider, or disable the kubernetes provider completely when the endpoint is localhost.
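
As a rough sketch of the first option (the local names raw_endpoint and kube_host are hypothetical; the resources and variables are the ones from the provider snippet above), the module could route any localhost endpoint back to the dummy API server:

locals {
  # Hypothetical helper locals: fall back to the dummy API server whenever the
  # reported endpoint is empty or points at localhost.
  raw_endpoint = local.enabled ? coalesce(aws_eks_cluster.default[0].endpoint, var.dummy_kubeapi_server) : var.dummy_kubeapi_server
  kube_host    = length(regexall("localhost|127\\.0\\.0\\.1", local.raw_endpoint)) > 0 ? var.dummy_kubeapi_server : local.raw_endpoint
}

provider "kubernetes" {
  host = local.kube_host
  # ... other provider arguments unchanged ...
}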

@nitrocode nitrocode added the bug 🐛 An issue with the system label Jan 27, 2022
@korenyoni korenyoni changed the title The localhost url issue outputs.eks_cluster_endpoint returning localhost in some cases Jan 27, 2022
@nitrocode nitrocode changed the title outputs.eks_cluster_endpoint returning localhost in some cases Error: Get "http://localhost/api/v1/namespaces/kube-system/configmaps/aws-auth" Jan 27, 2022

Nuru commented Feb 22, 2022

@nitrocode Please look into why aws_eks_cluster.default[0].endpoint is returning localhost. That seems like a bug. The cluster endpoint should be a Kubernetes master node, which we should never be running on, so it should never be localhost, right?


snooyen commented Feb 24, 2022

This is causing issues when attempting to delete EKS clusters as well.

$ atmos terraform destroy eks -s tenant-uw2-dev
.
.
Executing command:
/usr/bin/terraform destroy -var-file tenant-uw2-dev-eks.terraform.tfvars.json
.
.
│ Error: Get "http://localhost/api/v1/namespaces/kube-system/configmaps/aws-auth": dial tcp 127.0.0.1:80: connect: connection refused
│
│   with module.eks_cluster.kubernetes_config_map.aws_auth[0],
│   on .terraform/modules/eks_cluster/auth.tf line 132, in resource "kubernetes_config_map" "aws_auth":
│  132: resource "kubernetes_config_map" "aws_auth" {
│
╵
Releasing state lock. This may take a few moments...
exit status 1

@michaelkoro

This happens to me as well when running module version 0.45.0, but without any subnet changes.
I suspect the config map is refreshed before the EKS cluster when running the plan/apply command,
because in our state file, aws_eks_cluster.default[0].endpoint shows the real Kubernetes endpoint and not localhost.


vsimon commented Apr 3, 2022

I'm currently on module version 0.44.0, and I get something similar when updating to module version 0.45.0 and then running a plan.

terraform plan

│ Error: Get "http://localhost/api/v1/namespaces/kube-system/configmaps/aws-auth": dial tcp [::1]:80: connect: connection refused
│ 
│   with module.eks_cluster.kubernetes_config_map.aws_auth_ignore_changes[0],
│   on .terraform/modules/eks_cluster/auth.tf line 115, in resource "kubernetes_config_map" "aws_auth_ignore_changes":
│  115: resource "kubernetes_config_map" "aws_auth_ignore_changes" {


michaelkoro commented Apr 4, 2022

I managed to more or less bypass the issue.
I removed the config map from the state and told the module not to create the config map resource.
I then created an external config map resource with my own provider configuration and imported it back into the state:

provider "kubernetes" {
  token                  = data.aws_eks_cluster_auth.eks.token
  host                   = module.eks_cluster.eks_cluster_endpoint
  cluster_ca_certificate = base64decode(module.eks_cluster.eks_cluster_certificate_authority_data)
}

resource "kubernetes_config_map" "aws_auth" {
  metadata {
    name      = "aws-auth"
    namespace = "kube-system"
  }
  data = {
    mapRoles = replace(yamlencode(distinct(var.map_additional_iam_roles)), "\"", "")
  }
  depends_on = [module.eks_cluster]
  lifecycle {
    ignore_changes = [data["mapRoles"]]
  }
}

This way I have more control over the provider version, which I suspect is causing the issues.
I remember when version 2 of the kubernetes provider first came out, this error occurred quite a lot.
At least for now, this workaround has helped.
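
The snippet above also relies on a data.aws_eks_cluster_auth.eks data source that isn't shown. A minimal sketch, assuming the module exposes the cluster name through an eks_cluster_id output:

data "aws_eks_cluster_auth" "eks" {
  # Hypothetical: assumes the cluster name is exposed by the module as an output.
  name = module.eks_cluster.eks_cluster_id
}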


Nuru commented May 17, 2022

There are various problems caused by the fact that we are calling the API of a resource that is being created or deleted at the same time. The official recommendation from Hashicorp is to break this module up into multiple modules, one to create the EKS cluster, one to create the aws-auth ConfigMap, and one to attach the worker nodes to the cluster. You can effectively do that by doing what @michaelkoro did minus the import back into this module.

Recommended workarounds

This module provides 3 different authentication mechanisms to help work around the issues. We generally recommend using kube_exec_auth_enabled when possible. When deleting the cluster, you can use kubeconfig_path and kubeconfig_path_enabled to provide a dummy configuration if needed.
See version 0.42.0 Release Notes for details.
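
For reference, a hand-written provider block using exec-based authentication looks roughly like the sketch below (an illustration of the approach, not the module's exact code):

provider "kubernetes" {
  host                   = aws_eks_cluster.default[0].endpoint
  cluster_ca_certificate = base64decode(aws_eks_cluster.default[0].certificate_authority[0].data)

  exec {
    # Fetch a short-lived token at plan/apply time instead of storing one in state.
    api_version = "client.authentication.k8s.io/v1beta1"
    command     = "aws"
    args        = ["eks", "get-token", "--cluster-name", aws_eks_cluster.default[0].name]
  }
}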

@nitrocode Providing a KUBECONFIG via kubeconfig_path is documented as being required for importing resources. This is due to a limitation of how Terraform initializes providers when doing imports.
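
A hypothetical example of that import setup, using the kubeconfig_path / kubeconfig_path_enabled variables mentioned above and placeholder names:

# First write a kubeconfig for the existing cluster, e.g.:
#   aws eks update-kubeconfig --name <cluster-name> --kubeconfig ./kubeconfig
# then point the module at it in the component's variables:
kubeconfig_path_enabled = true
kubeconfig_path         = "./kubeconfig"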

@Nuru Nuru mentioned this issue May 18, 2022

Nuru commented May 18, 2022

Duplicate of #104

@Nuru Nuru marked this as a duplicate of #104 May 18, 2022
@Nuru Nuru pinned this issue May 18, 2022
@Nuru Nuru closed this as completed in #150 May 20, 2022