
The kubeconfig environment variable is sometimes not correctly referenced #2436

Open
3 tasks done
gitfxx opened this issue Apr 29, 2024 · 7 comments

Comments


gitfxx commented Apr 29, 2024

Contributing guidelines

I've found a bug and checked that ...

  • ... the documentation does not mention anything about my problem
  • ... there are no open or closed issues that are related to my problem

Description

I need to create architecture-specific builders in two separate clusters, and this is how I do it:

KUBECONFIG=/k8s-config-x86 buildx create --name builder-965284f6-b605-4a5c-87a6-d962909a43bc --node amd64-1783674487699423233-9652 '--platform=linux/amd64' --driver kubernetes --driver-opt 'namespace=buildx-builder' --use
KUBECONFIG=/k8s-config-arm buildx create --append --name builder-965284f6-b605-4a5c-87a6-d962909a43bc --node arm64-1783674487699423233-9652 '--platform=linux/arm64' --driver kubernetes --driver-opt 'namespace=buildx-builder' --use

This approach usually works and meets expectations, but sometimes an ARM pod gets launched in an x86 cluster, suggesting that the specified KUBECONFIG might not have been used. I'm not sure whether this is a bug or an issue with how I'm using it.

Expected behaviour

BuildKit pods run on the clusters matching the architecture specified for each node.

Actual behaviour

Sometimes an ARM BuildKit pod ends up running in the x86 cluster.

Buildx version

github.com/docker/buildx v0.7.0 f002608

Docker info

no

Builders list

none

Configuration

none

Build logs

No response

Additional info

No response

@tonistiigi (Member)

Commands look correct to me. cc @AkihiroSuda


gitfxx commented May 10, 2024

[screenshots: x86 cluster]

gitfxx commented May 10, 2024

@tonistiigi I suspect that because the pod is not yet ready, running buildx build boots it, and there is a chance that the pod launched at that point ends up on the wrong cluster. Could there be an issue with how the kubeconfig file is referenced in this context?
Before creating the builders, I use kubectl to create the corresponding BuildKit deployments, as follows:

KUBECONFIG=/k8s-config-x86 kubectl apply -f amd64-1.yaml -n buildx-builder --wait --timeout=600s

KUBECONFIG=/k8s-config-x86 buildx create --name builder-e72e8a7a-9e24-4b50-873f-fc305b7e62cb --node amd64-1788042013837361154-e72e --platform=linux/amd64 --driver kubernetes --driver-opt namespace=buildx-builder --use

KUBECONFIG=/k8s-config-arm kubectl apply -f arm64-1.yaml -n buildx-builder --wait --timeout=600s

KUBECONFIG=/k8s-config-arm buildx create --append --name builder-e72e8a7a-9e24-4b50-873f-fc305b7e62cb --node arm64-1788042013837361154-e72e --platform=linux/arm64 --driver kubernetes --driver-opt namespace=buildx-builder --use

buildx build --builder=builder-e72e8a7a-9e24-4b50-873f-fc305b7e62cb \
--platform=linux/amd64,linux/arm64 \
-t test.image.cn/test:1.0 \
-f ./Dockerfile .


gitfxx commented May 16, 2024

"I have reproduced the issue and attempted to resolve it using this approach. Currently, I have verified that the problem has been fixed."
@tonistiigi @crazy-max @AkihiroSuda

func ConfigFromEndpoint(endpointName string, s store.Reader) (clientcmd.ClientConfig, error) {
	if strings.HasPrefix(endpointName, "kubernetes://") {
		u, _ := url.Parse(endpointName)
		kubeconfig := u.Query().Get("kubeconfig")
		if kubeconfig != "" {
			_ = os.Setenv(clientcmd.RecommendedConfigPathEnvVar, kubeconfig)
		}
		rules := clientcmd.NewDefaultClientConfigLoadingRules()
		/*
			If the KUBECONFIG environment variable no longer matches the kubeconfig
			obtained from the endpoint, multiple kubeconfig files are in play and the
			variable has likely been overwritten. In that case, retrieve the
			configuration directly from the file referenced by this endpoint.
		*/
		if kubeconfig != "" && os.Getenv(clientcmd.RecommendedConfigPathEnvVar) != kubeconfig {
			logrus.Debug("using kube config from file")
			rules.ExplicitPath = kubeconfig
		}
		apiConfig, err := rules.Load()
		if err != nil {
			return nil, err
		}
		return clientcmd.NewDefaultClientConfig(*apiConfig, &clientcmd.ConfigOverrides{}), nil
	}
	return ConfigFromContext(endpointName, s)
}
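
For comparison, here is a minimal sketch (my assumption, not the buildx implementation) that loads the client config for a node directly from the kubeconfig referenced by its endpoint, without mutating the process-wide KUBECONFIG variable at all; clientcmd gives ExplicitPath precedence over the environment variable:

package kubecontext // hypothetical package name

import (
	"k8s.io/client-go/tools/clientcmd"
)

// configFromKubeconfigPath is a hypothetical helper: it builds a ClientConfig
// from the given kubeconfig path, falling back to the default loading rules
// (including KUBECONFIG) only when no path is provided.
func configFromKubeconfigPath(kubeconfig string) (clientcmd.ClientConfig, error) {
	rules := clientcmd.NewDefaultClientConfigLoadingRules()
	if kubeconfig != "" {
		// ExplicitPath takes precedence over the KUBECONFIG environment
		// variable, so a later Setenv for another node cannot redirect
		// this node to a different cluster.
		rules.ExplicitPath = kubeconfig
	}
	apiConfig, err := rules.Load()
	if err != nil {
		return nil, err
	}
	return clientcmd.NewDefaultClientConfig(*apiConfig, &clientcmd.ConfigOverrides{}), nil
}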

@crazy-max (Member)

github.com/docker/buildx v0.7.0 f002608

Quite an old release, do you repro on latest stable v0.14.0 as well?

Can you also show the output of docker buildx inspect <name>? It should display the kubeconfig used in the Endpoint field for each node, like: https://github.com/docker/buildx/actions/runs/9079420760/job/24948667867#step:9:29

Name:          buildx-test-4c972a3f9d369614b40f28a281790c7e
Driver:        kubernetes
Last Activity: 2024-05-14 12:36:40 +0000 UTC

Nodes:
Name:                  buildx-test-4c972a3f9d369614b40f28a281790c7e0
Endpoint:              kubernetes:///buildx-test-4c972a3f9d369614b40f28a281790c7e?deployment=buildkit-4c2ed3ed-970f-4f3d-a6df-a4fcbab4d5cf-d9d73&kubeconfig=%2Ftmp%2Finstall-k3s-action%2Fkubeconfig.yaml
Driver Options:        image="moby/buildkit:buildx-stable-1" qemu.install="true"
Status:                running
BuildKit daemon flags: --allow-insecure-entitlement=network.host
BuildKit version:      v0.13.2
Platforms:             linux/amd64*


gitfxx commented May 16, 2024


"Yes, the issue persists in version 0.14 as well."


gitfxx commented May 16, 2024

"The reason for this issue is that when using multiple kubeconfig files, the environment variable settings may be overwritten by subsequent settings, resulting in an incorrect ClientConfig being returned."
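
To illustrate the overwrite described above, here is a minimal standalone sketch (not buildx code; loadConfigPath and the file paths are stand-ins) of why a single process-wide environment variable cannot carry per-node kubeconfig paths:

package main

import (
	"fmt"
	"os"
)

// loadConfigPath stands in for clientcmd.NewDefaultClientConfigLoadingRules().Load(),
// which consults the KUBECONFIG environment variable when no explicit path is set.
func loadConfigPath() string {
	return os.Getenv("KUBECONFIG")
}

func main() {
	// The endpoint for node 1 carries /k8s-config-x86, the one for node 2 /k8s-config-arm.
	os.Setenv("KUBECONFIG", "/k8s-config-x86")
	os.Setenv("KUBECONFIG", "/k8s-config-arm") // overwrites the previous value

	// Both nodes load their config after the last Setenv, so node 1 now
	// resolves the ARM cluster's kubeconfig.
	fmt.Println("node 1 config:", loadConfigPath()) // /k8s-config-arm (wrong)
	fmt.Println("node 2 config:", loadConfigPath()) // /k8s-config-arm
}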
