
Logs with exception connection to api server #379

Open
cjabrantes opened this issue Feb 5, 2024 · 5 comments

Comments

@cjabrantes

cjabrantes commented Feb 5, 2024

Hi all,

I would like your help to confirm the following problem.
When using your plugin to enrich logs with k8s info, I am constantly getting the following log:

: #0 [filter_kube_metadata] Exception encountered parsing pod watch event. The connection might have been closed. Sleeping for 1 seconds and resetting the pod watcher.failed to connect: Connection refused - connect(2) for nil port 443

With a tcpdump I could see:

15:42:24.472631 IP6 ::1.39662 > ::1.6443: Flags [S], seq 3311054043, win 43690, options [mss 65476,sackOK,TS val 6687833 ecr 0,nop,wscale 7], length 0
15:42:24.472647 IP6 ::1.6443 > ::1.39662: Flags [R.], seq 0, ack 3311054044, win 0, length 0
15:42:24.472703 IP 127.0.0.1.49298 > 127.0.0.1.6443: Flags [S], seq 3130128353, win 43690, options [mss 65495,sackOK,TS val 6687833 ecr 0,nop,wscale 7], length 0
15:42:24.472717 IP 127.0.0.1.6443 > 127.0.0.1.49298: Flags [R.], seq 0, ack 3130128354, win 0, length 0

kubernetes_url is set to https://k8s-master.mycluster.pt:6443.

I even added log.debug "url - #{@kubernetes_url}" to create_client:

    # create_client from the plugin, with an extra debug line to print the configured URL
    def create_client()
      log.debug 'Creating K8S client'
      log.debug "url -  #{@kubernetes_url}"
      @client = nil
      @client = Kubeclient::Client.new(
        @kubernetes_url,
        @apiVersion,
        ssl_options: @ssl_options,
        auth_options: @auth_options,
        timeouts: {
          open: @open_timeout,
          read: @read_timeout
        },
        as: :parsed_symbolized
      )
    end

And I can confirm:

2024-02-05 15:40:11 +0000 [debug]: #0 [filter_kube_metadata] Creating K8S client
2024-02-05 15:40:11 +0000 [debug]: #0 [filter_kube_metadata] url -  https://k8s-master.mycluster.pt:6443
2024-02-05 15:40:11 +0000 [info]: fluent/log.rb:362:info: adding match pattern="**" type="redis_store"
2024-02-05 15:40:11 +0000 [trace]: #0 fluent/log.rb:319:trace: registered output plugin 'redis_store'
2024-02-05 15:40:11 +0000 [info]: fluent/log.rb:362:info: adding source type="systemd"
2024-02-05 15:40:11 +0000 [info]: #0 [filter_kube_metadata] Exception encountered parsing namespace watch event. The connection might have been closed. Sleeping for 1 seconds and resetting the namespace watcher.failed to connect: Connection refused - connect(2) for nil port 6443
2024-02-05 15:40:11 +0000 [info]: #0 [filter_kube_metadata] Exception encountered parsing pod watch event. The connection might have been closed. Sleeping for 1 seconds and resetting the pod watcher.failed to connect: Connection refused - connect(2) for nil port 6443

But I can also see it working:
2024-02-05 15:40:11 +0000 [trace]: #0 [filter_kube_metadata] raw metadata for central/envoy....

It seems that besides the connection to kubernetes_url, it also tries to connect to localhost over both IPv4 and IPv6.
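
In case it helps, this is the kind of minimal standalone check I would try with kubeclient directly (a sketch only; the serviceaccount paths are the standard in-cluster ones, and this is not the plugin's actual code) to see whether the watch itself is what touches localhost:

    require 'kubeclient'

    # Sketch only: standard in-cluster serviceaccount paths are assumed here.
    sa = '/var/run/secrets/kubernetes.io/serviceaccount'
    client = Kubeclient::Client.new(
      'https://k8s-master.mycluster.pt:6443/api', 'v1',
      auth_options: { bearer_token_file: File.join(sa, 'token') },
      ssl_options:  { ca_file: File.join(sa, 'ca.crt') }
    )

    # Watch pods roughly the way the plugin's pod watcher does; a "Connection refused"
    # here would point at kubeclient/the environment rather than the filter itself.
    client.watch_pods.each do |event|
      puts "#{event.type} #{event.object.metadata.name}"
    end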

Do you have any idea about this issue?

fluentd 1.16.3
fluent-plugin-kubernetes_metadata_filter (3.4.0)
kubeclient (4.11.0)

Thanks,
Carlos

@jcantrill
Contributor

Why do you need to provide a kubernetes_url in lieu of the plugin using the "well-known" service endpoint?

> kubernetes_url have https://k8s-master.mycluster.pt:6443/.

This says port 6443 but the error message says port 443:

> : #0 [filter_kube_metadata] Exception encountered parsing pod watch event. The connection might have been closed. Sleeping for 1 seconds and resetting the pod watcher.failed to connect: Connection refused - connect(2) for nil port 443

Looks to me as if there is a discrepancy here that needs to be resolved.

@cjabrantes
Author

cjabrantes commented Feb 6, 2024

I just set FLUENT_FILTER_KUBERNETES_URL for testing purposes.

Sorry, the logs came from two different moments/configs. When I set KUBERNETES_URL to some fqdn:6443 I see:
[filter_kube_metadata] Exception encountered parsing pod watch event. The connection might have been closed. Sleeping for 1 seconds and resetting the pod watcher.failed to connect: Connection refused - connect(2) for nil port 6443

When I don't set it, it is taken from KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT (443):
#0 [filter_kube_metadata] Exception encountered parsing pod watch event. The connection might have been closed. Sleeping for 1 seconds and resetting the pod watcher.failed to connect: Connection refused - connect(2) for nil port 443
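
In other words, when nothing is set the URL gets built from the downward-API environment variables, which is why the error switches to port 443; roughly like this (a sketch of that convention, not the plugin's exact code):

    # Assumed derivation when no kubernetes_url is configured (convention, not plugin code):
    host = ENV['KUBERNETES_SERVICE_HOST']
    port = ENV['KUBERNETES_SERVICE_PORT']   # 443 in this cluster
    url  = "https://#{host}:#{port}/api"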

It seems (I can see it in tcpdump) that three connections are attempted (assuming port 443): ::1:443, 127.0.0.1:443 and the real URL (given by KUBERNETES_SERVICE_HOST or FLUENT_FILTER_KUBERNETES_URL); only the one to kubernetes_url succeeds.

I can see information like pod labels in Kibana, and at trace level I can see that fluentd fetches data from the api-server.

Could it be some problem/config with kubeclient itself? Since kubernetes_url is not localhost, I can't see how the kubernetes_metadata_filter plugin could request a connection to ::1 or 127.0.0.1.

@jcantrill
Contributor

> I just set FLUENT_FILTER_KUBERNETES_URL for testing purposes.

I don't recognize this environment variable as anything that has ever been honored by this plugin

> Sorry the logs came from 2 different moments/configs, so when i set KUBERNETES_URL to some fqdn:6443 i can see: [filter_kube_metadata] Exception encountered parsing pod watch event. The connection might have been closed. Sleeping for 1 seconds and resetting the pod watcher.failed to connect: Connection refused - connect(2) for nil port 6443

> when i dont set and so it takes it from KUBERNETES_SERVICE_PORT (443) and KUBERNETES_SERVICE_HOST #0 [filter_kube_metadata] Exception encountered parsing pod watch event. The connection might have been closed. Sleeping for 1 seconds and resetting the pod watcher.failed to connect: Connection refused - connect(2) for nil port 443

Are you running this test from inside the cluster? If not, then you likely need to additionally provide certificates to be able to talk to the API server. The plugin relies upon kubeclient, which discovers the URL and certs based upon the well-known locations of these artifacts. The scheduler will mount the CA and a token for the pod's serviceaccount into the pod.
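
For reference, the well-known artifacts I mean are the standard serviceaccount mounts; something along these lines (this is the Kubernetes convention, not this plugin's exact code):

    # Mounted into every pod that runs with a serviceaccount (standard paths):
    sa_dir     = '/var/run/secrets/kubernetes.io/serviceaccount'
    ca_file    = File.join(sa_dir, 'ca.crt')   # cluster CA for verifying the API server
    token_file = File.join(sa_dir, 'token')    # bearer token for the pod's serviceaccount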

@cjabrantes
Author

> I just set FLUENT_FILTER_KUBERNETES_URL for testing purposes.

> I don't recognize this environment variable as anything that has ever been honored by this plugin

From the conf file:
kubernetes_url "#{ENV['FLUENT_FILTER_KUBERNETES_URL'] || 'https://' + ENV.fetch('KUBERNETES_SERVICE_HOST') + ':' + ENV.fetch('KUBERNETES_SERVICE_PORT') + '/api'}"

So I guess the important thing is that that env var ends up in kubernetes_url.

> Sorry the logs came from 2 different moments/configs, so when i set KUBERNETES_URL to some fqdn:6443 i can see: [filter_kube_metadata] Exception encountered parsing pod watch event. The connection might have been closed. Sleeping for 1 seconds and resetting the pod watcher.failed to connect: Connection refused - connect(2) for nil port 6443
> when i dont set and so it takes it from KUBERNETES_SERVICE_PORT (443) and KUBERNETES_SERVICE_HOST #0 [filter_kube_metadata] Exception encountered parsing pod watch event. The connection might have been closed. Sleeping for 1 seconds and resetting the pod watcher.failed to connect: Connection refused - connect(2) for nil port 443

> Are you running this test from inside the cluster? If not then you likely need to additionally provide certificates to be able to talk to the API server. The plugin relies upon the kubeclient which discovers the URL and certs based upon the well known locations of these artifacts. The scheduler will mount the CA and a token for the pod serviceaccount into the pod.

Yes, it's running inside the cluster and, as I mentioned, the plugin is able to fetch data from the api-server. I can confirm this because I see the Kibana logs with enriched data like labels, and because at trace level I can see the answer from the api-server.

But it is also throwing these exceptions:
Exception encountered parsing pod watch event. The connection might have been closed. Sleeping for 1 seconds and resetting the pod watcher.failed to connect: Connection refused

From tcpdump, the plugin tries to connect to 127.0.0.1 on 443 or 6443, to ::1 on 443 or 6443, and to kubernetes_url; only the last one succeeds. (I am not aware of anything else on the node making calls to localhost on port 443 or 6443.)
The 443 or 6443 seems to depend on the port in kubernetes_url.
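
One more thing I plan to check, since the exception mentions a nil host ("connect(2) for nil"): what the resolver returns for the real host versus a nil host, because a nil host normally resolves to the loopback addresses, which would line up with the ::1/127.0.0.1 attempts in tcpdump. A quick, hypothetical check (not anything the plugin itself does):

    require 'socket'

    # Hypothetical sanity check: compare resolution of the real API server host
    # with resolution of a nil host (the exception text says "connect(2) for nil").
    Addrinfo.getaddrinfo('k8s-master.mycluster.pt', 6443, nil, :STREAM).each { |ai| puts ai.ip_address }
    Addrinfo.getaddrinfo(nil, 6443, nil, :STREAM).each { |ai| puts ai.ip_address }  # typically ::1 and 127.0.0.1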

Any ideas?

@jcantrill
Contributor

> Yes, its running inside the cluster and as i mention the plugin is able to fetch the data from the api-server, i can confirm because i see the kibana logs with enriched data like labels and because if in trace level i can see the answer from api-server.
> But is also having this exceptions:
> Exception encountered parsing pod watch event. The connection might have been closed. Sleeping for 1 seconds and resetting the pod watcher.failed to connect: Connection refused
> From tcpdump the plugging tries to connect to 127.0.0.1 to 443 or 6443, to ::1 to 443 or 6443 and to kubernetes_url, the last one with success. (i m not aware of anything else in the node making call to localhost to port 443 or 6443)
> The 443 or 6443 seems to be dependent to the port in kubernetes_url.

I have no comments. This particular bit of code has not changed in a long time, AFAIK. We have not updated the kubeclient in a while either, so maybe there is a version mismatch or something there that is starting to introduce issues. This plugin has been used as part of OpenShift Logging without reported issues, but we always set the URL (probably unnecessarily) to the well-known service name for the api server.
