ECONNRESET error from kubernetes watch after some minutes. #1496

jimjaeger · 2024-01-02T10:31:03Z

Describe the bug
If I use the Kubernetes watch to listen for resource changes, I get ECONNRESET after a few minutes and the watch stops.
Is there any chance that the watch could handle underlying connection errors and restart on its own?

Client Version
0.20.0

To Reproduce
Steps to reproduce the behavior:

Start a watch and wait longer than the setTimeout or setKeepAlive setting in the Watch config.

Expected behavior
A watch runs without connection issues.

Example Code

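import { KubeConfig, V1Pod, Watch } from '@kubernetes/client-node';

// `Context` is defined elsewhere in the application (not shown).
function waitForPodCompletion(log: Context['log'], k8sConfig: KubeConfig, podNamespace: string, resourceVersion?: string, jobName?: string): Promise<V1Pod> {
  let lastResourceVersion = resourceVersion;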
  return new Promise<V1Pod>((resolve, reject) => {
    const watch = new Watch(k8sConfig);
    const queryParams: { labelSelector: string, resourceVersion?: string } = { labelSelector: `job-name=${jobName}` };
    if (resourceVersion) {
      queryParams.resourceVersion = resourceVersion;
    }

    watch.watch(`/api/v1/namespaces/${podNamespace}/pods`, queryParams, (eventType, pod: V1Pod) => {
      lastResourceVersion = pod.metadata?.resourceVersion;
      // log.info("WATCH RESULT" + JSON.stringify(pod));
      if (eventType === 'ADDED' && pod.metadata?.name) {
        log.info(`Job pod ${pod.metadata.name} ${pod.metadata?.resourceVersion} added.`);
      }
      if (eventType === 'MODIFIED' && pod.metadata?.name) {
        log.info(`Job pod ${pod.metadata.name} status: ${pod.status?.phase}, resourceVersion: ${pod.metadata?.resourceVersion}.`);
        if (pod.status?.phase === 'Succeeded') {
          //log.info("WATCH RESULT" + JSON.stringify(pod));
          resolve(pod);
        } else if (pod.status?.phase === 'Failed') {
          reject(new Error(`Job failed. Pod ${pod.metadata.name} status: ${pod.status.phase} startTime: ${pod.status.startTime}.`));
        }
      }
    }, (error: { code: string, message: string, stack: string }) => {
      // Strange: this callback is also called with null shortly after the ECONNRESET.
      if (error){
        reject(error);
      }
    })
  }).catch((onrejected) => {
    if (onrejected && onrejected.code === 'ECONNRESET') {
      log.info(`Restart Watch with ${lastResourceVersion}.`);
      return waitForPodCompletion(log, k8sConfig, podNamespace, lastResourceVersion, jobName);
    } else {
      throw onrejected;
    }
  });
}

Environment (please complete the following information):

OS: Windows
NodeJS Version: v20.10.0
Cloud runtime: Red Hat OpenShift


brendandburns · 2024-01-02T17:15:14Z

A watch is tied to a single TCP stream, so when that stream breaks you need to start a new watch (and you also need to re-list, in case you missed something while disconnected).

The informer class encapsulates this logic and is probably what you are looking for:
https://github.com/kubernetes-client/javascript/blob/master/src/informer.ts

(fwiw, wrt the "informer" name, I think it's confusing, but it got established as the standard name within the go client library, so we use it here too for consistency.)
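
For readers who want to see the "new watch plus re-list" idea spelled out, here is a minimal sketch of the loop. It is not the library's informer implementation. Every name in it (`RelistWatchOptions`, `watchWithRelist`, `list`, `watchOnce`, `isDone`) is hypothetical, and the real list and watch calls are assumed to be supplied by the caller.

```typescript
// Hypothetical shapes for this sketch; none of these names come from
// @kubernetes/client-node.
interface RelistWatchOptions {
  // Lists the resources and returns the collection's resourceVersion.
  list: () => Promise<string>;
  // Runs one watch from that version; settles when the stream ends or breaks.
  watchOnce: (resourceVersion: string) => Promise<void>;
  // Returns true once the caller has seen what it was waiting for.
  isDone: () => boolean;
  maxRestarts: number;
  delayMs: number;
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => { setTimeout(resolve, ms); });

// Resolves with the number of restarts that were needed.
async function watchWithRelist(opts: RelistWatchOptions): Promise<number> {
  let restarts = 0;
  for (;;) {
    // Re-list on every attempt so changes missed while disconnected are seen.
    const resourceVersion = await opts.list();
    let lastError: unknown = undefined;
    try {
      await opts.watchOnce(resourceVersion);
    } catch (err) {
      lastError = err; // for example an error whose code is 'ECONNRESET'
    }
    if (opts.isDone()) {
      return restarts;
    }
    if (restarts >= opts.maxRestarts) {
      throw lastError ?? new Error('watch ended before completion');
    }
    restarts += 1;
    await sleep(opts.delayMs);
  }
}
```

In the original example, `list` would presumably wrap a pod list call and `watchOnce` a single `watch.watch(...)` call that settles from its done callback. The restart cap and fixed delay are arbitrary choices for the sketch.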

jimjaeger · 2024-01-02T17:18:20Z

Thanks for the information. But the informer class has the same problem: it also throws the underlying connection errors.

jobcespedes · 2024-03-06T04:08:00Z

Same issue here with the informer. I tried the workaround of periodically starting the informer, as suggested in #596, but then hit a new issue (see #1598).
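
For context, one way to restart an informer when it reports an error can be sketched as follows. This is only an illustration under stated assumptions: `RestartableInformer`, `Scheduler`, and `restartOnError` are hypothetical names, the interface is modeled on an informer that exposes `on('error', ...)` and `start()`, and nothing here claims to avoid the problem referenced in #1598.

```typescript
// Hypothetical minimal interface, modeled on an informer that exposes
// on('error', ...) and start(). It is not a type exported by the library.
interface RestartableInformer {
  on(event: 'error', handler: (err: unknown) => void): void;
  start(): Promise<void>;
}

// Injectable scheduler so the delay can be replaced, for example in tests.
type Scheduler = (fn: () => void, delayMs: number) => void;

const timerScheduler: Scheduler = (fn, delayMs) => {
  setTimeout(fn, delayMs);
};

// Registers an error handler that starts the informer again after a delay.
function restartOnError(
  informer: RestartableInformer,
  delayMs: number = 5000,
  schedule: Scheduler = timerScheduler,
): void {
  const scheduleRestart = (): void => {
    schedule(() => {
      // If start() itself rejects, schedule another attempt.
      informer.start().catch(scheduleRestart);
    }, delayMs);
  };
  // Any error reported by the informer (such as ECONNRESET) triggers a restart.
  informer.on('error', scheduleRestart);
}
```

A caller would register this once, before the first `start()`. The 5 second default delay is an arbitrary value, and a production version would likely want a retry limit or backoff.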
