Add blog post for PodHasNetwork condition #36197
Merged (1 commit) on Sep 5, 2022
File: content/en/blog/_posts/2022-09-14-pod-has-network-condition.md (123 additions, 0 deletions)
---
layout: blog
title: 'Kubernetes 1.25: PodHasNetwork condition for pods'
date: 2022-09-14
slug: pod-has-network-condition
---

**Author:**
Deep Debroy (Apple)

Kubernetes 1.25 introduces Alpha support for a new kubelet-managed pod condition
in the status field of a pod: `PodHasNetwork`. The kubelet uses the
`PodHasNetwork` condition to accurately surface the initialization state of a
pod from the perspective of pod sandbox creation and network configuration by
the container runtime (typically in coordination with CNI plugins). The kubelet
begins pulling container images and starting individual containers (including
init containers) only after the status of the `PodHasNetwork` condition is set
to `True`. Metrics collection services that report latency of

pod initialization from a cluster infrastructure perspective (that is, agnostic
of per-container characteristics such as image size or payload) can use the
`PodHasNetwork` condition to accurately generate Service Level Indicators
(SLIs). Certain operators or controllers that manage underlying pods may use
the `PodHasNetwork` condition to optimize the set of actions performed when pods
repeatedly fail to come up.
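As an illustrative sketch of the SLI idea (the pod status below is fabricated, and the function names are hypothetical, not part of any Kubernetes library), a metrics service could derive a sandbox-setup latency by comparing the transition timestamps of the `PodScheduled` and `PodHasNetwork` conditions:

```python
from datetime import datetime
from typing import Optional


def condition_transition_time(pod_status: dict, cond_type: str) -> Optional[datetime]:
    """Return the lastTransitionTime of the given condition if its status is "True"."""
    for cond in pod_status.get("conditions", []):
        if cond["type"] == cond_type and cond["status"] == "True":
            return datetime.fromisoformat(cond["lastTransitionTime"])
    return None


def sandbox_setup_latency_seconds(pod_status: dict) -> Optional[float]:
    """Seconds from PodScheduled to PodHasNetwork: roughly the time the
    infrastructure (kubelet, runtime, CNI plugins) took to set up the sandbox."""
    scheduled = condition_transition_time(pod_status, "PodScheduled")
    has_network = condition_transition_time(pod_status, "PodHasNetwork")
    if scheduled is None or has_network is None:
        return None
    return (has_network - scheduled).total_seconds()


# Fabricated example status, for illustration only:
status = {
    "conditions": [
        {"type": "PodScheduled", "status": "True",
         "lastTransitionTime": "2022-09-14T10:00:00+00:00"},
        {"type": "PodHasNetwork", "status": "True",
         "lastTransitionTime": "2022-09-14T10:00:02+00:00"},
    ]
}
print(sandbox_setup_latency_seconds(status))  # 2.0
```

Because this latency excludes image pulls and container start-up, it stays comparable across workloads regardless of what each pod actually runs.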

### How is this different from the existing Initialized condition reported for pods?

The kubelet sets the status of the existing `Initialized` condition reported in
the status field of a pod depending on the presence of init containers in a pod.

If a pod specifies init containers, the status of the `Initialized` condition in
the pod status will not be set to `True` until all init containers for the pod
have succeeded. However, init containers, configured by users, may fail
(crashing payloads, invalid images, and so on), and the number of init
containers configured in a pod varies across workloads. Therefore,
cluster-wide, infrastructural SLIs around pod initialization cannot depend on
the `Initialized` condition of pods.

If a pod does not specify init containers, the status of the `Initialized`
condition in the pod status is set to `True` very early in the lifecycle of the
pod. This occurs before the kubelet initiates any pod runtime sandbox creation
and network configuration steps. As a result, a pod without init containers will
report the status of the `Initialized` condition as `True` even if the container
runtime is not able to successfully initialize the pod sandbox environment.

Relative to either situation above, the `PodHasNetwork` condition surfaces more
accurate data around when the pod runtime sandbox was initialized with
networking configured so that the kubelet can proceed to launch user-configured
containers (including init containers) in the pod.
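To make the difference concrete, here is a small sketch (with a fabricated pod status and a hypothetical helper function) of the second situation above: a pod without init containers reports `Initialized` as `True` even though its sandbox and networking are not yet ready, which `PodHasNetwork` correctly reflects:

```python
def condition_status(pod_status: dict, cond_type: str):
    """Return the status string ("True"/"False") of a condition, or None if absent."""
    for cond in pod_status.get("conditions", []):
        if cond["type"] == cond_type:
            return cond["status"]
    return None


# Fabricated status of a pod with no init containers whose sandbox is not up yet:
status = {
    "conditions": [
        {"type": "PodScheduled", "status": "True"},
        {"type": "Initialized", "status": "True"},     # set early, before sandbox setup
        {"type": "PodHasNetwork", "status": "False"},  # sandbox/network not ready yet
    ]
}

# Initialized alone is misleading here; PodHasNetwork reflects the sandbox state.
print(condition_status(status, "Initialized"))    # True
print(condition_status(status, "PodHasNetwork"))  # False
```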

Note that a node agent may dynamically re-configure network interface(s) for a
pod by watching changes in pod annotations that specify additional networking
configuration (e.g. `k8s.v1.cni.cncf.io/networks`). Dynamic updates of pod
networking configuration after the pod sandbox is initialized by Kubelet (in
coordination with a container runtime) are not reflected by the `PodHasNetwork`
condition.
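As an illustration, a controller that cares about such secondary interfaces could detect the annotation directly rather than rely on `PodHasNetwork` (the pod objects below are fabricated and the helper is hypothetical):

```python
def requests_additional_networks(pod: dict) -> bool:
    """True if the pod requests extra interfaces via the Multus-style annotation."""
    annotations = pod.get("metadata", {}).get("annotations", {})
    return "k8s.v1.cni.cncf.io/networks" in annotations


# Fabricated pod objects, for illustration only:
plain_pod = {"metadata": {"annotations": {}}}
multi_net_pod = {
    "metadata": {"annotations": {"k8s.v1.cni.cncf.io/networks": "macvlan-conf"}}
}
print(requests_additional_networks(plain_pod))      # False
print(requests_additional_networks(multi_net_pod))  # True
```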

### Try out the PodHasNetwork condition for pods

To have the kubelet report the `PodHasNetwork` condition in the status field of
a pod, enable the `PodHasNetworkCondition` feature gate on the kubelet.
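For example, if you configure the kubelet through a configuration file, the feature gate can be enabled with a fragment like the following (a minimal sketch; the rest of the `KubeletConfiguration` is elided):

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
featureGates:
  PodHasNetworkCondition: true
```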

For a pod whose runtime sandbox has been successfully created and has networking
configured, the kubelet will report the `PodHasNetwork` condition with status set to `True`:

```
$ kubectl describe pod nginx1
Name:         nginx1
Namespace:    default
...
Conditions:
  Type              Status
  PodHasNetwork     True
  Initialized       True
  Ready             True
  ContainersReady   True
  PodScheduled      True
```

For a pod whose runtime sandbox has not been created yet (and networking not
configured either), the kubelet will report the `PodHasNetwork` condition with
status set to `False`:

```
$ kubectl describe pod nginx2
Name:         nginx2
Namespace:    default
...
Conditions:
  Type              Status
  PodHasNetwork     False
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
```
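A script or controller can check the same thing programmatically. This is a hedged sketch: `pod_has_network` and `fetch_pod` are hypothetical helpers, and the inline pod object is fabricated so the check can be demonstrated without a cluster:

```python
import json
import subprocess


def pod_has_network(pod: dict) -> bool:
    """True iff the PodHasNetwork condition is present with status "True"."""
    for cond in pod.get("status", {}).get("conditions", []):
        if cond["type"] == "PodHasNetwork":
            return cond["status"] == "True"
    return False


def fetch_pod(name: str, namespace: str = "default") -> dict:
    """Fetch a pod object as JSON via kubectl (requires access to a cluster)."""
    out = subprocess.check_output(
        ["kubectl", "get", "pod", name, "-n", namespace, "-o", "json"])
    return json.loads(out)


# Offline illustration with a fabricated pod object:
pod = {"status": {"conditions": [{"type": "PodHasNetwork", "status": "True"}]}}
print(pod_has_network(pod))  # True
```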

### What’s next?

Depending on feedback and adoption, the Kubernetes team plans to push the
reporting of the `PodHasNetwork` condition to Beta in 1.26 or 1.27.

### How can I learn more?

Please check out the
[documentation](/docs/concepts/workloads/pods/pod-lifecycle/) for the
`PodHasNetwork` condition to learn more about it and how it relates to other
pod conditions.

### How to get involved?

This feature is driven by the SIG Node community. Please join us to connect with
the community and share your ideas and feedback around the above feature and
beyond. We look forward to hearing from you!

### Acknowledgements

We want to thank the following people for their insightful and helpful reviews
of the KEP and PRs around this feature: Derek Carr (@derekwaynecarr), Mrunal
Patel (@mrunalp), Dawn Chen (@dchen1107), Qiutong Song (@qiutongs), Ruiwen Zhao
(@ruiwen-zhao), Tim Bannister (@sftim), Danielle Lancashire (@endocrimes) and
Agam Dua (@agamdua).