Extra wrap of the exec commands by virt-probe in liveness/readiness probes in vm/vmi, booted from DataVolume. #11755

Open
romanprog opened this issue Apr 20, 2024 · 2 comments

Comments

@romanprog

What happened:
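Liveness/readiness probes that use an exec command fail when the VM boots from a DataVolume: the command ends up wrapped by virt-probe twice, so the guest agent is asked to run virt-probe inside the guest, where that binary does not exist.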

$ kubectl get vmi
NAME                       AGE   PHASE     IP           NODENAME             READY
vmi-fedora                 15m   Running   10.3.0.127   kube-virt-master-1   False

$ kubectl get pod
NAME                             READY   STATUS    RESTARTS   AGE
virt-launcher-vmi-fedora-6wnrp   1/2     Running   0          14m

$ kubectl describe  pod virt-launcher-vmi-fedora-6wnrp
...
  Warning  Unhealthy       16m   kubelet            Readiness probe failed: {"component":"virt-probe","level":"fatal","msg":"Failed executing the command","pos":"virt-probe.go:71","reason":"rpc error: code = Unknown desc = virError(Code=1, Domain=10, Message='internal error: unable to execute QEMU agent command 'guest-exec': Guest agent command failed, error was 'Failed to execute child process “virt-probe” (No such file or directory)'')","timestamp":"2024-04-20T08:50:58.153726Z"}
panic: Failed executing the command

goroutine 1 [running]:
kubevirt.io/client-go/log.FilteredLogger.Critical({{0x117c9c0, 0xc0001e9650}, {0xfddaf8, 0xa}, 0x0, 0x0, 0x2, 0x2, {0x117d280, 0xc00011c050}}, ...)
  staging/src/kubevirt.io/client-go/log/log.go:334 +0x189
main.main()

What you expected to happen:
The probe was expected to succeed. The same probe works when the volume type is not dataVolume (for example, containerDisk).

How to reproduce it (as minimally and precisely as possible):

  1. Deploy the DataVolume and the VirtualMachine (or VirtualMachineInstance).
  2. View the VMI spec before the DataVolume is provisioned.
$ kubectl get dv
NAME                                       PHASE             PROGRESS   RESTARTS   AGE
probe-fedora-disk                          ImportScheduled   N/A                   66s
$ kubectl get vmi vmi-fedora -o yaml
...
  readinessProbe:
    exec:
      command:
      - virt-probe
      - --domainName
      - demo_vmi-fedora
      - --timeoutSeconds
      - "5"
      - --command
      - cat
      - --
      - /tmp/healthy.txt
    failureThreshold: 10
    initialDelaySeconds: 20
    periodSeconds: 10
    successThreshold: 1
...
  3. View the VMI spec after the DataVolume is provisioned and the VMI has booted.
$ kubectl get dv
NAME                                       PHASE             PROGRESS   RESTARTS   AGE
probe-fedora-disk                          ImportScheduled   N/A                   66s
$ kubectl get vmi vmi-fedora -o yaml
...
  readinessProbe:
    exec:
      command:
      - virt-probe
      - --domainName
      - demo_vmi-fedora
      - --timeoutSeconds
      - "5"
      - --command
      - virt-probe
      - --
      - --domainName
      - demo_vmi-fedora
      - --timeoutSeconds
      - "5"
      - --command
      - cat
      - --
      - /tmp/healthy.txt
    failureThreshold: 10
    initialDelaySeconds: 20
    periodSeconds: 10
    successThreshold: 1
    timeoutSeconds: 5
...
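
The two dumps above differ in exactly one way: the second looks like the first passed through the same wrapping step a second time. The following is a minimal sketch of that effect and of one possible guard against it. It is not KubeVirt's actual code; `wrap_exec_command` and `wrap_exec_command_once` are hypothetical helpers, and only the argument layout is taken from the YAML output in this report.

```python
from typing import List

# Binary name as it appears in the probe commands shown in this report.
PROBE_BINARY = "virt-probe"


def wrap_exec_command(command: List[str], domain_name: str, timeout_seconds: int) -> List[str]:
    """Hypothetical helper: wrap a guest command so it runs through virt-probe."""
    if not command:
        raise ValueError("exec probe command must not be empty")
    return [
        PROBE_BINARY,
        "--domainName", domain_name,
        "--timeoutSeconds", str(timeout_seconds),
        "--command", command[0],
        "--", *command[1:],
    ]


def wrap_exec_command_once(command: List[str], domain_name: str, timeout_seconds: int) -> List[str]:
    """Hypothetical guard: leave the command alone if it is already wrapped."""
    if command and command[0] == PROBE_BINARY:
        return list(command)
    return wrap_exec_command(command, domain_name, timeout_seconds)


original = ["cat", "/tmp/healthy.txt"]

# One pass reproduces the first dump (before the DataVolume is provisioned).
once = wrap_exec_command(original, "demo_vmi-fedora", 5)

# A second, unguarded pass reproduces the second dump: virt-probe wraps virt-probe.
twice = wrap_exec_command(once, "demo_vmi-fedora", 5)

# The guarded variant is idempotent, so a repeated pass changes nothing.
guarded = wrap_exec_command_once(once, "demo_vmi-fedora", 5)

print(once)
print(twice)
print(guarded == once)
```

Checking the first element is only one way to make the step idempotent, and it assumes a user would never legitimately run a guest binary named virt-probe. Where the second wrap actually happens in the DataVolume path is not established by this report.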

Environment:

  • KubeVirt version (use virtctl version):
$ virtctl version
Client Version: version.Info{GitVersion:"v1.2.0", GitCommit:"f26e45d99ac35743fc33d6a121b629e9a9af6b63", GitTreeState:"clean", BuildDate:"2024-03-05T20:34:24Z", GoVersion:"go1.21.5 X:nocoverageredesign", Compiler:"gc", Platform:"linux/amd64"}
  • Kubernetes version (use kubectl version):
Client Version: v1.29.1
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.28.5+k3s1
  • VMI and DataVolume specifications:
apiVersion: kubevirt.io/v1
kind: VirtualMachine
metadata:
  name: fedora-base-img
  namespace: demo
spec:
  running: true
  template:
    spec:
      networks:
      - name: vpc_net_0
        multus:
          default: true
          networkName: default/ovn-demo
      domain:
        devices:
          interfaces:
            - name: vpc_net_0
              bridge: {}
          disks:
          - disk: 
              bus: virtio
            name: root-volume
          - name: cloudinitdisk
            disk:
              bus: virtio
          rng: {}
        cpu:
          cores: 1
        memory:
          guest: 4G
      readinessProbe:
        exec:
          command: ["cat", "/tmp/healthy.txt"]
        failureThreshold: 10
        initialDelaySeconds: 20
        periodSeconds: 10
        timeoutSeconds: 5
      terminationGracePeriodSeconds: 60
      volumes:
      - dataVolume:
          name: fedora-base-img
        name: root-volume
      - name: cloudinitdisk
        cloudInitNoCloud:
          userData: |-
            #cloud-config
            chpasswd: { expire: False }
---
apiVersion: cdi.kubevirt.io/v1beta1
kind: DataVolume
metadata:
  name: fedora-base-img
spec:
  pvc:
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 20G
    storageClassName: local-path
  source:
    registry:
      url: "docker://quay.io/containerdisks/fedora"
  • Cloud provider or hardware configuration: k3s on bare metal servers
@aburdenthehand
Contributor

/sig storage

@aburdenthehand
Contributor

/cc @awels
(Since the sig/storage label didn't seem to notify)
