Workflow stuck at running when failed to load a Git artifact #10045

Closed
3 tasks done
terrytangyuan opened this issue Nov 16, 2022 · 10 comments · Fixed by #10047

Labels
type/bug type/regression Regression from previous behavior (a specific type of bug)

Comments

@terrytangyuan
Member

terrytangyuan commented Nov 16, 2022

Pre-requisites

  • I have double-checked my configuration
  • I can confirm the issue exists when I tested with :latest
  • I'd like to contribute the fix myself (see contributing guide)

What happened/what you expected to happen?

Pod errored:

git                                    0/2     Init:Error   0          88s

But the workflow is still running.

NAME   STATUS    AGE   MESSAGE
git    Running   90s   

  Nodes:
    Git:
      Display Name:    git
      Finished At:     <nil>
      Host Node Name:  k3d-k3s-default-server-0
      Id:              git
      Inputs:
        Artifacts:
          Git:
            Repo:      https://github.com/argoproj/argo-workflows.git
            Revision:  unknown
          Name:        git-repo
          Path:        /tmp/git
      Message:         Error (exit code 1): artifact git-repo failed to load: failed to get resolve revision: reference not found
      Name:            git
      Phase:           Pending
      Progress:        0/1
      Started At:      2022-11-16T20:53:32Z
      Template Name:   git-depth
      Template Scope:  local/git
      Type:            Pod
  Phase:               Running
  Progress:            0/1
  Started At:          2022-11-16T20:53:32Z
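
The mismatch above (an error message on the node while its phase is still Pending and Finished At is unset) can be checked mechanically. The following Python sketch is only an illustration: the dictionary mirrors the field names seen in the controller's status patch (`nodes`, `phase`, `message`, `finishedAt`), while the helper name `find_stuck_nodes` and the set of terminal phases are assumptions made for this example.

```python
from typing import Any, Dict, List

# Assumed set of phases treated as terminal for this check.
TERMINAL_PHASES = {"Succeeded", "Failed", "Error", "Skipped", "Omitted"}


def find_stuck_nodes(status: Dict[str, Any]) -> List[str]:
    """Return IDs of nodes that carry an error message but no terminal phase."""
    stuck = []
    for node_id, node in status.get("nodes", {}).items():
        message = node.get("message") or ""
        phase = node.get("phase", "")
        unfinished = node.get("finishedAt") is None
        if message.startswith("Error") and phase not in TERMINAL_PHASES and unfinished:
            stuck.append(node_id)
    return stuck


# Status shaped like the output above (hypothetical in-memory copy).
status = {
    "phase": "Running",
    "nodes": {
        "git": {
            "phase": "Pending",
            "finishedAt": None,
            "message": "Error (exit code 1): artifact git-repo failed to load: "
                       "failed to get resolve revision: reference not found",
        }
    },
}

print(find_stuck_nodes(status))  # prints ['git']
```

In practice the dictionary could be loaded from the workflow's JSON status; how it is fetched is left out here on purpose.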

Version

latest

Paste a small workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  name: git
spec:
  ttlStrategy:
    secondsAfterCompletion: 3600
  podGC:
    strategy: OnWorkflowSuccess
  serviceAccountName: argo
  entrypoint: git-depth
  templates:
  - name: git-depth
    inputs:
      artifacts:
      - name: git-repo
        path: /tmp/git
        git:
          repo: https://github.com/argoproj/argo-workflows.git
          revision: unknown
    container:
      image: argoproj/argosay:v2
      command: [sh, -c]
      args: ["ls -l"]
      workingDir: /tmp/git

Logs from the workflow controller

kubectl logs -n argo deploy/workflow-controller | grep ${workflow}

time="2022-11-16T20:50:22.254Z" level=info msg="Processing workflow" namespace=argo-test workflow=git
time="2022-11-16T20:50:22.254Z" level=debug msg="Resolving the template" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:50:22.254Z" level=debug msg="Getting the template" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:50:22.254Z" level=debug msg="Getting the template by name" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:50:22.254Z" level=debug msg="Getting the template by name" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:50:22.254Z" level=debug msg="Resolving the template" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (handler)"
time="2022-11-16T20:50:22.254Z" level=debug msg="Getting the template" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (handler)"
time="2022-11-16T20:50:22.254Z" level=debug msg="Getting the template by name" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (handler)"
time="2022-11-16T20:50:22.254Z" level=debug msg="Getting the template by name" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (handler)"
time="2022-11-16T20:50:22.254Z" level=debug msg="Resolving the template" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (handler)"
time="2022-11-16T20:50:22.254Z" level=debug msg="Getting the template" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (handler)"
time="2022-11-16T20:50:22.254Z" level=debug msg="Getting the template by name" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (handler)"
time="2022-11-16T20:50:22.254Z" level=debug msg="Getting the template by name" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (handler)"
time="2022-11-16T20:50:22.254Z" level=debug msg="Resolving the template" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:50:22.254Z" level=debug msg="Getting the template" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:50:22.254Z" level=debug msg="Getting the template by name" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:50:22.261Z" level=info msg="Updated phase  -> Running" namespace=argo-test workflow=git
time="2022-11-16T20:50:22.261Z" level=debug msg="Evaluating node git: template: *v1alpha1.WorkflowStep (git-depth), boundaryID: " namespace=argo-test workflow=git
time="2022-11-16T20:50:22.261Z" level=debug msg="Resolving the template" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:50:22.261Z" level=debug msg="Getting the template" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:50:22.261Z" level=debug msg="Getting the template by name" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:50:22.261Z" level=debug msg="Initializing node git: template: *v1alpha1.WorkflowStep (git-depth), boundaryID: " namespace=argo-test workflow=git
time="2022-11-16T20:50:22.261Z" level=info msg="Pod node git initialized Pending" namespace=argo-test workflow=git
time="2022-11-16T20:50:22.261Z" level=debug msg="Executing node git with container template: git-depth\n" namespace=argo-test workflow=git
time="2022-11-16T20:50:22.261Z" level=debug namespace=argo-test needLocation=false workflow=git
time="2022-11-16T20:50:22.261Z" level=debug msg="Event(v1.ObjectReference{Kind:\"Workflow\", Namespace:\"argo-test\", Name:\"git\", UID:\"e4e9c2fc-753e-4721-9c00-d78fe1a6d1b9\", APIVersion:\"argoproj.io/v1alpha1\", ResourceVersion:\"787\", FieldPath:\"\"}): type: 'Normal' reason: 'WorkflowRunning' Workflow Running"
time="2022-11-16T20:50:22.262Z" level=debug msg="Creating Pod: git (git)" namespace=argo-test workflow=git
time="2022-11-16T20:50:22.267Z" level=info msg="Created pod: git (git)" namespace=argo-test workflow=git
time="2022-11-16T20:50:22.267Z" level=info msg="TaskSet Reconciliation" namespace=argo-test workflow=git
time="2022-11-16T20:50:22.267Z" level=info msg=reconcileAgentPod namespace=argo-test workflow=git
time="2022-11-16T20:50:22.267Z" level=debug msg="Log changes patch: {\"metadata\":{\"annotations\":{\"workflows.argoproj.io/pod-name-format\":\"v2\"},\"labels\":{\"workflows.argoproj.io/phase\":\"Running\"}},\"status\":{\"artifactGCStatus\":{\"notSpecified\":true},\"artifactRepositoryRef\":{\"artifactRepository\":{},\"default\":true},\"nodes\":{\"git\":{\"displayName\":\"git\",\"finishedAt\":null,\"id\":\"git\",\"inputs\":{\"artifacts\":[{\"git\":{\"repo\":\"https://github.com/argoproj/argo-workflows.git\",\"revision\":\"unknown\"},\"name\":\"git-repo\",\"path\":\"/tmp/git\"}]},\"name\":\"git\",\"phase\":\"Pending\",\"startedAt\":\"2022-11-16T20:50:22Z\",\"templateName\":\"git-depth\",\"templateScope\":\"local/git\",\"type\":\"Pod\"}},\"phase\":\"Running\",\"startedAt\":\"2022-11-16T20:50:22Z\"}}"
time="2022-11-16T20:50:22.274Z" level=info msg="Workflow update successful" namespace=argo-test phase=Running resourceVersion=791 workflow=git
time="2022-11-16T20:50:32.269Z" level=info msg="Processing workflow" namespace=argo-test workflow=git
time="2022-11-16T20:50:32.269Z" level=info msg="Task-result reconciliation" namespace=argo-test numObjs=0 workflow=git
time="2022-11-16T20:50:32.269Z" level=info msg="node changed" namespace=argo-test new.message=PodInitializing new.phase=Pending new.progress=0/1 nodeID=git old.message= old.phase=Pending old.progress=0/1 workflow=git
time="2022-11-16T20:50:32.269Z" level=debug msg="Evaluating node git: template: *v1alpha1.WorkflowStep (git-depth), boundaryID: " namespace=argo-test workflow=git
time="2022-11-16T20:50:32.269Z" level=debug msg="Resolving the template" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:50:32.269Z" level=debug msg="Getting the template" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:50:32.269Z" level=debug msg="Getting the template by name" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:50:32.269Z" level=debug msg="Executing node git of Pod is Pending" namespace=argo-test workflow=git
time="2022-11-16T20:50:32.269Z" level=debug msg="Executing node git with container template: git-depth\n" namespace=argo-test workflow=git
time="2022-11-16T20:50:32.269Z" level=debug msg="Skipped pod git (git) creation: already exists" namespace=argo-test podPhase=Pending workflow=git
time="2022-11-16T20:50:32.270Z" level=info msg="TaskSet Reconciliation" namespace=argo-test workflow=git
time="2022-11-16T20:50:32.270Z" level=info msg=reconcileAgentPod namespace=argo-test workflow=git
time="2022-11-16T20:50:32.270Z" level=debug msg="Log changes patch: {\"status\":{\"conditions\":[{\"status\":\"False\",\"type\":\"PodRunning\"}],\"nodes\":{\"git\":{\"hostNodeName\":\"k3d-k3s-default-server-0\",\"message\":\"PodInitializing\"}}}}"
time="2022-11-16T20:50:32.278Z" level=info msg="Workflow update successful" namespace=argo-test phase=Running resourceVersion=803 workflow=git
time="2022-11-16T20:50:42.279Z" level=info msg="Processing workflow" namespace=argo-test workflow=git
time="2022-11-16T20:50:42.280Z" level=info msg="Task-result reconciliation" namespace=argo-test numObjs=0 workflow=git
time="2022-11-16T20:50:42.280Z" level=info msg="node unchanged" namespace=argo-test nodeID=git workflow=git
time="2022-11-16T20:50:42.280Z" level=debug msg="Evaluating node git: template: *v1alpha1.WorkflowStep (git-depth), boundaryID: " namespace=argo-test workflow=git
time="2022-11-16T20:50:42.280Z" level=debug msg="Resolving the template" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:50:42.280Z" level=debug msg="Getting the template" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:50:42.280Z" level=debug msg="Getting the template by name" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:50:42.280Z" level=debug msg="Executing node git of Pod is Pending" namespace=argo-test workflow=git
time="2022-11-16T20:50:42.280Z" level=debug msg="Executing node git with container template: git-depth\n" namespace=argo-test workflow=git
time="2022-11-16T20:50:42.280Z" level=debug msg="Skipped pod git (git) creation: already exists" namespace=argo-test podPhase=Pending workflow=git
time="2022-11-16T20:50:42.280Z" level=info msg="TaskSet Reconciliation" namespace=argo-test workflow=git
time="2022-11-16T20:50:42.280Z" level=info msg=reconcileAgentPod namespace=argo-test workflow=git
time="2022-11-16T20:50:55.066Z" level=info msg="Processing workflow" namespace=argo-test workflow=git
time="2022-11-16T20:50:55.066Z" level=info msg="Task-result reconciliation" namespace=argo-test numObjs=0 workflow=git
time="2022-11-16T20:50:55.066Z" level=info msg="leaving phase un-changed: wait container is not yet terminated " namespace=argo-test new.phase=Error workflow=git
time="2022-11-16T20:50:55.066Z" level=info msg="node changed" namespace=argo-test new.message="Error (exit code 1): artifact git-repo failed to load: failed to get resolve revision: reference not found" new.phase=Pending new.progress=0/1 nodeID=git old.message=PodInitializing old.phase=Pending old.progress=0/1 workflow=git
time="2022-11-16T20:50:55.066Z" level=debug msg="Evaluating node git: template: *v1alpha1.WorkflowStep (git-depth), boundaryID: " namespace=argo-test workflow=git
time="2022-11-16T20:50:55.066Z" level=debug msg="Resolving the template" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:50:55.066Z" level=debug msg="Getting the template" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:50:55.066Z" level=debug msg="Getting the template by name" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:50:55.066Z" level=debug msg="Executing node git of Pod is Pending" namespace=argo-test workflow=git
time="2022-11-16T20:50:55.066Z" level=debug msg="Executing node git with container template: git-depth\n" namespace=argo-test workflow=git
time="2022-11-16T20:50:55.066Z" level=debug msg="Skipped pod git (git) creation: already exists" namespace=argo-test podPhase=Failed workflow=git
time="2022-11-16T20:50:55.066Z" level=info msg="TaskSet Reconciliation" namespace=argo-test workflow=git
time="2022-11-16T20:50:55.066Z" level=info msg=reconcileAgentPod namespace=argo-test workflow=git
time="2022-11-16T20:50:55.066Z" level=debug msg="Log changes patch: {\"status\":{\"nodes\":{\"git\":{\"message\":\"Error (exit code 1): artifact git-repo failed to load: failed to get resolve revision: reference not found\"}}}}"
time="2022-11-16T20:50:55.072Z" level=info msg="Workflow update successful" namespace=argo-test phase=Running resourceVersion=821 workflow=git
time="2022-11-16T20:51:05.073Z" level=info msg="Processing workflow" namespace=argo-test workflow=git
time="2022-11-16T20:51:05.073Z" level=info msg="Task-result reconciliation" namespace=argo-test numObjs=0 workflow=git
time="2022-11-16T20:51:05.073Z" level=info msg="leaving phase un-changed: wait container is not yet terminated " namespace=argo-test new.phase=Error workflow=git
time="2022-11-16T20:51:05.073Z" level=info msg="node unchanged" namespace=argo-test nodeID=git workflow=git
time="2022-11-16T20:51:05.073Z" level=debug msg="Evaluating node git: template: *v1alpha1.WorkflowStep (git-depth), boundaryID: " namespace=argo-test workflow=git
time="2022-11-16T20:51:05.073Z" level=debug msg="Resolving the template" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:51:05.073Z" level=debug msg="Getting the template" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:51:05.073Z" level=debug msg="Getting the template by name" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:51:05.073Z" level=debug msg="Executing node git of Pod is Pending" namespace=argo-test workflow=git
time="2022-11-16T20:51:05.073Z" level=debug msg="Executing node git with container template: git-depth\n" namespace=argo-test workflow=git
time="2022-11-16T20:51:05.074Z" level=debug msg="Skipped pod git (git) creation: already exists" namespace=argo-test podPhase=Failed workflow=git
time="2022-11-16T20:51:05.074Z" level=info msg="TaskSet Reconciliation" namespace=argo-test workflow=git
time="2022-11-16T20:51:05.074Z" level=info msg=reconcileAgentPod namespace=argo-test workflow=git
time="2022-11-16T20:53:32.635Z" level=info msg="Processing workflow" namespace=argo-test workflow=git
time="2022-11-16T20:53:32.635Z" level=debug msg="Resolving the template" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:53:32.635Z" level=debug msg="Getting the template" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:53:32.636Z" level=debug msg="Getting the template by name" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:53:32.636Z" level=debug msg="Getting the template by name" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:53:32.636Z" level=debug msg="Resolving the template" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:53:32.636Z" level=debug msg="Getting the template" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:53:32.636Z" level=debug msg="Getting the template by name" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:53:32.643Z" level=info msg="Updated phase  -> Running" namespace=argo-test workflow=git
time="2022-11-16T20:53:32.643Z" level=debug msg="Evaluating node git: template: *v1alpha1.WorkflowStep (git-depth), boundaryID: " namespace=argo-test workflow=git
time="2022-11-16T20:53:32.643Z" level=debug msg="Resolving the template" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:53:32.643Z" level=debug msg="Getting the template" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:53:32.643Z" level=debug msg="Getting the template by name" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:53:32.643Z" level=debug msg="Initializing node git: template: *v1alpha1.WorkflowStep (git-depth), boundaryID: " namespace=argo-test workflow=git
time="2022-11-16T20:53:32.643Z" level=info msg="Pod node git initialized Pending" namespace=argo-test workflow=git
time="2022-11-16T20:53:32.643Z" level=debug msg="Executing node git with container template: git-depth\n" namespace=argo-test workflow=git
time="2022-11-16T20:53:32.643Z" level=debug namespace=argo-test needLocation=false workflow=git
time="2022-11-16T20:53:32.643Z" level=debug msg="Event(v1.ObjectReference{Kind:\"Workflow\", Namespace:\"argo-test\", Name:\"git\", UID:\"8c38fe99-163d-425c-8332-5d891c2f9f83\", APIVersion:\"argoproj.io/v1alpha1\", ResourceVersion:\"887\", FieldPath:\"\"}): type: 'Normal' reason: 'WorkflowRunning' Workflow Running"
time="2022-11-16T20:53:32.643Z" level=debug msg="Creating Pod: git (git)" namespace=argo-test workflow=git
time="2022-11-16T20:53:32.651Z" level=info msg="Created pod: git (git)" namespace=argo-test workflow=git
time="2022-11-16T20:53:32.651Z" level=info msg="TaskSet Reconciliation" namespace=argo-test workflow=git
time="2022-11-16T20:53:32.651Z" level=info msg=reconcileAgentPod namespace=argo-test workflow=git
time="2022-11-16T20:53:32.651Z" level=debug msg="Log changes patch: {\"metadata\":{\"annotations\":{\"workflows.argoproj.io/pod-name-format\":\"v2\"},\"labels\":{\"workflows.argoproj.io/phase\":\"Running\"}},\"status\":{\"artifactGCStatus\":{\"notSpecified\":true},\"artifactRepositoryRef\":{\"artifactRepository\":{},\"default\":true},\"nodes\":{\"git\":{\"displayName\":\"git\",\"finishedAt\":null,\"id\":\"git\",\"inputs\":{\"artifacts\":[{\"git\":{\"repo\":\"https://github.com/argoproj/argo-workflows.git\",\"revision\":\"unknown\"},\"name\":\"git-repo\",\"path\":\"/tmp/git\"}]},\"name\":\"git\",\"phase\":\"Pending\",\"startedAt\":\"2022-11-16T20:53:32Z\",\"templateName\":\"git-depth\",\"templateScope\":\"local/git\",\"type\":\"Pod\"}},\"phase\":\"Running\",\"startedAt\":\"2022-11-16T20:53:32Z\"}}"
time="2022-11-16T20:53:32.655Z" level=info msg="Workflow update successful" namespace=argo-test phase=Running resourceVersion=890 workflow=git
time="2022-11-16T20:53:40.939Z" level=info msg="Processing workflow" namespace=argo-test workflow=git
time="2022-11-16T20:53:40.940Z" level=info msg="Task-result reconciliation" namespace=argo-test numObjs=0 workflow=git
time="2022-11-16T20:53:40.940Z" level=info msg="node changed" namespace=argo-test new.message=PodInitializing new.phase=Pending new.progress=0/1 nodeID=git old.message= old.phase=Pending old.progress=0/1 workflow=git
time="2022-11-16T20:53:40.940Z" level=debug msg="Evaluating node git: template: *v1alpha1.WorkflowStep (git-depth), boundaryID: " namespace=argo-test workflow=git
time="2022-11-16T20:53:40.940Z" level=debug msg="Resolving the template" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:53:40.940Z" level=debug msg="Getting the template" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:53:40.940Z" level=debug msg="Getting the template by name" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:53:40.940Z" level=debug msg="Executing node git of Pod is Pending" namespace=argo-test workflow=git
time="2022-11-16T20:53:40.940Z" level=debug msg="Executing node git with container template: git-depth\n" namespace=argo-test workflow=git
time="2022-11-16T20:53:40.940Z" level=debug msg="Skipped pod git (git) creation: already exists" namespace=argo-test podPhase=Pending workflow=git
time="2022-11-16T20:53:40.940Z" level=info msg="TaskSet Reconciliation" namespace=argo-test workflow=git
time="2022-11-16T20:53:40.940Z" level=info msg=reconcileAgentPod namespace=argo-test workflow=git
time="2022-11-16T20:53:40.940Z" level=debug msg="Log changes patch: {\"status\":{\"conditions\":[{\"status\":\"False\",\"type\":\"PodRunning\"}],\"nodes\":{\"git\":{\"hostNodeName\":\"k3d-k3s-default-server-0\",\"message\":\"PodInitializing\"}}}}"
time="2022-11-16T20:53:40.950Z" level=info msg="Workflow update successful" namespace=argo-test phase=Running resourceVersion=902 workflow=git
time="2022-11-16T20:53:50.950Z" level=info msg="Processing workflow" namespace=argo-test workflow=git
time="2022-11-16T20:53:50.950Z" level=info msg="Task-result reconciliation" namespace=argo-test numObjs=0 workflow=git
time="2022-11-16T20:53:50.950Z" level=info msg="node unchanged" namespace=argo-test nodeID=git workflow=git
time="2022-11-16T20:53:50.950Z" level=debug msg="Evaluating node git: template: *v1alpha1.WorkflowStep (git-depth), boundaryID: " namespace=argo-test workflow=git
time="2022-11-16T20:53:50.950Z" level=debug msg="Resolving the template" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:53:50.950Z" level=debug msg="Getting the template" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:53:50.950Z" level=debug msg="Getting the template by name" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:53:50.950Z" level=debug msg="Executing node git of Pod is Pending" namespace=argo-test workflow=git
time="2022-11-16T20:53:50.950Z" level=debug msg="Executing node git with container template: git-depth\n" namespace=argo-test workflow=git
time="2022-11-16T20:53:50.950Z" level=debug msg="Skipped pod git (git) creation: already exists" namespace=argo-test podPhase=Pending workflow=git
time="2022-11-16T20:53:50.950Z" level=info msg="TaskSet Reconciliation" namespace=argo-test workflow=git
time="2022-11-16T20:53:50.950Z" level=info msg=reconcileAgentPod namespace=argo-test workflow=git
time="2022-11-16T20:54:04.386Z" level=info msg="Processing workflow" namespace=argo-test workflow=git
time="2022-11-16T20:54:04.386Z" level=info msg="Task-result reconciliation" namespace=argo-test numObjs=0 workflow=git
time="2022-11-16T20:54:04.386Z" level=info msg="leaving phase un-changed: wait container is not yet terminated " namespace=argo-test new.phase=Error workflow=git
time="2022-11-16T20:54:04.386Z" level=info msg="node changed" namespace=argo-test new.message="Error (exit code 1): artifact git-repo failed to load: failed to get resolve revision: reference not found" new.phase=Pending new.progress=0/1 nodeID=git old.message=PodInitializing old.phase=Pending old.progress=0/1 workflow=git
time="2022-11-16T20:54:04.386Z" level=debug msg="Evaluating node git: template: *v1alpha1.WorkflowStep (git-depth), boundaryID: " namespace=argo-test workflow=git
time="2022-11-16T20:54:04.386Z" level=debug msg="Resolving the template" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:54:04.386Z" level=debug msg="Getting the template" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:54:04.386Z" level=debug msg="Getting the template by name" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:54:04.386Z" level=debug msg="Executing node git of Pod is Pending" namespace=argo-test workflow=git
time="2022-11-16T20:54:04.386Z" level=debug msg="Executing node git with container template: git-depth\n" namespace=argo-test workflow=git
time="2022-11-16T20:54:04.386Z" level=debug msg="Skipped pod git (git) creation: already exists" namespace=argo-test podPhase=Failed workflow=git
time="2022-11-16T20:54:04.386Z" level=info msg="TaskSet Reconciliation" namespace=argo-test workflow=git
time="2022-11-16T20:54:04.386Z" level=info msg=reconcileAgentPod namespace=argo-test workflow=git
time="2022-11-16T20:54:04.387Z" level=debug msg="Log changes patch: {\"status\":{\"nodes\":{\"git\":{\"message\":\"Error (exit code 1): artifact git-repo failed to load: failed to get resolve revision: reference not found\"}}}}"
time="2022-11-16T20:54:04.391Z" level=info msg="Workflow update successful" namespace=argo-test phase=Running resourceVersion=914 workflow=git
time="2022-11-16T20:54:14.392Z" level=info msg="Processing workflow" namespace=argo-test workflow=git
time="2022-11-16T20:54:14.392Z" level=info msg="Task-result reconciliation" namespace=argo-test numObjs=0 workflow=git
time="2022-11-16T20:54:14.392Z" level=info msg="leaving phase un-changed: wait container is not yet terminated " namespace=argo-test new.phase=Error workflow=git
time="2022-11-16T20:54:14.392Z" level=info msg="node unchanged" namespace=argo-test nodeID=git workflow=git
time="2022-11-16T20:54:14.392Z" level=debug msg="Evaluating node git: template: *v1alpha1.WorkflowStep (git-depth), boundaryID: " namespace=argo-test workflow=git
time="2022-11-16T20:54:14.392Z" level=debug msg="Resolving the template" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:54:14.392Z" level=debug msg="Getting the template" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:54:14.392Z" level=debug msg="Getting the template by name" base="*v1alpha1.Workflow (namespace=argo-test,name=git)" depth=0 tmpl="*v1alpha1.WorkflowStep (git-depth)"
time="2022-11-16T20:54:14.392Z" level=debug msg="Executing node git of Pod is Pending" namespace=argo-test workflow=git
time="2022-11-16T20:54:14.392Z" level=debug msg="Executing node git with container template: git-depth\n" namespace=argo-test workflow=git
time="2022-11-16T20:54:14.392Z" level=debug msg="Skipped pod git (git) creation: already exists" namespace=argo-test podPhase=Failed workflow=git
time="2022-11-16T20:54:14.392Z" level=info msg="TaskSet Reconciliation" namespace=argo-test workflow=git
time="2022-11-16T20:54:14.392Z" level=info msg=reconcileAgentPod namespace=argo-test workflow=git
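
The repeated message "leaving phase un-changed: wait container is not yet terminated" with new.phase=Error suggests that the controller computes Error for the node but holds that phase back until the wait container has terminated. The pod is already Failed at that point (podPhase=Failed in the log), so the wait container will never start, and the condition can never be met. The Python sketch below is a simplified model inferred from these log messages only, not the controller's Go code. The names `ContainerState`, `next_phase_as_logged`, and `next_phase_with_init_guard` are hypothetical, and the guard is one possible remedy, not necessarily the change made in #10047.

```python
from dataclasses import dataclass
from typing import List, Optional

# Phases after which a node is considered finished (assumption for this model).
FULFILLED = {"Succeeded", "Failed", "Error"}


@dataclass
class ContainerState:
    name: str
    terminated: bool
    exit_code: Optional[int] = None


def next_phase_as_logged(old_phase: str, inferred_phase: str,
                         wait: ContainerState) -> str:
    """Model of the logged behaviour: hold a fulfilled phase until wait terminates."""
    if inferred_phase in FULFILLED and not wait.terminated:
        return old_phase  # "leaving phase un-changed"
    return inferred_phase


def next_phase_with_init_guard(old_phase: str, inferred_phase: str,
                               wait: ContainerState,
                               init_containers: List[ContainerState]) -> str:
    """Hypothetical guard: a failed init container means wait will never start."""
    init_failed = any(c.terminated and c.exit_code not in (0, None)
                      for c in init_containers)
    if inferred_phase in FULFILLED and not wait.terminated and not init_failed:
        return old_phase
    return inferred_phase


init = ContainerState("init", terminated=True, exit_code=1)
wait = ContainerState("wait", terminated=False)

print(next_phase_as_logged("Pending", "Error", wait))                # prints Pending
print(next_phase_with_init_guard("Pending", "Error", wait, [init]))  # prints Error
```

Under this model the first function reproduces the stuck Pending node on every reconciliation, while the guarded variant lets the Error phase through as soon as the init container has exited non-zero.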

Logs from your workflow's wait container

kubectl logs -n argo -c wait -l workflows.argoproj.io/workflow=${workflow},workflows.argoproj.io/phase!=Succeeded

terrytangyuan added the type/bug and type/regression labels Nov 16, 2022
@terrytangyuan
Member Author

This worked in v3.4.0 but fails in v3.4.2 and later.

@terrytangyuan
Member Author

Only the init container terminated:

Init Containers:
  init:
    Container ID:  containerd://e606d85bc8fd14756203a9bb61a0323a861a68456400a37f8375ffe782a64902
    Image:         argoproj/argoexec:test-oss-1
    Image ID:      sha256:85a96b0a84a070d4d6deb1c8d92c7c88e2fcbc8313a6a62b074de0c68555f30e
    Port:          <none>
    Host Port:     <none>
    Command:
      argoexec
      init
      --loglevel
      debug
      --log-format
      text
    State:          Terminated
      Reason:       Error
      Message:      artifact git-repo failed to load: failed to get resolve revision: reference not found
      Exit Code:    1
      Started:      Wed, 16 Nov 2022 15:53:33 -0500
      Finished:     Wed, 16 Nov 2022 15:53:53 -0500
    Ready:          False
    Restart Count:  0
    Environment:
      ARGO_POD_NAME:                      git (v1:metadata.name)
      ARGO_POD_UID:                        (v1:metadata.uid)
      GODEBUG:                            x509ignoreCN=0
      ARGO_WORKFLOW_NAME:                 git
      ARGO_CONTAINER_NAME:                init
      ARGO_TEMPLATE:                      {"name":"git-depth","inputs":{"artifacts":[{"name":"git-repo","path":"/tmp/git","git":{"repo":"https://github.com/argoproj/argo-workflows.git","revision":"unknown"}}]},"outputs":{},"metadata":{},"container":{"name":"","image":"argoproj/argosay:v2","command":["sh","-c"],"args":["ls -l"],"workingDir":"/tmp/git","resources":{}}}
      ARGO_NODE_ID:                       git
      ARGO_INCLUDE_SCRIPT_OUTPUT:         false
      ARGO_DEADLINE:                      0001-01-01T00:00:00Z
      ARGO_PROGRESS_FILE:                 /var/run/argo/progress
      ARGO_PROGRESS_PATCH_TICK_DURATION:  1m0s
      ARGO_PROGRESS_FILE_TICK_DURATION:   3s
    Mounts:
      /argo/inputs/artifacts from input-artifacts (rw)
      /var/run/argo from var-run-argo (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-xqpvv (ro)
Containers:
  wait:
    Container ID:  
    Image:         argoproj/argoexec:test-oss-1
    Image ID:      
    Port:          <none>
    Host Port:     <none>
    Command:
      argoexec
      wait
      --loglevel
      debug
      --log-format
      text
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Environment:
      ARGO_POD_NAME:                      git (v1:metadata.name)
      ARGO_POD_UID:                        (v1:metadata.uid)
      GODEBUG:                            x509ignoreCN=0
      ARGO_WORKFLOW_NAME:                 git
      ARGO_CONTAINER_NAME:                wait
      ARGO_TEMPLATE:                      {"name":"git-depth","inputs":{"artifacts":[{"name":"git-repo","path":"/tmp/git","git":{"repo":"https://github.com/argoproj/argo-workflows.git","revision":"unknown"}}]},"outputs":{},"metadata":{},"container":{"name":"","image":"argoproj/argosay:v2","command":["sh","-c"],"args":["ls -l"],"workingDir":"/tmp/git","resources":{}}}
      ARGO_NODE_ID:                       git
      ARGO_INCLUDE_SCRIPT_OUTPUT:         false
      ARGO_DEADLINE:                      0001-01-01T00:00:00Z
      ARGO_PROGRESS_FILE:                 /var/run/argo/progress
      ARGO_PROGRESS_PATCH_TICK_DURATION:  1m0s
      ARGO_PROGRESS_FILE_TICK_DURATION:   3s
    Mounts:
      /mainctrfs/tmp/git from input-artifacts (rw,path="git-repo")
      /tmp from tmp-dir-argo (rw,path="0")
      /var/run/argo from var-run-argo (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-xqpvv (ro)
  main:
    Container ID:  
    Image:         argoproj/argosay:v2
    Image ID:      
    Port:          <none>
    Host Port:     <none>
    Command:
      /var/run/argo/argoexec
      emissary
      --loglevel
      debug
      --log-format
      text
      --
      sh
      -c
    Args:
      ls -l
    State:          Waiting
      Reason:       PodInitializing
    Ready:          False
    Restart Count:  0
    Environment:
      ARGO_CONTAINER_NAME:                main
      ARGO_TEMPLATE:                      {"name":"git-depth","inputs":{"artifacts":[{"name":"git-repo","path":"/tmp/git","git":{"repo":"https://github.com/argoproj/argo-workflows.git","revision":"unknown"}}]},"outputs":{},"metadata":{},"container":{"name":"","image":"argoproj/argosay:v2","command":["sh","-c"],"args":["ls -l"],"workingDir":"/tmp/git","resources":{}}}
      ARGO_NODE_ID:                       git
      ARGO_INCLUDE_SCRIPT_OUTPUT:         false
      ARGO_DEADLINE:                      0001-01-01T00:00:00Z
      ARGO_PROGRESS_FILE:                 /var/run/argo/progress
      ARGO_PROGRESS_PATCH_TICK_DURATION:  1m0s
      ARGO_PROGRESS_FILE_TICK_DURATION:   3s
    Mounts:
      /tmp/git from input-artifacts (rw,path="git-repo")
      /var/run/argo from var-run-argo (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-xqpvv (ro)

terrytangyuan added a commit to terrytangyuan/argo-workflows that referenced this issue Nov 16, 2022
terrytangyuan added a commit to terrytangyuan/argo-workflows that referenced this issue Nov 17, 2022
…ixes argoproj#10045

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
terrytangyuan added a commit that referenced this issue Nov 20, 2022
…ixes #10045 (#10047)

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
sarabala1979 pushed a commit that referenced this issue Nov 29, 2022
…ixes #10045 (#10047)

Signed-off-by: Yuan Tang <terrytangyuan@gmail.com>
Signed-off-by: Saravanan Balasubramanian <sarabala1979@gmail.com>
@arnaud-soulie

Hi @terrytangyuan ,

I have a simple use case that seems to be linked to this issue, and I still see it on v3.4.5.
The following workflow raises an error in the initContainer:

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  name: test
spec:
  ttlStrategy:
    secondsAfterCompletion: 3600
  podGC:
    strategy: OnWorkflowSuccess
      # serviceAccountName: argo
  entrypoint: test
  templates:
  - name: test
    initContainers:
      - name: inittest
        image: argoproj/argosay:v2
        command:
          - bash
          - '-c'
          - ' . /dev/stdin <<< "return 1" '
    container:
      image: argoproj/argosay:v2
      command: [sh, -c]
      args: ["ls -l"]

I would expect the Workflow to stop and switch to error state here.

On two different clusters (k8s v1.25.1 and v1.25.7), my workflow is stuck in the Running state although the Pod is in the Init:Error state.
This happens when Argo Workflows is deployed using the default official Helm chart (controller image: quay.io/argoproj/workflow-controller@sha256:1d7d9691080d18f066a4ccdbd3138eb735d3da1172561afa5762e58e30f9ba85).

But if I build the controller myself using the Development Container as explained in the docs, on the v3.4.5 or latest tag, my workflow switches to Error as expected.

With the Helm chart, I never see the "marking node as failed since init container has non-zero exit code" log message.
It looks like the pod.Status.InitContainerStatuses that the controller ranges over is not correctly updated.
Any ideas?

Thank you.
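
For readers following along, the kind of check being discussed (ranging over the init container statuses and looking for a non-zero exit code) can be sketched roughly as below. This is only an illustration, not the controller's actual code: the types are minimal local stand-ins for the Kubernetes core/v1 status types, `firstFailedInit` is a hypothetical helper name, and the sample values (Argo's own `init` container succeeding, `inittest` exiting with 1) are assumed for the example.

```go
package main

import "fmt"

// Minimal stand-ins for the Kubernetes core/v1 status types.
// Only the fields needed for this illustration are included.
type ContainerStateTerminated struct {
	ExitCode int32
	Reason   string
}

type ContainerState struct {
	Terminated *ContainerStateTerminated // nil while the container is waiting or running
}

type ContainerStatus struct {
	Name  string
	State ContainerState
}

// firstFailedInit is a hypothetical helper. It returns the name of the first
// init container that has terminated with a non-zero exit code.
func firstFailedInit(statuses []ContainerStatus) (string, bool) {
	for _, s := range statuses {
		if t := s.State.Terminated; t != nil && t.ExitCode != 0 {
			return s.Name, true
		}
	}
	return "", false
}

func main() {
	// Assumed sample values: "init" succeeded, "inittest" exited with 1.
	statuses := []ContainerStatus{
		{Name: "init", State: ContainerState{Terminated: &ContainerStateTerminated{ExitCode: 0, Reason: "Completed"}}},
		{Name: "inittest", State: ContainerState{Terminated: &ContainerStateTerminated{ExitCode: 1, Reason: "Error"}}},
	}
	if name, failed := firstFailedInit(statuses); failed {
		fmt.Printf("init container %q exited non-zero; the node would be expected to fail\n", name)
	}
}
```

If a check like this runs against a stale copy of the pod, the terminated state would simply not be there yet, which is one way the symptom described above could arise.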

@terrytangyuan
Member Author

Could you paste the output of kubectl get pod <name> -o yaml?

@arnaud-soulie

Here it is:

apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubectl.kubernetes.io/default-container: main
    workflows.argoproj.io/node-id: test
    workflows.argoproj.io/node-name: test
  creationTimestamp: "2023-03-23T09:24:41Z"
  labels:
    workflows.argoproj.io/completed: "false"
    workflows.argoproj.io/workflow: test
  name: test
  namespace: default
  ownerReferences:
  - apiVersion: argoproj.io/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: Workflow
    name: test
    uid: 2c657af4-e829-43bf-947c-49fae9f8c5ea
  resourceVersion: "4829"
  uid: 359b538c-9901-46ed-aefe-a18ffac9bd58
spec:
  containers:
  - command:
    - argoexec
    - wait
    - --loglevel
    - info
    - --log-format
    - text
    env:
    - name: ARGO_POD_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.name
    - name: ARGO_POD_UID
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.uid
    - name: GODEBUG
      value: x509ignoreCN=0
    - name: ARGO_WORKFLOW_NAME
      value: test
    - name: ARGO_CONTAINER_NAME
      value: wait
    - name: ARGO_TEMPLATE
      value: '{"name":"test","inputs":{},"outputs":{},"metadata":{},"container":{"name":"","image":"argoproj/argosay:v2","command":["sh","-c"],"args":["ls
        -l"],"resources":{}},"initContainers":[{"name":"inittest","image":"argoproj/argosay:v2","command":["bash","-c","
        . /dev/stdin \u003c\u003c\u003c \"return 1\" "],"resources":{}}]}'
    - name: ARGO_NODE_ID
      value: test
    - name: ARGO_INCLUDE_SCRIPT_OUTPUT
      value: "false"
    - name: ARGO_DEADLINE
      value: "0001-01-01T00:00:00Z"
    - name: ARGO_PROGRESS_FILE
      value: /var/run/argo/progress
    - name: ARGO_PROGRESS_PATCH_TICK_DURATION
      value: 1m0s
    - name: ARGO_PROGRESS_FILE_TICK_DURATION
      value: 3s
    image: quay.io/argoproj/argoexec:v3.4.5
    imagePullPolicy: IfNotPresent
    name: wait
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /tmp
      name: tmp-dir-argo
      subPath: "0"
    - mountPath: /var/run/argo
      name: var-run-argo
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-dvm7v
      readOnly: true
  - args:
    - ls -l
    command:
    - /var/run/argo/argoexec
    - emissary
    - --loglevel
    - info
    - --log-format
    - text
    - --
    - sh
    - -c
    env:
    - name: ARGO_CONTAINER_NAME
      value: main
    - name: ARGO_TEMPLATE
      value: '{"name":"test","inputs":{},"outputs":{},"metadata":{},"container":{"name":"","image":"argoproj/argosay:v2","command":["sh","-c"],"args":["ls
        -l"],"resources":{}},"initContainers":[{"name":"inittest","image":"argoproj/argosay:v2","command":["bash","-c","
        . /dev/stdin \u003c\u003c\u003c \"return 1\" "],"resources":{}}]}'
    - name: ARGO_NODE_ID
      value: test
    - name: ARGO_INCLUDE_SCRIPT_OUTPUT
      value: "false"
    - name: ARGO_DEADLINE
      value: "0001-01-01T00:00:00Z"
    - name: ARGO_PROGRESS_FILE
      value: /var/run/argo/progress
    - name: ARGO_PROGRESS_PATCH_TICK_DURATION
      value: 1m0s
    - name: ARGO_PROGRESS_FILE_TICK_DURATION
      value: 3s
    image: argoproj/argosay:v2
    imagePullPolicy: IfNotPresent
    name: main
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/argo
      name: var-run-argo
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-dvm7v
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  initContainers:
  - command:
    - argoexec
    - init
    - --loglevel
    - info
    - --log-format
    - text
    env:
    - name: ARGO_POD_NAME
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.name
    - name: ARGO_POD_UID
      valueFrom:
        fieldRef:
          apiVersion: v1
          fieldPath: metadata.uid
    - name: GODEBUG
      value: x509ignoreCN=0
    - name: ARGO_WORKFLOW_NAME
      value: test
    - name: ARGO_CONTAINER_NAME
      value: init
    - name: ARGO_TEMPLATE
      value: '{"name":"test","inputs":{},"outputs":{},"metadata":{},"container":{"name":"","image":"argoproj/argosay:v2","command":["sh","-c"],"args":["ls
        -l"],"resources":{}},"initContainers":[{"name":"inittest","image":"argoproj/argosay:v2","command":["bash","-c","
        . /dev/stdin \u003c\u003c\u003c \"return 1\" "],"resources":{}}]}'
    - name: ARGO_NODE_ID
      value: test
    - name: ARGO_INCLUDE_SCRIPT_OUTPUT
      value: "false"
    - name: ARGO_DEADLINE
      value: "0001-01-01T00:00:00Z"
    - name: ARGO_PROGRESS_FILE
      value: /var/run/argo/progress
    - name: ARGO_PROGRESS_PATCH_TICK_DURATION
      value: 1m0s
    - name: ARGO_PROGRESS_FILE_TICK_DURATION
      value: 3s
    image: quay.io/argoproj/argoexec:v3.4.5
    imagePullPolicy: IfNotPresent
    name: init
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/argo
      name: var-run-argo
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-dvm7v
      readOnly: true
  - command:
    - bash
    - -c
    - ' . /dev/stdin <<< "return 1" '
    env:
    - name: ARGO_CONTAINER_NAME
      value: inittest
    - name: ARGO_TEMPLATE
      value: '{"name":"test","inputs":{},"outputs":{},"metadata":{},"container":{"name":"","image":"argoproj/argosay:v2","command":["sh","-c"],"args":["ls
        -l"],"resources":{}},"initContainers":[{"name":"inittest","image":"argoproj/argosay:v2","command":["bash","-c","
        . /dev/stdin \u003c\u003c\u003c \"return 1\" "],"resources":{}}]}'
    - name: ARGO_NODE_ID
      value: test
    - name: ARGO_INCLUDE_SCRIPT_OUTPUT
      value: "false"
    - name: ARGO_DEADLINE
      value: "0001-01-01T00:00:00Z"
    - name: ARGO_PROGRESS_FILE
      value: /var/run/argo/progress
    - name: ARGO_PROGRESS_PATCH_TICK_DURATION
      value: 1m0s
    - name: ARGO_PROGRESS_FILE_TICK_DURATION
      value: 3s
    image: argoproj/argosay:v2
    imagePullPolicy: IfNotPresent
    name: inittest
    resources: {}
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /var/run/argo
      name: var-run-argo
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-dvm7v
      readOnly: true
  nodeName: desktop-nin3a8j
  preemptionPolicy: PreemptLowerPriority
  priority: 0
  restartPolicy: Never
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: default
  serviceAccountName: default
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 300
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 300
  volumes:
  - emptyDir: {}
    name: var-run-argo
  - emptyDir: {}
    name: tmp-dir-argo
  - name: kube-api-access-dvm7v
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2023-03-23T09:24:41Z"
    message: 'containers with incomplete status: [inittest]'
    reason: ContainersNotInitialized
    status: "False"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2023-03-23T09:24:41Z"
    reason: PodFailed
    status: "False"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2023-03-23T09:24:41Z"
    reason: PodFailed
    status: "False"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2023-03-23T09:24:41Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - image: argoproj/argosay:v2
    imageID: ""
    lastState: {}
    name: main
    ready: false
    restartCount: 0
    started: false
    state:
      waiting:
        reason: PodInitializing
  - image: quay.io/argoproj/argoexec:v3.4.5
    imageID: ""
    lastState: {}
    name: wait
    ready: false
    restartCount: 0
    started: false
    state:
      waiting:
        reason: PodInitializing
  hostIP: 172.17.34.6
  initContainerStatuses:
  - containerID: containerd://bf6d30631954c3787a029a0aed4b921a6c8cbe2da83fd69341b5fc7cf335599f
    image: quay.io/argoproj/argoexec:v3.4.5
    imageID: quay.io/argoproj/argoexec@sha256:3f2ba3b355eccc61ec0e3b5c9fc547f4e9ff5821bf6f08485a6a5fc99bf7731b
    lastState: {}
    name: init
    ready: true
    restartCount: 0
    state:
      terminated:
        containerID: containerd://bf6d30631954c3787a029a0aed4b921a6c8cbe2da83fd69341b5fc7cf335599f
        exitCode: 0
        finishedAt: "2023-03-23T09:24:43Z"
        reason: Completed
        startedAt: "2023-03-23T09:24:42Z"
  - containerID: containerd://b168f2dbf466a58bdbf17fd4d3e42ba92ba6728ac35180a20e5ff4ac8a2d39cf
    image: docker.io/argoproj/argosay:v2
    imageID: docker.io/argoproj/argosay@sha256:f0b51e5d3a394af492de5ae50af93f11ba0862480d0eb2cbb48286fe58745303
    lastState: {}
    name: inittest
    ready: false
    restartCount: 0
    state:
      terminated:
        containerID: containerd://b168f2dbf466a58bdbf17fd4d3e42ba92ba6728ac35180a20e5ff4ac8a2d39cf
        exitCode: 1
        finishedAt: "2023-03-23T09:24:44Z"
        reason: Error
        startedAt: "2023-03-23T09:24:44Z"
  phase: Failed
  podIP: 10.42.0.17
  podIPs:
  - ip: 10.42.0.17
  qosClass: BestEffort
  startTime: "2023-03-23T09:24:41Z"
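
The status section above already carries the relevant facts: the pod phase is Failed, `init` terminated with exit code 0, and `inittest` terminated with exit code 1. As a rough sketch of how to pull just those fields out of the JSON form of the same object (for example from `kubectl get pod <name> -o json`), the program below decodes a trimmed-down sample. The `podStatus` struct and the `summarize` helper are hypothetical names for this example, and the embedded sample is a hand-reduced copy of the status above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// podStatus maps only the fields of interest from the pod JSON.
type podStatus struct {
	Status struct {
		Phase                 string `json:"phase"`
		InitContainerStatuses []struct {
			Name  string `json:"name"`
			State struct {
				Terminated *struct {
					ExitCode int    `json:"exitCode"`
					Reason   string `json:"reason"`
				} `json:"terminated"`
			} `json:"state"`
		} `json:"initContainerStatuses"`
	} `json:"status"`
}

// summarize is a hypothetical helper. It returns the pod phase followed by
// one line per init container that has terminated.
func summarize(raw []byte) ([]string, error) {
	var p podStatus
	if err := json.Unmarshal(raw, &p); err != nil {
		return nil, err
	}
	lines := []string{"phase=" + p.Status.Phase}
	for _, s := range p.Status.InitContainerStatuses {
		if t := s.State.Terminated; t != nil {
			lines = append(lines, fmt.Sprintf("%s exit=%d reason=%s", s.Name, t.ExitCode, t.Reason))
		}
	}
	return lines, nil
}

// sample is a hand-reduced JSON rendering of the status shown above.
const sample = `{"status":{"phase":"Failed","initContainerStatuses":[
  {"name":"init","state":{"terminated":{"exitCode":0,"reason":"Completed"}}},
  {"name":"inittest","state":{"terminated":{"exitCode":1,"reason":"Error"}}}]}}`

func main() {
	lines, err := summarize([]byte(sample))
	if err != nil {
		panic(err)
	}
	for _, l := range lines {
		fmt.Println(l)
	}
}
```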

@terrytangyuan
Member Author

> This happens when ArgoWorkflow is deployed using the default official helm chart (controller image: quay.io/argoproj/workflow-controller@sha256:1d7d9691080d18f066a4ccdbd3138eb735d3da1172561afa5762e58e30f9ba85).
>
> But if I build the controller myself using the Development Container as explained in the doc, on the v3.4.5 or latest tag, my workflow switches to Error as expected.

I think the Helm Chart might not be using the image that includes this fix.

@arnaud-soulie

Strange, since it is using the latest official v3.4.5 image. The sha256 looks OK:
image

This is the latest Argo release.

@capacman

In my case this occurs randomly. If the workflow transitions to Failed, I see "marking node as failed since init container has non-zero exit code" in the logs, but if the logs contain "Pod failed before main container starts", it stays at Running. I also tested with the latest images yesterday, but it got stuck at Running again.
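
To check which of the two messages shows up for a given run, a small helper along these lines could be run over saved controller logs. It is only a triage sketch: `countMessages` is a hypothetical name, the two message strings are the ones quoted in this comment, and the second sample log line uses an assumed format.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// The two controller messages quoted in this thread.
const (
	msgInitFailed = "marking node as failed since init container has non-zero exit code"
	msgBeforeMain = "Pod failed before main container starts"
)

// countMessages is a hypothetical helper. It counts how many log lines
// contain each of the two messages.
func countMessages(logs string) (initFailed, beforeMain int) {
	sc := bufio.NewScanner(strings.NewReader(logs))
	for sc.Scan() {
		line := sc.Text()
		switch {
		case strings.Contains(line, msgInitFailed):
			initFailed++
		case strings.Contains(line, msgBeforeMain):
			beforeMain++
		}
	}
	return initFailed, beforeMain
}

func main() {
	// The second line below is an assumed format, used only for illustration.
	logs := `level=info msg="marking node as failed since init container has non-zero exit code" workflow=git
level=info msg="Pod failed before main container starts" workflow=init-fail`
	a, b := countMessages(logs)
	fmt.Printf("init-failed=%d before-main=%d\n", a, b)
}
```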

@arnaud-soulie

I can also add this: I am sure that the fix is present in this image:

  • I tested with your own workflow (with the Git artifact) and the behaviour is OK: the workflow fails as expected, and I see the new log line time="2023-03-23T14:40:02.427Z" level=info msg="marking node as failed since init container has non-zero exit code" namespace=default new.phase=Failed workflow=git on the controller.
  • Still, I have the issue with my own simple workflow that raises an initContainer error.

@capacman

Currently I am using this workflow to test, and it gets stuck at Pending:

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  name: init-fail
spec:
  entrypoint: init-container-example
  templates:
  - name: init-container-example
    container:
      image: alpine:latest
      command: ["echo", "bye"]
      volumeMounts:
      - name: foo
        mountPath: /foo
    initContainers:
    - name: hello
      image: alpine:latest
      command: ["abcd"]
      mirrorVolumeMounts: true
  volumes:
    - name: foo
      emptyDir: {}
