symlink not preserved in tar input artifacts #9948

Closed
3 tasks done
swhite24 opened this issue Nov 2, 2022 · 1 comment · Fixed by #9949
Labels
area/artifacts S3/GCP/OSS/Git/HDFS etc type/bug

Comments

@swhite24
Contributor

swhite24 commented Nov 2, 2022

Pre-requisites

  • I have double-checked my configuration
  • I can confirm the issue exists when tested with :latest
  • I'd like to contribute the fix myself (see contributing guide)

What happened/what you expected to happen?

Starting with v3.4.x, symlinks are missing when a step consumes an input artifact that should contain them. Versions before v3.4.x (tested with v3.3.9 and a few others) preserve the symlinks.

Here's example output from the provided workflow:

create-artifact

total 4
-rw-rw-rw-    1 root     root             6 Nov  2 14:03 hello.txt
lrwxrwxrwx    1 root     root            16 Nov  2 14:03 link.txt -> output/hello.txt
time="2022-11-02T14:03:42.738Z" level=info msg="sub-process exited" argo=true error="<nil>"
time="2022-11-02T14:03:42.738Z" level=info msg="/app/output -> /var/run/argo/outputs/artifacts/app/output.tgz" argo=true
time="2022-11-02T14:03:42.738Z" level=info msg="Taring /app/output"
time="2022-11-02T14:03:42.738Z" level=info msg="archived 3 files/dirs in /app/output"

consume-artifact

total 4
-rw-rw-rw-    1 root     root             6 Nov  2 14:03 hello.txt
time="2022-11-02T14:03:52.701Z" level=info msg="sub-process exited" argo=true error="<nil>"
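
The "archived 3 files/dirs" line in the create-artifact output suggests the link did make it into output.tgz, which would place the loss on the extraction side. One way to check is to list the entry types in the archive. The following is a minimal sketch, not Argo Workflows code: describeEntries and sampleArchive are hypothetical helpers, and the entry names in the sample archive are assumptions modelled on the output above.

```go
package main

import (
	"archive/tar"
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// describeEntries (hypothetical helper) reads a gzip-compressed tar stream
// and returns one line per entry: its type, name and, for symlinks, target.
func describeEntries(r io.Reader) ([]string, error) {
	gzr, err := gzip.NewReader(r)
	if err != nil {
		return nil, err
	}
	defer gzr.Close()
	tr := tar.NewReader(gzr)
	var lines []string
	for {
		h, err := tr.Next()
		if err == io.EOF {
			return lines, nil
		}
		if err != nil {
			return nil, err
		}
		switch h.Typeflag {
		case tar.TypeDir:
			lines = append(lines, "dir "+h.Name)
		case tar.TypeReg:
			lines = append(lines, "file "+h.Name)
		case tar.TypeSymlink:
			lines = append(lines, "symlink "+h.Name+" -> "+h.Linkname)
		default:
			lines = append(lines, "other "+h.Name)
		}
	}
}

// sampleArchive (hypothetical helper) builds an in-memory .tgz with a
// directory, a regular file and a symlink. The names are assumptions.
func sampleArchive() (*bytes.Buffer, error) {
	var buf bytes.Buffer
	gzw := gzip.NewWriter(&buf)
	tw := tar.NewWriter(gzw)
	body := []byte("hello\n")
	headers := []*tar.Header{
		{Name: "output/", Typeflag: tar.TypeDir, Mode: 0o755},
		{Name: "output/hello.txt", Typeflag: tar.TypeReg, Mode: 0o666, Size: int64(len(body))},
		{Name: "output/link.txt", Typeflag: tar.TypeSymlink, Linkname: "output/hello.txt", Mode: 0o777},
	}
	for _, h := range headers {
		if err := tw.WriteHeader(h); err != nil {
			return nil, err
		}
		if h.Typeflag == tar.TypeReg {
			if _, err := tw.Write(body); err != nil {
				return nil, err
			}
		}
	}
	// Close the tar writer first, then the gzip writer, to flush both layers.
	if err := tw.Close(); err != nil {
		return nil, err
	}
	if err := gzw.Close(); err != nil {
		return nil, err
	}
	return &buf, nil
}

func main() {
	archive, err := sampleArchive()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	lines, err := describeEntries(archive)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, line := range lines {
		fmt.Println(line)
	}
}
```

To inspect a real artifact, the same function could be pointed at an opened output.tgz instead of the in-memory sample.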

Version

v3.4.3

Paste a small workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: symlink-example-
spec:
  entrypoint: main
  templates:
    - name: main
      dag:
        tasks:
          - name: create-artifact
            template: create-artifact
          - arguments:
              artifacts:
                - from: "{{tasks.create-artifact.outputs.artifacts.output}}"
                  name: output
            depends: create-artifact
            name: consume-artifact
            template: consume-artifact
    - name: create-artifact
      script:
        image: alpine:latest
        command: [sh]
        source: |
          mkdir output
          echo "hello" > output/hello.txt
          ln -s output/hello.txt output/link.txt
          ls -al output
        workingDir: /app
      outputs:
        artifacts:
          - name: output
            path: /app/output
    - name: consume-artifact
      script:
        image: alpine:latest
        command: [sh]
        source: |
          ls -al output
        workingDir: /app
      inputs:
        artifacts:
          - name: output
            path: /app/output

Logs from the workflow controller

time="2022-11-02T14:30:12.445Z" level=info msg="Processing workflow" namespace=tangram-platform workflow=symlink-example-s497w
time="2022-11-02T14:30:12.451Z" level=info msg="Updated phase -> Running" namespace=tangram-platform workflow=symlink-example-s497w
time="2022-11-02T14:30:12.452Z" level=info msg="DAG node symlink-example-s497w initialized Running" namespace=tangram-platform workflow=symlink-example-s497w
time="2022-11-02T14:30:12.452Z" level=info msg="All of node symlink-example-s497w.create-artifact dependencies [] completed" namespace=tangram-platform workflow=symlink-example-s497w
time="2022-11-02T14:30:12.453Z" level=info msg="Pod node symlink-example-s497w-4036058673 initialized Pending" namespace=tangram-platform workflow=symlink-example-s497w
time="2022-11-02T14:30:12.468Z" level=info msg="Created pod: symlink-example-s497w.create-artifact (symlink-example-s497w-4036058673)" namespace=tangram-platform workflow=symlink-example-s497w
time="2022-11-02T14:30:12.482Z" level=info msg="TaskSet Reconciliation" namespace=tangram-platform workflow=symlink-example-s497w
time="2022-11-02T14:30:12.482Z" level=info msg=reconcileAgentPod namespace=tangram-platform workflow=symlink-example-s497w
time="2022-11-02T14:30:12.489Z" level=info msg="Workflow update successful" namespace=tangram-platform phase=Running resourceVersion=26033 workflow=symlink-example-s497w
time="2022-11-02T14:30:22.469Z" level=info msg="Processing workflow" namespace=tangram-platform workflow=symlink-example-s497w
time="2022-11-02T14:30:22.469Z" level=info msg="Task-result reconciliation" namespace=tangram-platform numObjs=1 workflow=symlink-example-s497w
time="2022-11-02T14:30:22.470Z" level=info msg="task-result changed" namespace=tangram-platform nodeID=symlink-example-s497w-4036058673 workflow=symlink-example-s497w
time="2022-11-02T14:30:22.470Z" level=info msg="node changed" namespace=tangram-platform new.message= new.phase=Succeeded new.progress=0/1 nodeID=symlink-example-s497w-4036058673 old.message= old.phase=Pending old.progress=0/1 workflow=symlink-example-s497w
time="2022-11-02T14:30:22.470Z" level=info msg="All of node symlink-example-s497w.consume-artifact dependencies [create-artifact] completed" namespace=tangram-platform workflow=symlink-example-s497w
time="2022-11-02T14:30:22.470Z" level=info msg="Pod node symlink-example-s497w-2032903531 initialized Pending" namespace=tangram-platform workflow=symlink-example-s497w
time="2022-11-02T14:30:22.482Z" level=info msg="Created pod: symlink-example-s497w.consume-artifact (symlink-example-s497w-2032903531)" namespace=tangram-platform workflow=symlink-example-s497w
time="2022-11-02T14:30:22.482Z" level=info msg="TaskSet Reconciliation" namespace=tangram-platform workflow=symlink-example-s497w
time="2022-11-02T14:30:22.482Z" level=info msg=reconcileAgentPod namespace=tangram-platform workflow=symlink-example-s497w
time="2022-11-02T14:30:22.489Z" level=info msg="Workflow update successful" namespace=tangram-platform phase=Running resourceVersion=26148 workflow=symlink-example-s497w
time="2022-11-02T14:30:22.494Z" level=info msg="cleaning up pod" action=labelPodCompleted key=tangram-platform/symlink-example-s497w-4036058673/labelPodCompleted
time="2022-11-02T14:30:32.464Z" level=info msg="Processing workflow" namespace=tangram-platform workflow=symlink-example-s497w
time="2022-11-02T14:30:32.464Z" level=info msg="Task-result reconciliation" namespace=tangram-platform numObjs=1 workflow=symlink-example-s497w
time="2022-11-02T14:30:32.465Z" level=info msg="node changed" namespace=tangram-platform new.message= new.phase=Succeeded new.progress=0/1 nodeID=symlink-example-s497w-2032903531 old.message= old.phase=Pending old.progress=0/1 workflow=symlink-example-s497w
time="2022-11-02T14:30:32.466Z" level=info msg="Outbound nodes of symlink-example-s497w set to [symlink-example-s497w-2032903531]" namespace=tangram-platform workflow=symlink-example-s497w
time="2022-11-02T14:30:32.466Z" level=info msg="node symlink-example-s497w phase Running -> Succeeded" namespace=tangram-platform workflow=symlink-example-s497w
time="2022-11-02T14:30:32.466Z" level=info msg="node symlink-example-s497w finished: 2022-11-02 14:30:32.466523996 +0000 UTC" namespace=tangram-platform workflow=symlink-example-s497w
time="2022-11-02T14:30:32.466Z" level=info msg="Checking daemoned children of symlink-example-s497w" namespace=tangram-platform workflow=symlink-example-s497w
time="2022-11-02T14:30:32.466Z" level=info msg="TaskSet Reconciliation" namespace=tangram-platform workflow=symlink-example-s497w
time="2022-11-02T14:30:32.466Z" level=info msg=reconcileAgentPod namespace=tangram-platform workflow=symlink-example-s497w
time="2022-11-02T14:30:32.466Z" level=info msg="Updated phase Running -> Succeeded" namespace=tangram-platform workflow=symlink-example-s497w
time="2022-11-02T14:30:32.466Z" level=info msg="Marking workflow completed" namespace=tangram-platform workflow=symlink-example-s497w
time="2022-11-02T14:30:32.466Z" level=info msg="Checking daemoned children of " namespace=tangram-platform workflow=symlink-example-s497w
time="2022-11-02T14:30:32.472Z" level=info msg="cleaning up pod" action=deletePod key=tangram-platform/symlink-example-s497w-1340600742-agent/deletePod
time="2022-11-02T14:30:32.480Z" level=info msg="Workflow update successful" namespace=tangram-platform phase=Succeeded resourceVersion=26196 workflow=symlink-example-s497w
time="2022-11-02T14:30:32.567Z" level=info msg="cleaning up pod" action=labelPodCompleted key=tangram-platform/symlink-example-s497w-2032903531/labelPodCompleted

Logs from your workflow's wait container

time="2022-11-02T14:30:16.024Z" level=info msg="S3 Save path: /tmp/argo/outputs/artifacts/output.tgz, key: symlink-example-s497w/symlink-example-s497w-4036058673/output.tgz"
time="2022-11-02T14:30:16.024Z" level=info msg="Creating minio client using static credentials" endpoint="minio:9000"
time="2022-11-02T14:30:16.025Z" level=info msg="Saving file to s3" bucket=argo-artifacts endpoint="minio:9000" key=symlink-example-s497w/symlink-example-s497w-4036058673/output.tgz path=/tmp/argo/outputs/artifacts/output.tgz
time="2022-11-02T14:30:16.036Z" level=info msg="Save artifact" artifactName=output duration=11.440802ms error="" key=symlink-example-s497w/symlink-example-s497w-4036058673/output.tgz
time="2022-11-02T14:30:16.036Z" level=info msg="not deleting local artifact" localArtPath=/tmp/argo/outputs/artifacts/output.tgz
time="2022-11-02T14:30:16.036Z" level=info msg="Successfully saved file: /tmp/argo/outputs/artifacts/output.tgz"
time="2022-11-02T14:30:16.043Z" level=info msg="Create workflowtaskresults 201"
time="2022-11-02T14:30:16.044Z" level=info msg="stopping progress monitor (context done)" error="context canceled"
time="2022-11-02T14:30:16.044Z" level=info msg="Deadline monitor stopped"
time="2022-11-02T14:30:16.044Z" level=info msg="Alloc=7066 TotalAlloc=12898 Sys=19666 NumGC=4 Goroutines=9"
time="2022-11-02T14:30:24.007Z" level=info msg="Starting Workflow Executor" version=v3.4.3
time="2022-11-02T14:30:24.009Z" level=info msg="Using executor retry strategy" Duration=1s Factor=1.6 Jitter=0.5 Steps=5
time="2022-11-02T14:30:24.009Z" level=info msg="Executor initialized" deadline="0001-01-01 00:00:00 +0000 UTC" includeScriptOutput=false namespace=tangram-platform podName=symlink-example-s497w-2032903531 template="{"name":"consume-artifact","inputs":{"artifacts":[{"name":"output","path":"/app/output","s3":{"key":"symlink-example-s497w/symlink-example-s497w-4036058673/output.tgz"}}]},"outputs":{},"metadata":{},"script":{"name":"","image":"alpine:latest","command":["sh"],"workingDir":"/app","resources":{},"source":"ls -al output\n"},"archiveLocation":{"archiveLogs":false,"s3":{"endpoint":"minio:9000","bucket":"argo-artifacts","insecure":true,"accessKeySecret":{"name":"minio","key":"root-user"},"secretKeySecret":{"name":"minio","key":"root-password"},"key":"symlink-example-s497w/symlink-example-s497w-2032903531"}}}" version="&Version{Version:v3.4.3,BuildDate:2022-10-31T05:40:15Z,GitCommit:eddb1b78407adc72c08b4ed6be8f52f2a1f1316a,GitTag:v3.4.3,GitTreeState:clean,GoVersion:go1.18.7,Compiler:gc,Platform:linux/amd64,}"
time="2022-11-02T14:30:24.010Z" level=info msg="Starting deadline monitor"
time="2022-11-02T14:30:26.012Z" level=info msg="Main container completed" error=""
time="2022-11-02T14:30:26.012Z" level=info msg="No Script output reference in workflow. Capturing script output ignored"
time="2022-11-02T14:30:26.012Z" level=info msg="No output parameters"
time="2022-11-02T14:30:26.012Z" level=info msg="No output artifacts"
time="2022-11-02T14:30:26.012Z" level=info msg="Alloc=6842 TotalAlloc=12068 Sys=19666 NumGC=4 Goroutines=7"

@swhite24
Contributor Author

swhite24 commented Nov 2, 2022

Did a little investigation, and I assume this is due to the untar implementation change introduced in #8292: only regular files are handled now, so symlink entries are skipped during extraction.

v3.3.9 implementation:
https://github.com/argoproj/argo-workflows/blob/v3.3.9/workflow/executor/executor.go#L814-L823

current:

func untar(tarPath string, destPath string) error {
	decompressor := func(src string, dest string) error {
		f, err := os.Open(src)
		if err != nil {
			return err
		}
		defer f.Close()
		gzr, err := file.GetGzipReader(f)
		if err != nil {
			return err
		}
		defer gzr.Close()
		tr := tar.NewReader(gzr)
		for {
			header, err := tr.Next()
			switch {
			case err == io.EOF:
				return nil
			case err != nil:
				return err
			case header == nil:
				continue
			}
			target := filepath.Join(dest, filepath.Clean(header.Name))
			if err := os.MkdirAll(filepath.Dir(target), 0o700); err != nil && os.IsExist(err) {
				return err
			}
			switch header.Typeflag {
			case tar.TypeReg:
				f, err := os.OpenFile(target, os.O_CREATE|os.O_RDWR, os.FileMode(header.Mode))
				if err != nil {
					return err
				}
				if _, err := io.Copy(f, tr); err != nil {
					return err
				}
				if err := f.Close(); err != nil {
					return err
				}
			}
		}
	}
	return unpack(tarPath, destPath, decompressor)
}
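
For illustration, here is a rough sketch of what handling symlink entries during extraction might look like. It is not the change made in #9949 and not the Argo Workflows implementation: extractTar and demo are hypothetical names, the stream is an uncompressed in-memory tar, and the link target is written back without any path-safety checks, which a real fix would need to consider.

```go
package main

import (
	"archive/tar"
	"bytes"
	"fmt"
	"io"
	"os"
	"path/filepath"
)

// extractTar (hypothetical helper) unpacks an uncompressed tar stream into
// dest, handling directories, regular files and symlinks.
func extractTar(r io.Reader, dest string) error {
	tr := tar.NewReader(r)
	for {
		header, err := tr.Next()
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
		target := filepath.Join(dest, filepath.Clean(header.Name))
		if err := os.MkdirAll(filepath.Dir(target), 0o700); err != nil {
			return err
		}
		switch header.Typeflag {
		case tar.TypeDir:
			if err := os.MkdirAll(target, 0o700); err != nil {
				return err
			}
		case tar.TypeReg:
			f, err := os.OpenFile(target, os.O_CREATE|os.O_WRONLY|os.O_TRUNC, os.FileMode(header.Mode))
			if err != nil {
				return err
			}
			if _, err := io.Copy(f, tr); err != nil {
				f.Close()
				return err
			}
			if err := f.Close(); err != nil {
				return err
			}
		case tar.TypeSymlink:
			// Recreate the link itself. Linkname is used verbatim here
			// (assumption: the archive is trusted).
			if err := os.Symlink(header.Linkname, target); err != nil {
				return err
			}
		}
	}
}

// demo builds a tiny archive with one file and one symlink, extracts it into
// a temporary directory and returns the target of the extracted link.
func demo() (string, error) {
	var buf bytes.Buffer
	tw := tar.NewWriter(&buf)
	body := []byte("hello\n")
	headers := []*tar.Header{
		{Name: "output/hello.txt", Typeflag: tar.TypeReg, Mode: 0o644, Size: int64(len(body))},
		{Name: "output/link.txt", Typeflag: tar.TypeSymlink, Linkname: "hello.txt", Mode: 0o777},
	}
	for _, h := range headers {
		if err := tw.WriteHeader(h); err != nil {
			return "", err
		}
		if h.Typeflag == tar.TypeReg {
			if _, err := tw.Write(body); err != nil {
				return "", err
			}
		}
	}
	if err := tw.Close(); err != nil {
		return "", err
	}
	dest, err := os.MkdirTemp("", "untar-demo")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dest)
	if err := extractTar(&buf, dest); err != nil {
		return "", err
	}
	return os.Readlink(filepath.Join(dest, "output", "link.txt"))
}

func main() {
	target, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("link.txt ->", target)
}
```

The only difference from the regular-file-only loop that matters here is the tar.TypeSymlink case; without it the entry is read from the archive and silently dropped, which matches the consume-artifact output above.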

terrytangyuan pushed a commit that referenced this issue Nov 2, 2022
Signed-off-by: Steven White <swhitewvu24@gmail.com>
juchaosong pushed a commit to juchaosong/argo-workflows that referenced this issue Nov 3, 2022
Signed-off-by: Steven White <swhitewvu24@gmail.com>
Signed-off-by: juchao <juchao@coscene.io>
@agilgur5 agilgur5 added the area/artifacts S3/GCP/OSS/Git/HDFS etc label Sep 8, 2023