Infra improvements for Helix #68176

Closed
agocke opened this issue Apr 18, 2022 · 7 comments

Comments

@agocke
Member

agocke commented Apr 18, 2022

List of requests from Helix to reduce failure overhead from PayloadGroup0, which seems to be returning a non-zero exit code very often.

  • Investigate potentially reducing/combining files. There is significant overhead “per file”, and this run produces a lot of files that are fairly low value, from what I can see in them. Maybe it would be possible to only output a single, combined file? Or maybe only generate the files when something “interesting happens”?
  • Potentially name the PayloadGroup0 different things in different scenarios, so that it’s easier to identify where problems are coming from.
  • Investigate why this item is failing, on average, 1600 times a day. If it is failing this often, its value might be low: it seems unlikely that anyone has the time to investigate 1600 failures every day. Maybe specific tests inside it can be disabled if we have some way to dig a bit more into what's going on inside them. (We might be able to help with the investigation if needed.)
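
As a rough illustration of the first request (one combined file, produced only when something "interesting" happens), here is a minimal sketch. It is not how the PayloadGroup0 work item actually writes its output. The function name `combine_logs`, the `*.log` pattern, and the choice of "non-zero exit code" as the trigger are all assumptions made for the example.

```python
import tempfile
from pathlib import Path


def combine_logs(log_dir: Path, combined: Path, exit_code: int) -> bool:
    """Write one combined file, but only when the run was "interesting".

    Assumption: "interesting" means a non-zero exit code.
    Returns True if a combined file was written, False otherwise.
    """
    if exit_code == 0:
        return False  # nothing interesting happened, so produce no file at all
    with combined.open("w", encoding="utf-8") as out:
        # Sort so the combined file has a stable, predictable order.
        for log in sorted(log_dir.glob("*.log")):
            out.write(f"===== {log.name} =====\n")
            text = log.read_text(encoding="utf-8", errors="replace")
            out.write(text)
            if not text.endswith("\n"):
                out.write("\n")  # keep the next header on its own line
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        logs = root / "logs"
        logs.mkdir()
        (logs / "a.log").write_text("first\n", encoding="utf-8")
        (logs / "b.log").write_text("second", encoding="utf-8")
        target = root / "combined.txt"
        print(combine_logs(logs, target, exit_code=0))  # passing run: no file
        print(combine_logs(logs, target, exit_code=1))  # failing run: one file
        print(target.read_text(encoding="utf-8"))
```

With this shape, a passing run uploads nothing and a failing run uploads a single file, which is the per-file overhead reduction the request describes.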

Runfo Tracking Issue: payloadgroup0 work item

| Build | Definition | Kind | Run Name | Console | Core Dump | Test Results | Run Client |
|---|---|---|---|---|---|---|---|
| 333853 | runtime | PR 88604 | coreclr osx x64 Checked no_tiered_compilation @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 333842 | runtime | PR 88345 | coreclr osx x64 Checked no_tiered_compilation @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 333824 | runtime | PR 88242 | coreclr osx x64 Checked no_tiered_compilation @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 333781 | runtime | PR 88415 | coreclr osx x64 Checked no_tiered_compilation @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 333768 | runtime | PR 88268 | mono linux arm64 Release @ (Ubuntu.1804.Arm64.Open)Ubuntu.1804.Armarch.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:ubuntu-18.04-helix-arm64v8 | console.log | | | runclient.py |
| 333768 | runtime | PR 88268 | Mono browser wasm Release @ (Ubuntu.1804.Amd64)Ubuntu.1804.Amd64.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:ubuntu-18.04-helix-webassembly | console.log | | | runclient.py |
| 333768 | runtime | PR 88268 | mono windows x64 Release @ Windows.10.Amd64.Open | console.log | | | runclient.py |
| 333768 | runtime | PR 88268 | mono linux x64 Release @ Ubuntu.1804.Amd64.Open | console.log | | | runclient.py |
| 333768 | runtime | PR 88268 | coreclr osx x64 Checked no_tiered_compilation @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 333768 | runtime | PR 88268 | mono osx x64 Release @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 333768 | runtime | PR 88268 | mono osx x64 Release @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 333750 | runtime | PR 88245 | coreclr osx x64 Checked no_tiered_compilation @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 333671 | runtime | PR 85562 | coreclr osx x64 Checked no_tiered_compilation @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 333574 | runtime | PR 88521 | coreclr osx x64 Checked no_tiered_compilation @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 333481 | runtime | PR 88595 | coreclr osx x64 Checked no_tiered_compilation @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 333388 | runtime | PR 88242 | coreclr osx x64 Checked no_tiered_compilation @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 333387 | runtime | PR 88242 | coreclr osx x64 Checked no_tiered_compilation @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 333378 | runtime | PR 87773 | coreclr osx x64 Checked no_tiered_compilation @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 333375 | runtime | PR 87260 | coreclr osx x64 Checked no_tiered_compilation @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 333367 | runtime | Rolling | coreclr osx x64 Checked @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 333367 | runtime | Rolling | coreclr osx x64 Checked no_tiered_compilation @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 333355 | runtime | PR 86787 | coreclr osx x64 Checked no_tiered_compilation @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 333354 | runtime | PR 86787 | coreclr osx x64 Checked no_tiered_compilation @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 333257 | runtime | PR 88584 | coreclr osx x64 Checked no_tiered_compilation @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 333211 | runtime | PR 88510 | coreclr osx x64 Checked no_tiered_compilation @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 333208 | runtime | PR 88572 | coreclr osx x64 Checked no_tiered_compilation @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 333196 | runtime | PR 80331 | coreclr osx x64 Checked no_tiered_compilation @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 333189 | runtime | PR 88559 | coreclr osx x64 Checked no_tiered_compilation @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 333168 | runtime | PR 88543 | coreclr osx x64 Checked no_tiered_compilation @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 333161 | runtime | Rolling | coreclr osx x64 Checked @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 333161 | runtime | Rolling | coreclr osx x64 Checked no_tiered_compilation @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 333109 | runtime | PR 88554 | coreclr osx x64 Checked no_tiered_compilation @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 333109 | runtime | PR 88554 | coreclr windows x64 Checked @ Windows.10.Amd64.Open | console.log | core dump | | runclient.py |
| 333063 | runtime | PR 88521 | coreclr osx x64 Checked no_tiered_compilation @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 332991 | runtime | Rolling | coreclr osx x64 Checked @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 332991 | runtime | Rolling | coreclr osx x64 Checked no_tiered_compilation @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 332945 | runtime | PR 88543 | coreclr osx x64 Checked no_tiered_compilation @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 332907 | runtime | PR 88034 | coreclr linux x64 Checked no_tiered_compilation @ Ubuntu.1804.Amd64.Open | console.log | | | runclient.py |
| 332907 | runtime | PR 88034 | coreclr osx x64 Checked no_tiered_compilation @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 332898 | runtime | PR 88543 | coreclr osx x64 Checked no_tiered_compilation @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 332849 | runtime | Rolling | coreclr osx x64 Checked @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 332849 | runtime | Rolling | coreclr osx x64 Checked no_tiered_compilation @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 332826 | runtime | PR 86841 | coreclr osx x64 Checked no_tiered_compilation @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 332801 | runtime | PR 88415 | mono windows x64 Release @ Windows.10.Amd64.Open | console.log | | | runclient.py |
| 332801 | runtime | PR 88415 | mono osx x64 Release @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 332773 | runtime | PR 88531 | coreclr osx x64 Checked no_tiered_compilation @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 332590 | runtime | PR 88543 | mono linux arm64 Release @ (Ubuntu.1804.Arm64.Open)Ubuntu.1804.Armarch.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:ubuntu-18.04-helix-arm64v8 | console.log | | | runclient.py |
| 332590 | runtime | PR 88543 | mono linux x64 Release @ Ubuntu.1804.Amd64.Open | console.log | | | runclient.py |
| 332590 | runtime | PR 88543 | mono windows x64 Release @ Windows.10.Amd64.Open | console.log | | | runclient.py |
| 332590 | runtime | PR 88543 | mono osx x64 Release @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 332336 | runtime | PR 88531 | coreclr osx x64 Checked no_tiered_compilation @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 332317 | runtime | PR 88415 | mono windows x64 Release @ Windows.10.Amd64.Open | console.log | | | runclient.py |
| 331541 | runtime | PR 88371 | mono linux arm64 Release @ (Ubuntu.1804.Arm64.Open)Ubuntu.1804.Armarch.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:ubuntu-18.04-helix-arm64v8 | console.log | | | runclient.py |
| 331541 | runtime | PR 88371 | mono windows x64 Release @ Windows.10.Amd64.Open | console.log | | | runclient.py |
| 331541 | runtime | PR 88371 | mono linux x64 Release @ Ubuntu.1804.Amd64.Open | console.log | | | runclient.py |
| 331541 | runtime | PR 88371 | mono osx x64 Release @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 331541 | runtime | PR 88371 | mono osx x64 Release @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 331490 | runtime | PR 88404 | mono linux arm64 Release @ (Ubuntu.1804.Arm64.Open)Ubuntu.1804.Armarch.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:ubuntu-18.04-helix-arm64v8 | console.log | | | runclient.py |
| 331490 | runtime | PR 88404 | mono linux x64 Release @ Ubuntu.1804.Amd64.Open | console.log | | | runclient.py |
| 331490 | runtime | PR 88404 | mono osx x64 Release @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 331490 | runtime | PR 88404 | mono osx x64 Release @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 331467 | runtime | PR 88508 | mono linux arm64 Release @ (Ubuntu.1804.Arm64.Open)Ubuntu.1804.Armarch.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:ubuntu-18.04-helix-arm64v8 | console.log | | | runclient.py |
| 331467 | runtime | PR 88508 | mono linux x64 Release @ Ubuntu.1804.Amd64.Open | console.log | | | runclient.py |
| 331467 | runtime | PR 88508 | mono windows x64 Release @ Windows.10.Amd64.Open | console.log | | | runclient.py |
| 331467 | runtime | PR 88508 | mono osx x64 Release @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 331467 | runtime | PR 88508 | mono osx x64 Release @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 331430 | runtime | PR 88503 | mono linux arm64 Release @ (Ubuntu.1804.Arm64.Open)Ubuntu.1804.Armarch.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:ubuntu-18.04-helix-arm64v8 | console.log | | | runclient.py |
| 331430 | runtime | PR 88503 | mono windows x64 Release @ Windows.10.Amd64.Open | console.log | | | runclient.py |
| 331430 | runtime | PR 88503 | mono linux x64 Release @ Ubuntu.1804.Amd64.Open | console.log | | | runclient.py |
| 331430 | runtime | PR 88503 | mono osx x64 Release @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 331430 | runtime | PR 88503 | mono osx x64 Release @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 331425 | runtime | PR 88502 | mono linux arm64 Release @ (Ubuntu.1804.Arm64.Open)Ubuntu.1804.Armarch.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:ubuntu-18.04-helix-arm64v8 | console.log | | | runclient.py |
| 331425 | runtime | PR 88502 | mono windows x64 Release @ Windows.10.Amd64.Open | console.log | | | runclient.py |
| 331425 | runtime | PR 88502 | mono linux x64 Release @ Ubuntu.1804.Amd64.Open | console.log | | | runclient.py |
| 331425 | runtime | PR 88502 | mono osx x64 Release @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 331425 | runtime | PR 88502 | mono osx x64 Release @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 331294 | runtime | PR 88498 | mono linux arm64 Release @ (Ubuntu.1804.Arm64.Open)Ubuntu.1804.Armarch.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:ubuntu-18.04-helix-arm64v8 | console.log | | | runclient.py |
| 331294 | runtime | PR 88498 | mono linux x64 Release @ Ubuntu.1804.Amd64.Open | console.log | | | runclient.py |
| 331294 | runtime | PR 88498 | mono osx x64 Release @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 331294 | runtime | PR 88498 | mono osx x64 Release @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 331203 | runtime | PR 86875 | mono linux arm64 Release @ (Ubuntu.1804.Arm64.Open)Ubuntu.1804.Armarch.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:ubuntu-18.04-helix-arm64v8 | console.log | | | runclient.py |
| 331203 | runtime | PR 86875 | mono windows x64 Release @ Windows.10.Amd64.Open | console.log | | | runclient.py |
| 331203 | runtime | PR 86875 | mono linux x64 Release @ Ubuntu.1804.Amd64.Open | console.log | | | runclient.py |
| 331203 | runtime | PR 86875 | mono osx x64 Release @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 331203 | runtime | PR 86875 | mono osx x64 Release @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 331166 | runtime | PR 87562 | mono linux arm64 Release @ (Ubuntu.1804.Arm64.Open)Ubuntu.1804.Armarch.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:ubuntu-18.04-helix-arm64v8 | console.log | | | runclient.py |
| 331166 | runtime | PR 87562 | Mono browser wasm Release @ (Ubuntu.1804.Amd64)Ubuntu.1804.Amd64.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:ubuntu-18.04-helix-webassembly | console.log | | | runclient.py |
| 331166 | runtime | PR 87562 | mono linux x64 Release @ Ubuntu.1804.Amd64.Open | console.log | | | runclient.py |
| 331166 | runtime | PR 87562 | mono windows x64 Release @ Windows.10.Amd64.Open | console.log | | | runclient.py |
| 331166 | runtime | PR 87562 | mono osx x64 Release @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 331166 | runtime | PR 87562 | mono osx x64 Release @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 331160 | runtime | PR 87857 | mono linux arm64 Release @ (Ubuntu.1804.Arm64.Open)Ubuntu.1804.Armarch.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:ubuntu-18.04-helix-arm64v8 | console.log | | | runclient.py |
| 331160 | runtime | PR 87857 | mono windows x64 Release @ Windows.10.Amd64.Open | console.log | | | runclient.py |
| 331160 | runtime | PR 87857 | mono linux x64 Release @ Ubuntu.1804.Amd64.Open | console.log | | | runclient.py |
| 331160 | runtime | PR 87857 | mono osx x64 Release @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 331160 | runtime | PR 87857 | mono osx x64 Release @ OSX.1200.Amd64.Open | console.log | | | runclient.py |
| 331159 | runtime | Rolling | mono linux arm64 Release @ (Ubuntu.1804.Arm64.Open)Ubuntu.1804.Armarch.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:ubuntu-18.04-helix-arm64v8 | console.log | | | runclient.py |
| 331159 | runtime | Rolling | Mono browser wasm Release @ (Ubuntu.1804.Amd64)Ubuntu.1804.Amd64.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:ubuntu-18.04-helix-webassembly | console.log | | | runclient.py |
| 331159 | runtime | Rolling | mono windows x64 Release @ Windows.10.Amd64.Open | console.log | | | runclient.py |
| 331159 | runtime | Rolling | mono linux x64 Release @ Ubuntu.1804.Amd64.Open | console.log | | | runclient.py |

Displaying 100 of 381 results

Build Result Summary

| Day Hit Count | Week Hit Count | Month Hit Count |
|---|---|---|
| 23 | 61 | 128 |
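
Because every failure lands under the same PayloadGroup0 name, one way to see where problems cluster is to group the failing rows by run configuration. The sketch below is illustrative only: it hard-codes five rows copied from the table above rather than querying Runfo or Helix, and `ROWS` and `tally_by_config` are hypothetical names introduced for the example.

```python
from collections import Counter

# Five rows copied from the table above: (build, definition, kind, run name).
ROWS = [
    ("333853", "runtime", "PR 88604",
     "coreclr osx x64 Checked no_tiered_compilation @ OSX.1200.Amd64.Open"),
    ("333768", "runtime", "PR 88268",
     "mono windows x64 Release @ Windows.10.Amd64.Open"),
    ("333768", "runtime", "PR 88268",
     "mono linux x64 Release @ Ubuntu.1804.Amd64.Open"),
    ("333367", "runtime", "Rolling",
     "coreclr osx x64 Checked @ OSX.1200.Amd64.Open"),
    ("333367", "runtime", "Rolling",
     "coreclr osx x64 Checked no_tiered_compilation @ OSX.1200.Amd64.Open"),
]


def tally_by_config(rows):
    """Count failing rows per run configuration (the text before ' @ ')."""
    counts = Counter()
    for _build, _definition, _kind, run_name in rows:
        # The run name has the form "<configuration> @ <queue>".
        config = run_name.split(" @ ", 1)[0]
        counts[config] += 1
    return counts


if __name__ == "__main__":
    # Print configurations from most to least frequent.
    for config, n in tally_by_config(ROWS).most_common():
        print(f"{n:3d}  {config}")
```

Grouping on the part before ` @ ` keeps the scenario (runtime flavor, OS, architecture, build type) and drops the queue name; grouping on the queue instead would be an equally reasonable choice.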
@ghost

ghost commented Apr 18, 2022

Tagging subscribers to this area: @dotnet/runtime-infrastructure
See info in area-owners.md if you want to be subscribed.

Issue Details

Author: agocke
Assignees: -
Labels:

area-Infrastructure

Milestone: -

dotnet-issue-labeler bot added the untriaged label (New issue has not been triaged by the area owner) Apr 18, 2022
agocke added this to the 7.0.0 milestone Apr 18, 2022
agocke removed the untriaged label (New issue has not been triaged by the area owner) Apr 18, 2022
@agocke
Member Author

agocke commented Apr 18, 2022

cc @MattGal @ChadNedzlek

@ChadNedzlek
Member

Updated the 1600 -> 100 (the 1600 was from a bad query I made). 100 is still a lot per day though.

@MattGal
Member

MattGal commented Apr 18, 2022

I'm just following along for now; please ping me if you want me to participate in any investigations or check whether improvements work (though if the work items start passing, that's a very good sign).

jtschuster added the blocking-clean-ci label (Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms') Jul 15, 2022
@ghost

ghost commented Jul 26, 2022

Tagging subscribers to this area: @hoyosjs
See info in area-owners.md if you want to be subscribed.

Issue Details

Runfo Tracking Issue: payloadgroup0 work item

| Build | Definition | Kind | Run Name | Console | Core Dump | Test Results | Run Client |
|---|---|---|---|---|---|---|---|
| 1902122 | runtime | Rolling | coreclr windows x64 Checked no_tiered_compilation @ Windows.10.Amd64.Open | console.log | core dump | | runclient.py |
| 1897450 | runtime | PR 72021 | Mono Browser wasm Release @ Ubuntu.1804.Amd64.Open | console.log | | | runclient.py |
| 1895796 | runtime | PR 72229 | coreclr Linux x64 Checked no_tiered_compilation @ Ubuntu.1804.Amd64.Open | console.log | core dump | | runclient.py |
| 1895796 | runtime | PR 72229 | coreclr Linux arm64 Checked no_tiered_compilation @ (Ubuntu.1804.Arm64.Open)Ubuntu.1804.Armarch.Open@mcr.microsoft.com/dotnet-buildtools/prereqs:ubuntu-18.04-helix-arm64v8-20210531091519-97d8652 | console.log | core dump | | runclient.py |
| 1895796 | runtime | PR 72229 | coreclr windows arm64 Checked no_tiered_compilation @ Windows.10.Arm64v8.Open | console.log | | | |
| 1895796 | runtime | PR 72229 | coreclr windows x64 Checked no_tiered_compilation @ Windows.10.Amd64.Open | console.log | core dump | | runclient.py |
| 1892945 | runtime | PR 72529 | mono windows x64 Release @ Windows.10.Amd64.Open | console.log | | | runclient.py |
| 1892535 | runtime | PR 72517 | Mono Browser wasm Release @ Ubuntu.1804.Amd64.Open | console.log | | | runclient.py |
| 1892303 | runtime | Rolling | coreclr windows x64 Checked @ Windows.10.Amd64.Open | console.log | core dump | | runclient.py |
| 1889847 | runtime | PR 72021 | Mono Browser wasm Release @ Ubuntu.1804.Amd64.Open | console.log | | | runclient.py |
| 1886956 | runtime | Rolling | coreclr windows x64 Checked @ Windows.10.Amd64.Open | console.log | core dump | | runclient.py |
| 1883558 | runtime | PR 62863 | Mono Browser wasm Release @ Ubuntu.1804.Amd64.Open | console.log | | | runclient.py |

Build Result Summary

| Day Hit Count | Week Hit Count | Month Hit Count |
|---|---|---|
| 1 | 7 | 9 |
Author: agocke
Assignees: -
Labels:

area-Infrastructure-coreclr, blocking-clean-ci

Milestone: 7.0.0

jeffschwMSFT removed the blocking-clean-ci label (Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms') Aug 1, 2022
@hoyosjs
Member

hoyosjs commented Aug 5, 2022

@agocke Is it a good idea to have a Runfo tracking issue for such a broad problem? (That is, if the common coreclr tests fail, the failure is immediately added here.)

hoyosjs modified the milestones: 7.0.0, 8.0.0 Aug 10, 2022
@agocke
Member Author

agocke commented Jul 10, 2023

I think this is no longer interesting -- we're tracking things at a more granular level now and this is just catching everything in coreclr Pri0. Closing.

agocke closed this as completed Jul 10, 2023
ghost locked as resolved and limited conversation to collaborators Aug 15, 2023
8 participants
@agocke @danmoseley @MattGal @jeffschwMSFT @ChadNedzlek @hoyosjs @jtschuster and others