
extremely slow Network/Disk IO on Windows agent compared to Ubuntu/Mac #260

Open

jetersen opened this issue Jan 17, 2022 · 17 comments · May be fixed by #480
Labels: bug Something isn't working

Comments

jetersen commented Jan 17, 2022

Description:

actions/runner-images#3577

Ubuntu agents have a slightly higher IOPS disk performance configuration. We use the install-dotnet.ps1 script provided by the .NET team for installation. The DownloadFile and Extract-Dotnet-Package functions are slow. We will investigate whether performance improves if we replace DownloadFile with WebClient and Extract-Dotnet-Package with 7zip.

DownloadFile and Extract-Dotnet-Package are awfully slow. Roughly 3x slower!

SLOW

Task version:
v1.9.0

Platform:

  • Ubuntu
  • macOS
  • Windows

Runner type:

  • Hosted
  • Self-hosted

Repro steps:
https://github.com/jetersen/dotnet.restore.slow.github.action

Expected behavior:
Faster downloads

Actual behavior:
SLOW downloads

@jetersen (Author)

It can be fast:

FAST

@jetersen (Author)

Perhaps consider not using the dotnet-install script, or is contributing a fix to the dotnet-install script an option?

@vsafonkin

Hi @jetersen, we will try to resolve this problem.

@PureKrome (Contributor)

Hi Team - any news on this?

@e-korolevskii (Contributor)

Hello @PureKrome,

So far no updates


dsame commented Nov 2, 2023

The problem is no longer reproducible.

Based on multiple runs, the action does not take more than 15 seconds.
https://github.com/akv-demo/dotnet.restore.slow.github.action/actions/runs/6729109167

Most probably, the root cause was an infrastructure issue that has since been resolved.

In case the problem reoccurs, the solution is to avoid bulk copying to the OS drive, similar to the workaround applied for the same problem in the actions/setup-go: actions/setup-go#393

@jetersen did it help?


jetersen commented Nov 2, 2023

@dsame I do not agree with the assessment that it is not reproducible 😓
Even with a cache available, Windows Server 2022 is still 20 seconds slower.
Creating the cache still takes 1 minute longer on Windows Server 2022 than on Ubuntu.

So definitely an improvement but I feel like windows can perform better.

(screenshots: Windows vs Ubuntu workflow run timings)

https://github.com/jetersen/dotnet.restore.slow.github.action/actions/runs/6736238225
https://github.com/jetersen/dotnet.restore.slow.github.action/actions/runs/6736262624


jetersen commented Nov 2, 2023

I don't think it is fair to say it is fixed for actions/setup-dotnet when the fix amounts to a simple if check for whether .NET 6 is already available on the actions runner image 😓

Testing actions/setup-dotnet with the .NET 8 preview shows Ubuntu at 7 seconds versus 30+ seconds (sometimes a little less) on Windows.

https://github.com/jetersen/dotnet.restore.slow.github.action/actions/runs/6736383956/job/18311632176

While issue #141 remains open, this will definitely not improve 😢


dsame commented Nov 3, 2023

dalyIsaac added a commit to dalyIsaac/Whim that referenced this issue Nov 3, 2023
Improved `commit` workflow job times from an average of 8m to:

- 6m 30s uncached
- 5m 30s cached

Times were improved by:

- Adding caching
- Installing packages to the `D:\` drive, as described in <actions/setup-dotnet#260 (comment)>
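For reference, the approach described in that commit can be sketched as a workflow excerpt. This is a hypothetical snippet, not taken from the Whim repository: the cache key and `D:\nuget-packages` path are illustrative, while `NUGET_PACKAGES` is NuGet's documented environment variable for overriding the global packages folder.

```yaml
# Hypothetical job snippet: restore NuGet packages to the D:\ drive and cache them.
jobs:
  build:
    runs-on: windows-2022
    env:
      NUGET_PACKAGES: D:\nuget-packages   # move package restore off the slow OS drive
    steps:
      - uses: actions/checkout@v4
      - uses: actions/cache@v3
        with:
          path: D:\nuget-packages
          key: nuget-${{ hashFiles('**/packages.lock.json') }}   # illustrative cache key
      - run: dotnet restore
```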

jetersen commented Nov 3, 2023

@dalyIsaac interesting approach, does that really save that much 🤔

@dalyIsaac

I'm fairly happy with the gains I've seen, but admittedly I didn't conduct a very rigorous study.

sample              # jobs   mean    median   sample std dev
Installing on C:\   16       02:16   02:27    00:30
Caching¹ on C:\     4        01:52   01:42    00:35
Installing on D:\   12       01:37   01:34    00:24
Caching on D:\      12       01:07   01:07    00:15

Footnotes

  1. Caching includes the actual caching and running dotnet restore. Cache sizes were about 700MB.


dsame commented Nov 6, 2023

Hello @jetersen

The quick fix is to set the DOTNET_INSTALL_DIR environment variable to a path on the D: drive.

akv-demo/dotnet.restore.slow.github.action@45e801a#diff-b803fcb7f17ed9235f1e5cb1fcd2f5d3b2838429d4368ae4c57ce4436577f03fR15

This workaround is proven to solve the problem (https://github.com/akv-demo/dotnet.restore.slow.github.action/actions/runs/6768243557/job/18392290993) and can be used until a fix in the action is available.
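The workaround above can be sketched as a workflow excerpt. Assumptions: the `D:\dotnet` path is illustrative, and `DOTNET_INSTALL_DIR` is the environment variable honored by the dotnet-install scripts (and read by setup-dotnet) to choose the install location.

```yaml
# Hypothetical workflow excerpt: install the .NET SDK to D:\ instead of the default C:\ location.
jobs:
  build:
    runs-on: windows-2022
    env:
      DOTNET_INSTALL_DIR: D:\dotnet   # redirect the SDK install off the OS drive
    steps:
      - uses: actions/setup-dotnet@v3
        with:
          dotnet-version: 8.0.x
      - run: dotnet --info
```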


jetersen commented Nov 6, 2023

@dsame perhaps some of these fixes should be raised with @actions/runner-images? I assume we are hitting similar IO restrictions on the Windows images as this affects all windows based hosted runners 🫠


dsame commented Nov 7, 2023

Hello @jetersen, generally it is a good idea, but I doubt any of the Actions teams can solve this infrastructure problem, and most probably it will not be solved in an acceptable timeframe.

@PureKrome (Contributor)

but i doubt any of actions team can solve the problem with the infrastructure

Why is this? Because these are two independent teams within GitHub? Even though the Actions team could make some changes based on this thread (which would benefit all users by default), the infra team would still also need to make changes. Are you suggesting that this is such a low priority that they just go 'meh'?


jetersen commented Nov 7, 2023

Created actions/runner-images#8755 in the hope that we can find a generic solution. I was hoping they could simply change the disk setup in the Windows packer scripts 🤔

@dsame dsame linked a pull request Nov 9, 2023 that will close this issue
2 tasks
@blackstars701

hi

7 participants