Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CPU usage increases after upgrading to .NET 6 #83505

Closed
t-bzhan opened this issue Mar 16, 2023 · 8 comments
Closed

CPU usage increases after upgrading to .NET 6 #83505

t-bzhan opened this issue Mar 16, 2023 · 8 comments
Labels
Milestone

Comments

@t-bzhan
Copy link

t-bzhan commented Mar 16, 2023

Description

Our ASP.NET Core service's CPU usage increases a lot after upgrading from .NET Core 3.1 to .NET 6.0.14, we collected a trace using dotnet-trace and found that our problem is pretty similar with #62967. We tried to disable the usage of portable thread pool with setting COMPlus_ThreadPool_UsePortableThreadPool=0, the trick works in our test environment, we observed CPU usage decreased as expected under stressing test, however in production, we noticed that the problem persisted.

I compared the trace under ".NET 6" with ".NET Core 3.1" and found the increased CPU usage might be caused by Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Infrastructure.Heartbeat.TimerLoop() and System.Diagnostics.Tracing.CounterGroup.PollForValues()
image

image

Could anyone suggest what steps I need to take to resolve this?

I could not share the netrace file here as it contains production data but I could share it to MS internal members, feel free to drop me a message if a netrace file could help.

Configuration

# dotnet --info
.NET SDK (reflecting any global.json):
Version:   6.0.406
Commit:    2988897946 Runtime Environment:
OS Name:     ubuntu
OS Version:  22.04
OS Platform: Linux
RID:         ubuntu.22.04-x64
Base Path:   /usr/share/dotnet/sdk/6.0.406/ 

global.json file:
  Not found
Host:
  Version:      6.0.14
  Architecture: x64
  Commit:       2a90daa2cc 

.NET SDKs installed:
  6.0.406 [/usr/share/dotnet/sdk] 

.NET runtimes installed:
  Microsoft.AspNetCore.App 6.0.14 [/usr/share/dotnet/shared/Microsoft.AspNetCore.App]
  Microsoft.NETCore.App 6.0.14 [/usr/share/dotnet/shared/Microsoft.NETCore.App]

Regression?

That looks like a regression, I did not observe similar CPU usage in .NET Core 3.1
image

Data

CPU usage increased after we rollout new version which upgrade to .NET 6.

image

Analysis

akkadotnet/akka.net#5385 (comment) mentioned that similar CPU regression could be fixed by upgrading the runtime version to 6.0.6, however our runtime version is 6.0.14 already.

For Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Infrastructure.Heartbeat.TimerLoop()'s CPU usage dotnet/aspnetcore#47014 explained that the data is actually somewhat misleading, since it includes time that the thread was waiting as well.

@t-bzhan t-bzhan added the tenet-performance Performance related issue label Mar 16, 2023
@dotnet-issue-labeler
Copy link

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

@ghost ghost added the untriaged New issue has not been triaged by the area owner label Mar 16, 2023
@ghost
Copy link

ghost commented Mar 16, 2023

Tagging subscribers to this area: @mangod9
See info in area-owners.md if you want to be subscribed.

Issue Details

Description

Our ASP.NET Core service's CPU usage increases a lot after upgrading from .NET Core 3.1 to .NET 6.0.14, we collected a trace using dotnet-trace and found that our problem is pretty similar with #62967. We tried to disable the usage of portable thread pool with setting COMPlus_ThreadPool_UsePortableThreadPool=0, the trick works in our test environment, we observed CPU usage decreased as expected under stressing test, however in production, we noticed that the problem persisted.

I compared the trace under ".NET 6" with ".NET Core 3.1" and found the increased CPU usage might be caused by Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Infrastructure.Heartbeat.TimerLoop() and System.Diagnostics.Tracing.CounterGroup.PollForValues()
image

image

Could anyone suggest what steps I need to take to resolve this?

I could not share the netrace file here as it contains production data but I could share it to MS internal members, feel free to drop me a message if a netrace file could help.

Configuration

# dotnet --info
.NET SDK (reflecting any global.json):
Version:   6.0.406
Commit:    2988897946 Runtime Environment:
OS Name:     ubuntu
OS Version:  22.04
OS Platform: Linux
RID:         ubuntu.22.04-x64
Base Path:   /usr/share/dotnet/sdk/6.0.406/ 

global.json file:
  Not found
Host:
  Version:      6.0.14
  Architecture: x64
  Commit:       2a90daa2cc 

.NET SDKs installed:
  6.0.406 [/usr/share/dotnet/sdk] 

.NET runtimes installed:
  Microsoft.AspNetCore.App 6.0.14 [/usr/share/dotnet/shared/Microsoft.AspNetCore.App]
  Microsoft.NETCore.App 6.0.14 [/usr/share/dotnet/shared/Microsoft.NETCore.App]

Regression?

That looks like a regression, I did not observe similar CPU usage in .NET Core 3.1
image

Data

CPU usage increased after we rollout new version which upgrade to .NET 6.

image

Analysis

akkadotnet/akka.net#5385 (comment) mentioned that similar CPU regression could be fixed by upgrading the runtime version to 6.0.6, however our runtime version is 6.0.14 already.

For Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Infrastructure.Heartbeat.TimerLoop()'s CPU usage dotnet/aspnetcore#47014 explained that the data is actually somewhat misleading, since it includes time that the thread was waiting as well.

Author: t-bzhan
Assignees: -
Labels:

area-System.Threading, tenet-performance, untriaged

Milestone: -

@mangod9
Copy link
Member

mangod9 commented Mar 17, 2023

@kouvel to take a look. @t-bzhan, assume you see the same behavior on 7?

@mangod9 mangod9 removed the untriaged New issue has not been triaged by the area owner label Mar 17, 2023
@mangod9 mangod9 added this to the Future milestone Mar 17, 2023
@En3Tho
Copy link
Contributor

En3Tho commented Mar 18, 2023

One more question: have you observed other metrics change? Like RPS for example or reduced latency? Maybe something else?

@t-bzhan
Copy link
Author

t-bzhan commented Mar 18, 2023

@kouvel to take a look. @t-bzhan, assume you see the same behavior on 7?

I did not test our service behavior on .NET 7 as our pipeline does not support it yet.

@t-bzhan
Copy link
Author

t-bzhan commented Mar 18, 2023

One more question: have you observers other metrics change? Like RPS for example or reduced latency? Maybe something else?

I noticed that the average response time increase around 10ms when the number of request counts remains the same level.
image

I've also collected the dontnet perf counters in .NET 6, not sure whether that is helpful:
image
image
image

I kind of feeling like the issue might be caused by threadpool, I checked the Microsoft-Windows-DotNETRuntime/ThreadPoolWorkerThreadAdjustment/Adjustment event with perfview, however I am not able to figure out whether that is abnormal or not:
image

@kouvel
Copy link
Member

kouvel commented Mar 28, 2023

Chatted offline, the times shown in the first two stacks are likely mostly blocked time and not CPU time, suggested using perfcollect instead.

@t-bzhan
Copy link
Author

t-bzhan commented Apr 25, 2023

That is related to #84760, close it.

@t-bzhan t-bzhan closed this as completed Apr 25, 2023
@ghost ghost locked as resolved and limited conversation to collaborators May 25, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

5 participants