Introduce cross-process resource management for tasks (#5859)
Add RequestCores and ReleaseCores APIs to IBuildEngine9.

These APIs can advise a task that wishes to do work with its own internal parallelism. The task can request as many (abstracted) CPU cores as it desires, and the MSBuild engine will keep track of how many have been requested and prevent the machine from being completely overloaded.

Since this is advisory only, existing tasks will be unaffected. The Visual C++ tasks plan to opt into this.

See resource-management.md for more design and implementation details.

Fixes #74
rainersigwald committed Mar 30, 2021
1 parent afd0b62 commit 1629921
Showing 50 changed files with 1,592 additions and 137 deletions.
41 changes: 41 additions & 0 deletions documentation/specs/resource-management.md
@@ -0,0 +1,41 @@
# Managing tools with their own parallelism in MSBuild

MSBuild supports building projects in parallel using multiple processes. Most users opt into `Environment.ProcessorCount` parallelism at the MSBuild layer.

In addition, tools sometimes support parallel execution. The Visual C++ compiler `cl.exe` supports `/MP[n]`, which parallelizes compilation at the translation-unit (file) level. If a number isn't specified, it defaults to `NUM_PROCS`.

When used in combination, `NUM_PROCS * NUM_PROCS` compiler processes can be launched, all of which would like to do file I/O and intense computation. This generally overwhelms the operating system's scheduler and causes thrashing and terrible build times.

As a result, the standard guidance is to use only one multiproc option: MSBuild's _or_ `cl.exe`'s. But that leaves the machine underloaded when things could be happening in parallel.

## Design

`IBuildEngine` will be extended to allow a task to indicate to MSBuild that it would like to consume more than one CPU core (`RequestCores`). These requests are advisory only: a task can still do as much work as it desires with as many threads and processes as it desires.

A cooperating task would limit its own parallelism to the number of CPU cores MSBuild can reserve for the requesting task.

`RequestCores(int requestedCores)` will always return a positive value, possibly less than the parameter if that many cores are not available. If no cores are available at the moment, the call blocks until at least one becomes available. The first `RequestCores` call made by a task is guaranteed to be non-blocking, though, as at minimum it will return the "implicit" core allocated to the task itself.

This leads to two conceptual ways of adopting the API. Either the task calls `RequestCores` once, passing the desired number of cores, and then limits its parallelism to whatever the call returns, or the task makes additional calls throughout its execution, perhaps as it discovers more work to do. In this second scenario the task must be OK with waiting for additional cores for a long time, or even forever if the sum of allocated cores has exceeded the limit defined by the policy.
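The first adoption style (one request up front, then a parallelism cap) might look roughly like the sketch below. It is not part of this change: `SingleRequestSketch`, `ProcessAll`, and `processOne` are hypothetical names, and the `requestCores` delegate stands in for `IBuildEngine9.RequestCores` so that the snippet compiles without an MSBuild reference. A real task would presumably pass `BuildEngine9.RequestCores`.

```C#
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class SingleRequestSketch
{
    // requestCores is a stand-in for IBuildEngine9.RequestCores (assumption:
    // it returns a positive value, as the design above states).
    public static int ProcessAll(
        IReadOnlyList<string> inputs,
        Func<int, int> requestCores,
        Action<string> processOne)
    {
        if (inputs.Count == 0)
        {
            return 0; // Nothing to do, so nothing to reserve.
        }

        // One up-front request; the grant may be smaller than what was asked for.
        int granted = requestCores(inputs.Count);

        // Cap the task's own parallelism at the number of granted cores.
        var options = new ParallelOptions { MaxDegreeOfParallelism = granted };
        Parallel.ForEach(inputs, options, processOne);

        return granted;
    }
}
```

Because the first `RequestCores` call is described as non-blocking, this style never waits on the scheduler. The cost is that the task cannot grow its parallelism later if more cores become free.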

All resources acquired by a task will be automatically returned when the task's `Execute()` method returns, and a task can optionally return a subset by calling `ReleaseCores`. Additionally, all resources will be returned when the task calls `Reacquire`, as this call is a signal to the scheduler that external tools have finished their work and the task can continue running. It does not matter when the resources were allocated, whether before or after calling `Yield`: they will all be released. Depending on the scheduling policy, freeing resources on `Reacquire` may prevent deadlocks.
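Since cores are returned automatically when `Execute()` ends, an explicit `ReleaseCores` call is only useful when a task wants to hand cores back early, for example before a long serial phase. One way to pair the two calls is a small disposable wrapper, sketched below. `CoreReservation` is a hypothetical helper, not something this change adds, and the two delegates stand in for `IBuildEngine9.RequestCores` and `IBuildEngine9.ReleaseCores`.

```C#
using System;

// Hypothetical helper: reserves cores on construction and returns them on Dispose.
public sealed class CoreReservation : IDisposable
{
    private readonly Action<int> _releaseCores;
    private bool _released;

    // Number of cores actually granted, which may be fewer than requested.
    public int Granted { get; }

    public CoreReservation(Func<int, int> requestCores, Action<int> releaseCores, int requested)
    {
        _releaseCores = releaseCores;
        Granted = requestCores(requested);
    }

    public void Dispose()
    {
        if (_released)
        {
            return; // Release at most once, even if Dispose is called again.
        }

        _released = true;
        _releaseCores(Granted);
    }
}
```

Wrapping the parallel phase in a `using` block then gives the cores back as soon as that phase finishes, including when it throws. The text above does not say whether a task should keep its implicit core, so this sketch simply releases everything it was granted.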

The exact core reservation policy and its interaction with task execution scheduling is still TBD. The pool of resources explicitly allocated by tasks may be completely separate, i.e. MSBuild will not wait until a resource is freed before starting execution of new tasks. Or it may be partially or fully shared to prevent oversubscribing the machine. In general, `ReleaseCores` may cause a waiting task to transition to a Ready state. Conversely, completing a task or calling `Yield` may unblock a pending `RequestCores` call issued by a task.

## Example 1

In a 16-process build of a solution with 30 projects, 16 worker nodes are launched and begin executing work. Most block on dependencies on projects `A`, `B`, `C`, `D`, and `E`, so they have no running tasks holding resources.

Task `Work` is called in project `A` with 25 inputs. It would like to run as many as possible in parallel. It calls

```C#
int allowedParallelism = BuildEngine9.RequestCores(Inputs.Count); // Inputs.Count == 25
```

and gets up to `16`, the number of cores available to the build overall.

While `A` runs `Work`, projects `B` and `C` run another task `Work2` that also calls `RequestCores` with a high value. Since `Work` in `A` has reserved all cores, the calls in `B` and `C` may return only 1, indicating that the task should not be doing parallel work. Subsequent `RequestCores` calls may block, waiting on `Work` to release cores (or return).

When `Work` returns, MSBuild automatically returns all resources reserved by the task to the pool. At that time blocked `RequestCores` calls in `Work2` may unblock.
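The sequence in this example can be traced with a toy model. The sketch below is not MSBuild's scheduler, and since the real policy is still TBD it encodes just one possible policy as an assumption: a shared pool of explicit grants, plus an implicit core that is always granted on a task's first call even when the pool is exhausted. `ToyCorePool`, `TryRequest`, and `ReleaseAll` are hypothetical names, and `TryRequest` returns 0 where the real `RequestCores` would block.

```C#
using System;
using System.Collections.Generic;
using System.Linq;

// Toy model of one possible core-allocation policy (an assumption, not MSBuild code).
public sealed class ToyCorePool
{
    private readonly int _totalCores;
    private readonly Dictionary<string, int> _grantedByTask = new Dictionary<string, int>();

    public ToyCorePool(int totalCores)
    {
        _totalCores = totalCores;
    }

    // Cores not yet handed out; implicit cores may push the total over the
    // limit, so clamp at zero.
    private int Free => Math.Max(0, _totalCores - _grantedByTask.Values.Sum());

    // Returns the number of cores granted, or 0 where the real API would block.
    public int TryRequest(string task, int requested)
    {
        if (requested < 1)
        {
            throw new ArgumentOutOfRangeException(nameof(requested));
        }

        bool firstCall = !_grantedByTask.ContainsKey(task);
        int grant = Math.Min(requested, Free);
        if (grant == 0 && firstCall)
        {
            grant = 1; // The "implicit" core: a first call never blocks.
        }

        if (grant > 0)
        {
            _grantedByTask.TryGetValue(task, out int held);
            _grantedByTask[task] = held + grant;
        }

        return grant;
    }

    // Models the automatic return of all cores when a task's Execute() returns.
    public void ReleaseAll(string task)
    {
        _grantedByTask.Remove(task);
    }
}
```

With a pool of 16, `TryRequest("Work", 25)` yields 16, first calls from `Work2` in `B` and `C` each yield 1, and a second call from `B` yields 0 (the real call would block). After `ReleaseAll("Work")`, a request for 4 cores from `B` is granted in full.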

## Implementation

The `RequestCores` and `ReleaseCores` calls are marshaled back to the scheduler via newly introduced `INodePacket` implementations. The scheduler, having full view of the state of the system - i.e. number of build requests running, waiting, yielding, ..., number of cores explicitly allocated by individual tasks using the new API - is free to implement an arbitrary core allocation policy. In the initial implementation the policy will be controlled by a couple of environment variables to make it easy to test different settings.
@@ -218,6 +218,11 @@ public partial interface IBuildEngine8 : Microsoft.Build.Framework.IBuildEngine,
    {
        bool ShouldTreatWarningAsError(string warningCode);
    }
    public partial interface IBuildEngine9 : Microsoft.Build.Framework.IBuildEngine, Microsoft.Build.Framework.IBuildEngine2, Microsoft.Build.Framework.IBuildEngine3, Microsoft.Build.Framework.IBuildEngine4, Microsoft.Build.Framework.IBuildEngine5, Microsoft.Build.Framework.IBuildEngine6, Microsoft.Build.Framework.IBuildEngine7, Microsoft.Build.Framework.IBuildEngine8
    {
        void ReleaseCores(int coresToRelease);
        int RequestCores(int requestedCores);
    }
    public partial interface ICancelableTask : Microsoft.Build.Framework.ITask
    {
        void Cancel();
@@ -218,6 +218,11 @@ public partial interface IBuildEngine8 : Microsoft.Build.Framework.IBuildEngine,
    {
        bool ShouldTreatWarningAsError(string warningCode);
    }
    public partial interface IBuildEngine9 : Microsoft.Build.Framework.IBuildEngine, Microsoft.Build.Framework.IBuildEngine2, Microsoft.Build.Framework.IBuildEngine3, Microsoft.Build.Framework.IBuildEngine4, Microsoft.Build.Framework.IBuildEngine5, Microsoft.Build.Framework.IBuildEngine6, Microsoft.Build.Framework.IBuildEngine7, Microsoft.Build.Framework.IBuildEngine8
    {
        void ReleaseCores(int coresToRelease);
        int RequestCores(int requestedCores);
    }
    public partial interface ICancelableTask : Microsoft.Build.Framework.ITask
    {
        void Cancel();
@@ -354,6 +354,7 @@ public abstract partial class Task : Microsoft.Build.Framework.ITask
        public Microsoft.Build.Framework.IBuildEngine6 BuildEngine6 { get { throw null; } }
        public Microsoft.Build.Framework.IBuildEngine7 BuildEngine7 { get { throw null; } }
        public Microsoft.Build.Framework.IBuildEngine8 BuildEngine8 { get { throw null; } }
        public Microsoft.Build.Framework.IBuildEngine9 BuildEngine9 { get { throw null; } }
        protected string HelpKeywordPrefix { get { throw null; } set { } }
        public Microsoft.Build.Framework.ITaskHost HostObject { get { throw null; } set { } }
        public Microsoft.Build.Utilities.TaskLoggingHelper Log { get { throw null; } }
@@ -199,6 +199,7 @@ public abstract partial class Task : Microsoft.Build.Framework.ITask
        public Microsoft.Build.Framework.IBuildEngine6 BuildEngine6 { get { throw null; } }
        public Microsoft.Build.Framework.IBuildEngine7 BuildEngine7 { get { throw null; } }
        public Microsoft.Build.Framework.IBuildEngine8 BuildEngine8 { get { throw null; } }
        public Microsoft.Build.Framework.IBuildEngine9 BuildEngine9 { get { throw null; } }
        protected string HelpKeywordPrefix { get { throw null; } set { } }
        public Microsoft.Build.Framework.ITaskHost HostObject { get { throw null; } set { } }
        public Microsoft.Build.Utilities.TaskLoggingHelper Log { get { throw null; } }
27 changes: 27 additions & 0 deletions src/Build.UnitTests/BackEnd/BuildRequestEngine_Tests.cs
@@ -74,6 +74,8 @@ public MockRequestBuilder()

            public event BuildRequestBlockedDelegate OnBuildRequestBlocked;

            public event ResourceRequestDelegate OnResourceRequest;

            public void BuildRequest(NodeLoggingContext context, BuildRequestEntry entry)
            {
                Assert.Null(_builderThread); // "Received BuildRequest while one was in progress"
@@ -171,6 +173,11 @@ public void RaiseRequestBlocked(BuildRequestEntry entry, int blockingId, string
                OnBuildRequestBlocked?.Invoke(entry, blockingId, blockingTarget, null);
            }

            public void RaiseResourceRequest(ResourceRequest request)
            {
                OnResourceRequest?.Invoke(request);
            }

            public void ContinueRequest()
            {
                if (ThrowExceptionOnContinue)
@@ -180,6 +187,10 @@ public void ContinueRequest()
                _continueEvent.Set();
            }

            public void ContinueRequestWithResources(ResourceResponse response)
            {
            }

            public void CancelRequest()
            {
                this.BeginCancel();
@@ -256,6 +267,9 @@ private ProjectInstance CreateStandinProject()
        private AutoResetEvent _engineExceptionEvent;
        private Exception _engineException_Exception;

        private AutoResetEvent _engineResourceRequestEvent;
        private ResourceRequest _engineResourceRequest_Request;

        private IBuildRequestEngine _engine;
        private IConfigCache _cache;
        private int _nodeRequestId;
@@ -272,6 +286,7 @@ public BuildRequestEngine_Tests()
            _newRequestEvent = new AutoResetEvent(false);
            _newConfigurationEvent = new AutoResetEvent(false);
            _engineExceptionEvent = new AutoResetEvent(false);
            _engineResourceRequestEvent = new AutoResetEvent(false);

            _engine = (IBuildRequestEngine)_host.GetComponent(BuildComponentType.RequestEngine);
            _cache = (IConfigCache)_host.GetComponent(BuildComponentType.ConfigCache);
@@ -293,6 +308,7 @@ public void Dispose()
            _newRequestEvent.Dispose();
            _newConfigurationEvent.Dispose();
            _engineExceptionEvent.Dispose();
            _engineResourceRequestEvent.Dispose();

            _host = null;
        }
@@ -305,6 +321,7 @@ private void ConfigureEngine(IBuildRequestEngine engine)
            engine.OnRequestResumed += this.Engine_RequestResumed;
            engine.OnStatusChanged += this.Engine_EngineStatusChanged;
            engine.OnEngineException += this.Engine_Exception;
            engine.OnResourceRequest += this.Engine_ResourceRequest;
        }

        /// <summary>
@@ -579,5 +596,15 @@ private void Engine_Exception(Exception e)
            _engineException_Exception = e;
            _engineExceptionEvent.Set();
        }

        /// <summary>
        /// Callback for event raised when resources are requested.
        /// </summary>
        /// <param name="request">The resource request</param>
        private void Engine_ResourceRequest(ResourceRequest request)
        {
            _engineResourceRequest_Request = request;
            _engineResourceRequestEvent.Set();
        }
    }
}
15 changes: 15 additions & 0 deletions src/Build.UnitTests/BackEnd/TargetBuilder_Tests.cs
@@ -1417,6 +1417,21 @@ void IRequestBuilderCallback.ExitMSBuildCallbackState()
        {
        }

        /// <summary>
        /// Empty impl
        /// </summary>
        int IRequestBuilderCallback.RequestCores(object monitorLockObject, int requestedCores, bool waitForCores)
        {
            return 0;
        }

        /// <summary>
        /// Empty impl
        /// </summary>
        void IRequestBuilderCallback.ReleaseCores(int coresToRelease)
        {
        }

        #endregion

        /// <summary>
15 changes: 15 additions & 0 deletions src/Build.UnitTests/BackEnd/TargetEntry_Tests.cs
@@ -978,6 +978,21 @@ void IRequestBuilderCallback.ExitMSBuildCallbackState()
        {
        }

        /// <summary>
        /// Empty impl
        /// </summary>
        int IRequestBuilderCallback.RequestCores(object monitorLockObject, int requestedCores, bool waitForCores)
        {
            return 0;
        }

        /// <summary>
        /// Empty impl
        /// </summary>
        void IRequestBuilderCallback.ReleaseCores(int coresToRelease)
        {
        }

        #endregion

        /// <summary>
15 changes: 15 additions & 0 deletions src/Build.UnitTests/BackEnd/TaskBuilder_Tests.cs
@@ -755,6 +755,21 @@ void IRequestBuilderCallback.ExitMSBuildCallbackState()
        {
        }

        /// <summary>
        /// Empty impl
        /// </summary>
        int IRequestBuilderCallback.RequestCores(object monitorLockObject, int requestedCores, bool waitForCores)
        {
            return 0;
        }

        /// <summary>
        /// Empty impl
        /// </summary>
        void IRequestBuilderCallback.ReleaseCores(int coresToRelease)
        {
        }

        #endregion

        #region IRequestBuilderCallback Members