Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When sharing the terminal with child nodes, wait for the children to terminate before exiting ourselves. #6053

Merged
merged 19 commits into from Feb 25, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
10 changes: 5 additions & 5 deletions scripts/Deploy-MSBuild.ps1
Expand Up @@ -74,11 +74,11 @@ $filesToCopyToBin = @(
FileToCopy "$bootstrapBinDirectory\Microsoft.Managed.targets"
FileToCopy "$bootstrapBinDirectory\Microsoft.Managed.Before.targets"
FileToCopy "$bootstrapBinDirectory\Microsoft.Managed.After.targets"
FileToCopy "$bootstrapBinDirectory\Microsoft.Net.props"
FileToCopy "$bootstrapBinDirectory\Microsoft.NetFramework.CurrentVersion.props"
FileToCopy "$bootstrapBinDirectory\Microsoft.NetFramework.CurrentVersion.targets"
FileToCopy "$bootstrapBinDirectory\Microsoft.NetFramework.props"
FileToCopy "$bootstrapBinDirectory\Microsoft.NetFramework.targets"
FileToCopy "$bootstrapBinDirectory\Microsoft.NET.props"
tmds marked this conversation as resolved.
Show resolved Hide resolved
FileToCopy "$bootstrapBinDirectory\Microsoft.NETFramework.CurrentVersion.props"
FileToCopy "$bootstrapBinDirectory\Microsoft.NETFramework.CurrentVersion.targets"
FileToCopy "$bootstrapBinDirectory\Microsoft.NETFramework.props"
FileToCopy "$bootstrapBinDirectory\Microsoft.NETFramework.targets"
FileToCopy "$bootstrapBinDirectory\Microsoft.VisualBasic.CrossTargeting.targets"
FileToCopy "$bootstrapBinDirectory\Microsoft.VisualBasic.CurrentVersion.targets"
FileToCopy "$bootstrapBinDirectory\Microsoft.VisualBasic.targets"
Expand Down
6 changes: 3 additions & 3 deletions src/Build/BackEnd/BuildManager/BuildManager.cs
Expand Up @@ -629,7 +629,7 @@ void Callback(object state)
}
}

ShutdownConnectedNodesAsync(true /* abort */);
ShutdownConnectedNodes(true /* abort */);
CheckForActiveNodesAndCleanUpSubmissions();
}
}
Expand Down Expand Up @@ -774,7 +774,7 @@ public void EndBuild()
try
{
_noActiveSubmissionsEvent.WaitOne();
ShutdownConnectedNodesAsync(false /* normal termination */);
ShutdownConnectedNodes(false /* normal termination */);
_noNodesActiveEvent.WaitOne();

// Wait for all of the actions in the work queue to drain. Wait() could throw here if there was an unhandled exception
Expand Down Expand Up @@ -1955,7 +1955,7 @@ public void Dispose()
/// Asks the nodeManager to tell the currently connected nodes to shut down and sets a flag preventing all non-shutdown-related packets from
/// being processed.
/// </summary>
private void ShutdownConnectedNodesAsync(bool abort)
private void ShutdownConnectedNodes(bool abort)
{
_shuttingDown = true;

Expand Down
147 changes: 120 additions & 27 deletions src/Build/BackEnd/Components/Communications/NodeProviderOutOfProcBase.cs
Expand Up @@ -25,6 +25,10 @@
using Microsoft.Build.Utilities;

using BackendNativeMethods = Microsoft.Build.BackEnd.NativeMethods;
using Task = System.Threading.Tasks.Task;
using DotNetFrameworkArchitecture = Microsoft.Build.Shared.DotNetFrameworkArchitecture;
using Microsoft.Build.Framework;
using Microsoft.Build.BackEnd.Logging;

namespace Microsoft.Build.BackEnd
{
Expand All @@ -49,6 +53,11 @@ internal abstract class NodeProviderOutOfProcBase
/// </summary>
private const int TimeoutForNewNodeCreation = 30000;

/// <summary>
/// The amount of time to wait for an out-of-proc node to exit.
/// </summary>
private const int TimeoutForWaitForExit = 30000;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think this this is the right timeout? Node creation takes longer than exiting, I would have thought.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It depends if we expect it to be common for nodes to fail to shut down when asked to.
If we think in the common case they should be able to shut down, we can keep the long timeout.
If in the common case they won't shut down when asked to and we don't care about the reason, we can use a lower timeout.


/// <summary>
/// The build component host.
/// </summary>
Expand Down Expand Up @@ -95,9 +104,30 @@ protected void ShutdownConnectedNodes(List<NodeContext> contextsToShutDown, bool
// Send the build completion message to the nodes, causing them to shutdown or reset.
_processesToIgnore.Clear();

// We wait for child nodes to exit to avoid them changing the terminal
// after this process terminates.
bool waitForExit = !enableReuse &&
!Console.IsInputRedirected &&
tmds marked this conversation as resolved.
Show resolved Hide resolved
Traits.Instance.EscapeHatches.EnsureStdOutForChildNodesIsPrimaryStdout;
Forgind marked this conversation as resolved.
Show resolved Hide resolved

Task[] waitForExitTasks = waitForExit && contextsToShutDown.Count > 0 ? new Task[contextsToShutDown.Count] : null;
int i = 0;
var loggingService = _componentHost.LoggingService;
foreach (NodeContext nodeContext in contextsToShutDown)
{
nodeContext?.SendData(new NodeBuildComplete(enableReuse));
if (nodeContext is null)
{
continue;
}
nodeContext.SendData(new NodeBuildComplete(enableReuse));
if (waitForExit)
tmds marked this conversation as resolved.
Show resolved Hide resolved
{
waitForExitTasks[i++] = nodeContext.WaitForExitAsync(loggingService);
}
}
if (waitForExitTasks != null)
{
Task.WaitAll(waitForExitTasks);
}
}

Expand Down Expand Up @@ -138,7 +168,7 @@ protected void ShutdownAllNodes(bool nodeReuse, NodeContextTerminateDelegate ter
{
// If we're able to connect to such a process, send a packet requesting its termination
CommunicationsUtilities.Trace("Shutting down node with pid = {0}", nodeProcess.Id);
NodeContext nodeContext = new NodeContext(0, nodeProcess.Id, nodeStream, factory, terminateNode);
NodeContext nodeContext = new NodeContext(0, nodeProcess, nodeStream, factory, terminateNode);
nodeContext.SendData(new NodeBuildComplete(false /* no node reuse */));
nodeStream.Dispose();
}
Expand Down Expand Up @@ -204,7 +234,7 @@ protected NodeContext GetNode(string msbuildLocation, string commandLineArgs, in
{
// Connection successful, use this node.
CommunicationsUtilities.Trace("Successfully connected to existed node {0} which is PID {1}", nodeId, nodeProcess.Id);
return new NodeContext(nodeId, nodeProcess.Id, nodeStream, factory, terminateNode);
return new NodeContext(nodeId, nodeProcess, nodeStream, factory, terminateNode);
}
}
}
Expand Down Expand Up @@ -242,20 +272,20 @@ protected NodeContext GetNode(string msbuildLocation, string commandLineArgs, in
#endif

// Create the node process
int msbuildProcessId = LaunchNode(msbuildLocation, commandLineArgs);
_processesToIgnore.Add(GetProcessesToIgnoreKey(hostHandshake, msbuildProcessId));
Process msbuildProcess = LaunchNode(msbuildLocation, commandLineArgs);
_processesToIgnore.Add(GetProcessesToIgnoreKey(hostHandshake, msbuildProcess.Id));

// Note, when running under IMAGEFILEEXECUTIONOPTIONS registry key to debug, the process ID
// gotten back from CreateProcess is that of the debugger, which causes this to try to connect
// to the debugger process. Instead, use MSBUILDDEBUGONSTART=1

// Now try to connect to it.
Stream nodeStream = TryConnectToProcess(msbuildProcessId, TimeoutForNewNodeCreation, hostHandshake);
Stream nodeStream = TryConnectToProcess(msbuildProcess.Id, TimeoutForNewNodeCreation, hostHandshake);
if (nodeStream != null)
{
// Connection successful, use this node.
CommunicationsUtilities.Trace("Successfully connected to created node {0} which is PID {1}", nodeId, msbuildProcessId);
return new NodeContext(nodeId, msbuildProcessId, nodeStream, factory, terminateNode);
CommunicationsUtilities.Trace("Successfully connected to created node {0} which is PID {1}", nodeId, msbuildProcess.Id);
return new NodeContext(nodeId, msbuildProcess, nodeStream, factory, terminateNode);
}
}

Expand Down Expand Up @@ -391,7 +421,7 @@ private Stream TryConnectToProcess(int nodeProcessId, int timeout, Handshake han
/// <summary>
/// Creates a new MSBuild process
/// </summary>
private int LaunchNode(string msbuildLocation, string commandLineArgs)
private Process LaunchNode(string msbuildLocation, string commandLineArgs)
{
// Should always have been set already.
ErrorUtilities.VerifyThrowInternalLength(msbuildLocation, nameof(msbuildLocation));
Expand Down Expand Up @@ -490,7 +520,7 @@ private int LaunchNode(string msbuildLocation, string commandLineArgs)
}

CommunicationsUtilities.Trace("Successfully launched {1} node with PID {0}", process.Id, exeName);
return process.Id;
return process;
}
else
{
Expand Down Expand Up @@ -548,7 +578,7 @@ out processInfo
}

CommunicationsUtilities.Trace("Successfully launched {1} node with PID {0}", childProcessId, exeName);
return childProcessId;
return Process.GetProcessById(childProcessId);
}
}

Expand Down Expand Up @@ -582,6 +612,13 @@ private static string GetCurrentHost()
/// </summary>
internal class NodeContext
{
enum ExitPacketState
{
None,
ExitPacketQueued,
ExitPacketSent
}

// The pipe(s) used to communicate with the node.
private Stream _clientToServerStream;
private Stream _serverToClientStream;
Expand All @@ -597,9 +634,9 @@ internal class NodeContext
private int _nodeId;

/// <summary>
/// The process id
/// The node process.
/// </summary>
private int _processId;
private readonly Process _process;

/// <summary>
/// An array used to store the header byte for each packet when read.
Expand Down Expand Up @@ -631,14 +668,14 @@ internal class NodeContext
private Task _packetWriteDrainTask = Task.CompletedTask;

/// <summary>
/// Event indicating the node has terminated.
/// Delegate called when the context terminates.
/// </summary>
private ManualResetEvent _nodeTerminated;
private NodeContextTerminateDelegate _terminateDelegate;

/// <summary>
/// Delegate called when the context terminates.
/// Tracks the state of the packet sent to terminate the node.
/// </summary>
private NodeContextTerminateDelegate _terminateDelegate;
private ExitPacketState _exitPacketState;

/// <summary>
/// Per node read buffers
Expand All @@ -648,20 +685,18 @@ internal class NodeContext
/// <summary>
/// Constructor.
/// </summary>
public NodeContext(int nodeId, int processId,
public NodeContext(int nodeId, Process process,
Stream nodePipe,
INodePacketFactory factory, NodeContextTerminateDelegate terminateDelegate)
{
_nodeId = nodeId;
_processId = processId;
_process = process;
_clientToServerStream = nodePipe;
_serverToClientStream = nodePipe;
_packetFactory = factory;
_headerByte = new byte[5]; // 1 for the packet type, 4 for the body length

_readBufferMemoryStream = new MemoryStream();
_writeBufferMemoryStream = new MemoryStream();
_nodeTerminated = new ManualResetEvent(false);
_terminateDelegate = terminateDelegate;
_sharedReadBuffer = InterningBinaryReader.CreateSharedBuffer();
}
Expand Down Expand Up @@ -749,6 +784,10 @@ public async Task RunPacketReadLoopAsync()
/// <param name="packet">The packet to send.</param>
public void SendData(INodePacket packet)
{
if (IsExitPacket(packet))
{
_exitPacketState = ExitPacketState.ExitPacketQueued;
}
_packetWriteQueue.Add(packet);
DrainPacketQueue();
}
Expand Down Expand Up @@ -816,6 +855,10 @@ private void SendDataCore(INodePacket packet)
int lengthToWrite = Math.Min(writeStreamLength - i, MaxPacketWriteSize);
_serverToClientStream.Write(writeStreamBuffer, i, lengthToWrite);
}
if (IsExitPacket(packet))
{
_exitPacketState = ExitPacketState.ExitPacketSent;
}
}
catch (IOException e)
{
Expand All @@ -828,6 +871,11 @@ private void SendDataCore(INodePacket packet)
}
}

private static bool IsExitPacket(INodePacket packet)
{
return packet is NodeBuildComplete buildCompletePacket && !buildCompletePacket.PrepareForReuse;
}

/// <summary>
/// Avoid having a BinaryWriter just to write a 4-byte int
/// </summary>
Expand All @@ -842,7 +890,7 @@ private void WriteInt32(MemoryStream stream, int value)
/// <summary>
/// Closes the node's context, disconnecting it from the node.
/// </summary>
public void Close()
private void Close()
{
_clientToServerStream.Dispose();
if (!object.ReferenceEquals(_clientToServerStream, _serverToClientStream))
Expand All @@ -852,6 +900,52 @@ public void Close()
_terminateDelegate(_nodeId);
}

/// <summary>
/// Waits for the child node process to exit.
/// </summary>
public async Task WaitForExitAsync(ILoggingService loggingService)
{
if (_exitPacketState == ExitPacketState.ExitPacketQueued)
{
// Wait up to 100ms until all remaining packets are sent.
// We don't need to wait long, just long enough for the Task to start running on the ThreadPool.
await Task.WhenAny(_packetWriteDrainTask, Task.Delay(100));
}
if (_exitPacketState == ExitPacketState.ExitPacketSent)
{
CommunicationsUtilities.Trace("Waiting for node with pid = {0} to exit", _process.Id);

// .NET 5 introduces a real WaitForExitAsyc.
// This is a poor man's implementation that uses polling.
int timeout = TimeoutForWaitForExit;
int delay = 5;
while (timeout > 0)
{
bool exited = _process.WaitForExit(milliseconds: 0);
if (exited)
{
return;
}
timeout -= delay;
await Task.Delay(delay).ConfigureAwait(false);

// Double delay up to 500ms.
delay = Math.Min(delay * 2, 500);
}
}

// Kill the child and do a blocking wait.
loggingService?.LogWarning(
BuildEventContext.Invalid,
null,
BuildEventFileInfo.Empty,
"KillingProcessWithPid",
_process.Id);
CommunicationsUtilities.Trace("Killing node with pid = {0}", _process.Id);
tmds marked this conversation as resolved.
Show resolved Hide resolved
tmds marked this conversation as resolved.
Show resolved Hide resolved
tmds marked this conversation as resolved.
Show resolved Hide resolved

_process.KillTree(timeout: 5000);
}

#if FEATURE_APM
/// <summary>
/// Completes the asynchronous packet write to the node.
Expand All @@ -873,17 +967,16 @@ private bool ProcessHeaderBytesRead(int bytesRead)
{
if (bytesRead != _headerByte.Length)
{
CommunicationsUtilities.Trace(_nodeId, "COMMUNICATIONS ERROR (HRC) Node: {0} Process: {1} Bytes Read: {2} Expected: {3}", _nodeId, _processId, bytesRead, _headerByte.Length);
CommunicationsUtilities.Trace(_nodeId, "COMMUNICATIONS ERROR (HRC) Node: {0} Process: {1} Bytes Read: {2} Expected: {3}", _nodeId, _process.Id, bytesRead, _headerByte.Length);
try
{
Process childProcess = Process.GetProcessById(_processId);
if (childProcess?.HasExited != false)
if (_process.HasExited)
{
CommunicationsUtilities.Trace(_nodeId, " Child Process {0} has exited.", _processId);
CommunicationsUtilities.Trace(_nodeId, " Child Process {0} has exited.", _process.Id);
}
else
{
CommunicationsUtilities.Trace(_nodeId, " Child Process {0} is still running.", _processId);
CommunicationsUtilities.Trace(_nodeId, " Child Process {0} is still running.", _process.Id);
}
}
catch (Exception e) when (!ExceptionHandling.IsCriticalException(e))
Expand Down
1 change: 1 addition & 0 deletions src/Build/BackEnd/TaskExecutionHost/TaskExecutionHost.cs
Expand Up @@ -24,6 +24,7 @@
using Microsoft.Build.Utilities;

using TaskItem = Microsoft.Build.Execution.ProjectItemInstance.TaskItem;
using Task = System.Threading.Tasks.Task;

namespace Microsoft.Build.BackEnd
{
Expand Down
1 change: 1 addition & 0 deletions src/Build/Microsoft.Build.csproj
Expand Up @@ -127,6 +127,7 @@
<Compile Include="..\Shared\NodeBuildComplete.cs" />
<Compile Include="..\Shared\NodeShutdown.cs" />
<Compile Include="..\Shared\NamedPipeUtil.cs" />
<Compile Include="..\Shared\ProcessExtensions.cs" />
<Compile Include="..\Shared\TaskLoggingHelper.cs">
<Link>BackEnd\Components\RequestBuilder\IntrinsicTasks\TaskLoggingHelper.cs</Link>
<ExcludeFromStyleCop>True</ExcludeFromStyleCop>
Expand Down
3 changes: 3 additions & 0 deletions src/Build/Resources/Strings.resx
Expand Up @@ -1876,4 +1876,7 @@ Utilization: {0} Average Utilization: {1:###.0}</value>
<data name="NoProjectCachePluginFoundInAssembly" xml:space="preserve">
<value>MSB4270: No project cache plugins found in assembly "{0}". Expected one.</value>
</data>
<data name="KillingProcessWithPid" xml:space="preserve">
<value>Killing process with pid = {0}.</value>
</data>
</root>
5 changes: 5 additions & 0 deletions src/Build/Resources/xlf/Strings.cs.xlf

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.