Implement Cleanup { } Function Block #9900

vexx32 · 2019-06-14T12:41:57Z

PR Summary

Adds a new cleanup keyword and named block for both script commands and advanced functions.

RFC: PowerShell/PowerShell-RFC#207

Changes

Added cleanup {} function block, including all necessary AST changes to ScriptBlockAst, new token type, tokenizer changes, etc.
Modifies pipeline command handling to automatically call any command's Dispose() implementation (including the new cleanup {} block type) immediately after end {}/EndProcessing() is called, so as to dispose of function resources as soon as a function's task in the pipeline is complete.
Added handling to Dispose() method in the CommandProcessorBase to call the Dispose() methods for script cmdlets and functions as well.
Utilised existing IDisposable implementation to call the cleanup {} script block during that code path for script cmdlets. Simple functions (utilising DlrScriptCommandProcessor) already inherit the Dispose() implementation from CommandProcessorBase so a method was added to handle their disposal by calling their cleanup{} blocks.

Cleanup { } behaviour

Success output is silently swallowed in the Cleanup step.
Data sent to other streams will function normally (verbose, information, error, etc.)
The Cleanup step is intended to occupy an unskippable step specifically for logging & cleanup purposes to improve the reliability of script commands.
All errors that occur within a Cleanup block are propagated as per normal function operation.
Un-caught terminating errors that occur during a cleanup {} step will skip the remainder of that Cleanup block but will not prevent subsequent cmdlets' own Cleanup {} steps from being invoked, nor interrupt any disposal of the overall pipeline.
If a script cmdlet or function is disposed from a different thread, the method will instead invoke the method via an event triggered on the original thread, to avoid issues with scripts being invoked on the wrong thread.

PR Context

For functions and script cmdlets that utilise pipeline input, there is currently no possible way to reliably utilise a resource that should be timely disposed. Resources can be disposed by the pipeline, but typically this does not occur until the completion of all commands in the pipeline.

For scripts and modules that interact with external resources, this can still become problematic. This PR attempts to address this gap and bring script cmdlets closer to parity with compiled cmdlets, which are perfectly capable of simply implementing IDisposable and taking necessary actions there.

Fix #6673

EditorSyntax PR: PowerShell/EditorSyntax#186

Open / Complicated Questions

What is the desired behaviour when a terminating error is thrown from both cleanup{} and another function block (i.e., a terminating error is thrown during begin, process, or end, and when cleanup is invoked, it also throws a terminating error).
- Current behaviour in the PR: surface the error from cleanup (this is the same as what would happen in C# if you throw both from try and finally, from what I can tell, in that you can only catch / see the error from the finally), but both errors still surface in $error.
Locking / critical code section behaviour (see @rjmholt's concerns in Implement Cleanup { } Function Block #9900 (comment))

Additional Asks

Add tests for cmdlets that implement IDisposable to verify their current and future behaviour

PR Checklist

src/Microsoft.PowerShell.Commands.Utility/commands/utility/ImplicitRemotingCommands.cs

BrucePay · 2019-06-29T00:03:03Z

As this is a change to the core language it should go through the RFC process.

vexx32 · 2019-06-29T14:36:11Z

@BrucePay please see PowerShell/PowerShell-RFC#207 😄

vexx32 · 2021-03-26T12:28:32Z

This is not meant to be an automatic thing to release resources. The code must still be written to do so. But if such code cannot be guaranteed to be run without users accidentally cancelling it out of impatience, there isn't much point in writing it, is there?

daxian-dbw · 2021-03-26T19:20:19Z

This requirement (guarantee a block of code will be run to clean up state, etc) is the reason this PR exists at all, and there really isn't much point for the feature if it doesn't behave as expected out of the box.

I don't agree with this. As I mentioned above, the very first time I heard about an ask for this, is to have a finally-like thing that covers all Begin/Process/End blocks, so that terminating errors thrown from any of those blocks can be handled without needing try/finally redundantly in those blocks with the same or similar cleanup code. So even the Clean block can be interrupted by Ctrl+c, it's meaningful and useful because it makes handling terminating errors easier for functions.

When it comes to Ctrl+c and the interactive scenario, the question is "does the difference between 'guaranteed' and 'best-effort' really matters practically". After all, the finally block is interruptible since it was introduced to the language, and I personally don't know of a real world scenario where it's proven to be not sufficient (of course, @joeyaiello would know better, and I will definitely discuss with him).

As you brought up, after suspending a stopping pipeline, the finally block cannot be interrupted by subsequent Ctrl+c. From the implementation perspective, this is because ConsoleHost today simply ignores the signal when it finds there is already one stopping thread on-going, and thus the _pipelineStopper.IsStopping cannot be flipped back to true. However, in case that the stop is initiated through API PowerShell.Stop, then it's possible to call PowerShell.Stop again on a different thread to stop the finally block. Nevertheless, the behavior of finally with regarding to Ctrl+c is inconsistent today because of how ConsoleHost handles the signal.

For now, I will put aside the discussion about how Clean should handle Ctrl+c (need some time to talk with other folks on team about it), and will focus more on how to fit the Clean in the code as a peer of Begin/Process/End.

iSazonov · 2021-03-26T19:20:46Z

But if such code cannot be guaranteed to be run without users accidentally cancelling it out of impatience, there isn't much point in writing it, is there?

What's worse than what's happening now could happen? :-)))

You are ignoring the fact that this is a shell and it should behave like a shell as indicated above by @JamesWTruher.

For some reason, you think PowerShell users are much more stupid than script writers. Aren't they more often than not the same people? :-) And won't there be a lot of modules in which this block will be used incorrectly and work so badly that it should be interrupted?

Again, there is no guarantee that resources will be freed, even in C#. The situation is even worse in PowerShell. In fact, this has not been a tragedy for many years. The formulated problem looks rather contrived.

jborean93 · 2021-03-27T05:22:52Z

I am really +1 to both @vexx32 and @SeeminglyScience points here in that cleanup should really not be user interruptible. The whole purpose of this block is to cleanup any resources that need to be and the best person who knows what needs to be done is the script author not the end user. I don't think that I can add any more arguments to what they've already provided except that I agree with their points and believe the benefits outweigh the disadvantages. I still think having the cleanup block is a really nice feature that I would love to use, I just don't think that it should be somewhat muzzled in functionality like finally is.

The main argument I can see here for being able to interrupt this with ctrl+c is that PowerShell is a shell and that everything should be cancel-able. Why is this argument being applied to a script function and not to binary cmdlets. Using the example in #9900 (comment) we can see that binary modules just plain ignore ctrl+c normally.

Add-Type -OutputAssembly MyModule.dll -TypeDefinition @'
using System;
using System.Management.Automation;
using System.Threading;

namespace MyModule
{
    [Cmdlet("Test" , "Cmdlet")]
    public class TestCmdletCommand : PSCmdlet
    {
        protected override void EndProcessing()
        {
            try
            {
                Console.WriteLine("press ctrl + c");
                Thread.Sleep(5000);
                Console.WriteLine("end try");
            }
            finally {
                Console.WriteLine("press ctrl + c again");
                Thread.Sleep(5000);
                Console.WriteLine("end finally");
            }
        }
    }
}
'@

$module = 'MyModule'
$manifestSplat = @{
    Path              = 'MyModule.psd1'
    NestedModules     = @('MyModule.dll')
    FunctionsToExport = @('Test-Cmdlet')
}
New-ModuleManifest @manifestSplat

Import-Module .\MyModule.psd1

Test-Cmdlet

If the argument is that people who are writing binary modules should know what they are doing and thus not create situations where this occurs then I feel you are wrong. I've seen varying qualities in both PowerShell scripts and compiled code, just because something is written in C# doesn't mean it's better quality code. What about running a process that hooks into ctrl+c and completely ignores the signal that is sent to it causing the shell to hang anyway. Removing this functionality due to the risk of poor programming just seems wrong and is hampering good functionality without addressing the problem at hand.

One of the reasons why I avoid binary cmdlets is because I dislike having to deal with compiling code, shipping multiple dlls, being harder to debug as easily, and not not being able to unload an assembly. Script functions are so much simpler to write and definitely has a lower barrier for entry. By having this feature you are in fact making it possible to write better script functions now that they can cleanup resources they open in some of the more exceptional cases.

Ultimately this is a really nice feature to have and if the powers that be wish to make it interruptible I would be disappointed but still excited to be able to use the cleanup block.

vexx32 · 2021-03-28T03:00:15Z

Thanks for your input @jborean93!

Sorry for making this a bit long; it's hard to keep things short when it feels like there are multiple ongoing conversations at once, but I have tried to keep each bit as brief as I could manage.

I think to some degree the conversation here really boils down to this point:

When it comes to Ctrl+C and the interactive scenario, the question is "does the difference between 'guaranteed' and 'best-effort' really matter practically"

And there isn't really one answer, there's a few.

For automation scenarios: the difference is largely nonexistent, except if the automation platform supports halting the running session abruptly.
For interactive scenarios, working with functions that may be handling IO handles, COM objects, alternate screen buffers, certain disposable objects (which may be better served by a separate, more targeted solution)... the difference can matter quite a bit.

The other point being discussed here is concerns around letting folks run non-interruptible code. Most of the counterpoints here make it sound like a non-interruptible section of code is an exceptional case. At least in my experience using PowerShell, it's really not unusual at all. Ctrl+C is never as responsive as we'd all like, I'm sure. For me this really boils down to two points:

Ignoring Ctrl+C isn't really that rare anyway, and
In my opinion, having some form of provision for these use cases is exceedingly useful overall, even if the total number of cases where it's strictly required is not immensely high.

If cleanup{} doesn't provide a guarantee that necessary code will be run, I think folks that need that to work will simply turn back to using their C# solutions as Patrick mentioned and their code will end up blocking anyway, so those concerns are a bit moot to me. If folks want to block the pipeline to get things done, they can and will do it regardless. This simply enables them to do it in a bit less haphazard manner than they'd otherwise need to use.

@iSazonov

Again, there is no guarantee that resources will be freed, even in C#. The situation is even worse in PowerShell. In fact, this has not been a tragedy for many years. The formulated problem looks rather contrived.

(emphasis is mine)

That's precisely why I wrote the code this way and spent all the effort making it a bit easier to handle from PowerShell. As you say, the situation in PowerShell for properly cleaning up resources or state, is quite a bit worse than C# at the moment. This provides an avenue to rectifying that disparity.

Whether or not you think it's a problem largely depends on the kinds of code you interact with on a daily basis. Perhaps our spheres of operation are a bit far apart here, but I tend to end up moving to C# because PowerShell simply doesn't provide what I'm looking for these days.

mattpwhite · 2021-03-30T17:32:01Z

I don't have much to add beyond strongly echoing @vexx32 here. This is a feature that makes dealing with things that require deterministic cleanup in PowerShell significantly easier than it is currently. Making it interruptible greatly limits the utility of the feature, to the point that I'm not sure I can think of a case where we'd actually employ it. It would become one more PowerShell feature that would require a nuanced explanation to people who are unfamiliar with PowerShell's numerous idiosyncrasies. The cognitive load of all these gotchas is such that it is extraordinarily difficult to teach anyone, from an experienced .NET developer to a sysadmin dabbling in some scripting, how to do anything in a robust and reliable way in PowerShell. All these efforts to make bad code (e.g. a cleanup block that did some huge amount of work and would block an interactive shell for a meaningful amount of time) have fewer negative consequences end up making it fantastically difficult to write good code.

It's far easier to explain "ctrl+c doesn't interrupt cleanup, so be careful about what you do there" to someone than it is to explain that "you shouldn't use cleanup unless your scenario is one where interrupting cleanup may be OK, consider the implications of that at any point in your whole function".

daxian-dbw · 2021-04-09T20:51:19Z

For now, I will put aside the discussion about how Clean should handle Ctrl+c (need some time to talk with other folks on team about it), and will focus more on how to fit the Clean in the code as a peer of Begin/Process/End.

I now have the prototype for an alternative design available at #15177. It's been rebased to the master, with all existing tests passed. The major change is around 2 aspects:

Moved the invocation of Clean out of Dispose, and added a pass in SynchronousExecuteEnumerate to execute Clean blocks after the pipeline execution is done.
- So the Clean blocks run after all End blocks finish when the pipeline execution was successful, which is different from the design in this PR that tries to run a command's Clean right after its End is done.
- It's absolutely doable to run a command's Clean right after its End, but I didn't do it in my prototype for 2 reasons: (1) it introduces more complexity; (2) the benefit is not clear to me. The quote below is from my early response to this topic in Implement Cleanup { } Function Block #9900 (comment):
  
  I... don't agree. Part of the hassle of pipelines in some cases is that resources from an upstream command cannot be disposed early enough currently. As an offhand example, the case of Get-Content $file | .... | Set-Content $file (I think there may have been a specific fix for this not that long ago, actually? Don't remember if this actually currently works or not, but it didn't for a very, very long time).
  
  It's not clear to me how much it would help to solve that problem by running Clean right after each End. For the Get-Content $file | .... | Set-Content example, both Get-Content and Set-Content try to open the same file in the Begin block, and thus freeing the file after End of Get-Content won't help. It may help only if the conflicting resource is used only in the End block of the downstream functions.
Focused on the error handling of the Clean block. The proposal is captured in details at Add clean block to script block as a peer to begin, process, and end to allow easy resource cleanup #15177 (comment), and is implemented in the prototype. Any feedback to this proposal is welcome.

The prototype doesn't do anything about Ctrl+c handling, since we don't yet have an agreement there. I haven't gone through all needed changes in steppable pipeline yet. We will need the committee's review on the error handling proposal, and if it is accepted, the next step will be writing a lot tests for the error handling. Also, we will need the committee's review on Ctrl+c handling as well.

ghost · 2021-05-08T14:00:07Z

This pull request has been automatically marked as Review Needed because it has been there has not been any activity for 7 days.
Maintainer, please provide feedback and/or mark it as Waiting on Author

ghost · 2021-09-23T02:01:17Z

This pull request has been automatically marked as Review Needed because it has been there has not been any activity for 7 days.
Maintainer, please provide feedback and/or mark it as Waiting on Author

daxian-dbw · 2021-10-14T09:28:06Z

This PR was superseded by PR #15177, which was merged, so closing this PR. Again, thanks @vexx32 for the initial hard work!

This comment has been minimized.

Sign in to view

vexx32 mentioned this pull request Jun 16, 2019

PS implementation of the C# Using statement #9886

Closed

rjmholt closed this Jun 17, 2019

rjmholt reopened this Jun 17, 2019

PoshChan reviewed Jun 17, 2019

View reviewed changes