Get-Content should release its file handle as early as possible #10025
The pipeline executes BeginProcessing for every command in the pipeline, then ProcessRecord for every command, and then EndProcessing for every command. It is not clear what we could fix here.
Aye, I suspect the main opportunity here would be doing this for the … Potentially, …
If every command in the pipeline always writes an object for …

```powershell
$one   = { end { 0; Write-Host '1 end called' } }
$two   = { process { $_ } end { Write-Host '2 end called' } }
$three = { process { } end { Write-Host '3 end called'; 0..2 } }
$four  = { process { Write-Host $_ } end { Write-Host '4 end called' } }

& $one | & $two | & $three | & $four
```

```
1 end called
2 end called
3 end called
0
1
2
4 end called
```

In the original issue, the idea was that if you add …
This looks like a workaround but cannot be a general fix, as it causes the accumulation of data that, for large files, will lead to memory exhaustion. Even in other languages, the best practice is to write to an auxiliary file and then rename it over the original, which is safer. I guess this can be achieved with Rename-Item at the end of the pipeline.
Sure, it can be done manually, but it would be nice if PS could handle that behind the scenes, in my opinion. 😄 Also, that still can't be done in a single pipeline, I don't think, since currently Get-Content won't release the handle until the entire pipeline is complete, regardless of the use case. :/
Sounds more like a need for an optional switch on …
That would be OK for small to medium files, but if you're writing large files it will probably need to buffer to a temporary file, or we'll have huge memory usage. 😕
But that's what you're doing by using …
Even with downstream buffering, the operation is dangerous. We should use an intermediate/temporary file and rename it, or use a transactional file system.
I agree. My point was just about doing "the work" downstream. Writing to a temp file and then moving it, and/or doing transactional work, would be the right way to go.
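As a sketch, that temp-file-and-rename pattern can be done manually today; the file name and transform step here are hypothetical illustrations:

```powershell
# Stream through a temporary file, then replace the original once the
# pipeline statement has finished and Get-Content has released its handle.
$temp = New-TemporaryFile
Get-Content ./data.txt |
    ForEach-Object { $_.ToUpper() } |
    Set-Content -LiteralPath $temp.FullName
Move-Item -LiteralPath $temp.FullName -Destination ./data.txt -Force
```

The Move-Item runs as a separate statement, after the pipeline has been disposed, so it does not hit the held-handle problem that a single-pipeline version would.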
That's sooo close to …
(nb. small risk of using …)
Some filesystems support transactions, but unfortunately PS's provider does not. But yeah, writing to a temp file would be a good solution if the file is still open when Set-Content tries to get a write handle.
The code is in the repo, but it was commented out at porting time (Windows only).
This same class of problem applies to … I do think this is a bit of a trap for new users (and people like me, who forget this happens), because on its face this pattern is perfectly reasonable:

```powershell
Get-Something | Format-Something | Write-Something
```

But because file cmdlets (at least …) hold their handles open until the pipeline completes, you instead have to write:

```powershell
(Get-Something) | Format-Something | Write-Something
```

Which is both non-intuitive and a little mysterious if you're not familiar with why that pattern is being used.
This issue has not had any activity in 6 months. If this is a bug, please try to reproduce on the latest version of PowerShell and open a new issue referencing this one if it is still a blocker for you.
This issue has been marked as "No Activity" as there has been no activity for 6 months. It has been closed for housekeeping purposes. |
Summary of the new feature/enhancement
Currently, a pipeline like this is impossible:
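(The original example isn't preserved here, but from the description it is presumably a same-file read/write pipeline along these lines; the file name and transform are hypothetical:)

```powershell
# Fails: Get-Content still holds its read handle on log.txt when
# Set-Content tries to open the same file for writing.
Get-Content ./log.txt | ForEach-Object { $_ -replace 'foo', 'bar' } | Set-Content ./log.txt
```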
This is due to Get-Content not releasing its file handle until the completion of its pipeline. Adding parentheses around the Get-Content call to force it to run to completion and collect output before proceeding does work, but is a bit clunky, and there's no reason the same thing shouldn't also work with -Raw. Currently it doesn't, because the file handle is still not released until the pipeline's completion, despite all the data being read at once. There are other caveats to using -Raw, but this would at least enable simple operations with less clunky syntax.

Proposed technical implementation details (optional)
Something that will help alleviate the issue a little is some of the changes to command disposal behaviour in #9900, which cause commands to be disposed as soon as their role in the pipeline is fully completed (after EndProcessing / end {} is completed). Beyond that, the rest of the changes can be implemented in Get-Content itself for -Raw, and it may be worth looking into whether there are other avenues for alleviating the issue.
For example, Set-Content could have a fallback behaviour whereby, if it can't get a write handle on the file, it instead writes to a temporary file, and then during EndProcessing() or in a Dispose() step copies that file over the original.
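For comparison, the parenthesized workaround mentioned in the summary looks like this (file name and transform hypothetical):

```powershell
# The parentheses force Get-Content to run to completion and buffer its
# output, releasing the read handle before Set-Content opens the file.
(Get-Content ./log.txt) | ForEach-Object { $_ -replace 'foo', 'bar' } | Set-Content ./log.txt
```

It works, but it buffers the entire file in memory, which is exactly the trade-off the temp-file fallback is meant to avoid for large files.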