-
-
Notifications
You must be signed in to change notification settings - Fork 203
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow passing generators to the std*
options
#693
Conversation
This is super cool and very useful 🎉 |
lib/stdio/generator.js
Outdated
// Therefore, there is no need to allow Node.js or web transform streams. | ||
// The `highWaterMark` is kept as the default value, since this is what `childProcess.std*` uses. | ||
// We ensure `objectMode` is `false` for better buffering. | ||
// Chunks are currently processed serially. We could add a `concurrency` option to parallelize in the future. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer a /* */
comment as it makes it easier to edit it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in daf5b9c
lib/stdio/generator.js
Outdated
// - The readable side propagate the generator's error | ||
// In order for the later to win that race, we need to wait one microtask. | ||
const destroyPassThrough = callbackify(async error => { | ||
await setTimeout(0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this use queueMicrotask
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Waiting for one microtask (queueMicrotask()
) works.
I am waiting for one macrotask instead (setTimeout()
) to be on the safe side, in case the Node.js core logic changes in the future. The race condition this is solving relies on multiple core Node.js functions' implementation (I had to dig in to understand what was going on :) ), specifically:
Readable.from()
's error handlingreadable.iterator()
's error handling- stream destroyer
The first two rely on async
/await
, so if the core Node.js source code added an additional await
in that logic for unrelated reasons, it is possible that it would impact that race condition.
To clarify the underlying issue: when the user-supplied generator throws, two errors are propagated in a race condition:
- The
Duplex
writable part (thePassThrough
) is aborted, which throws anAbortError
that is very generic and unhelpful for users - The
Duplex
readable part (wrapped byReadable.from()
) propagated the original error
So we are slowing down the first part by adding some delay to the PassThrough
's destroy()
method. This delays destroying the generator's underlying PassThrough
slightly longer, but that should not be an issue.
This PR includes a unit test for this specific case. However, I think we might want to be on the safe side and wait for one macrotask instead of a microtask? What do you think?
Side note: I have changed the code to use callbacks instead of async
/await
. I always avoid this in general, but in this specific case, the destroy()
method is callback-based, and all we are doing is adding a setTimeout()
, with no possible uncaught exceptions, so it does not seem necessary to introduce the indirection of util.callbackify()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In order for the later to win that race, we need to wait one microtask.
The code comment is incorrect then, since it says "microtask". I would expand a little bit on the comment (as succinctly as possible) there on why a macrosecond is used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done in 1f9205b 👍
The transform can `yield` either a `string` or an `Uint8Array`, regardless of the `chunks` argument's type. | ||
|
||
## Filtering | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Users must be careful hereful here. Their secret could potentially be split over multiple chunks. This example would work better if the chunk was a line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, absolutely, great catch!
I did think of it as well. I have an upcoming PR which will provide with line-wise splitting for the generators chunks
. This will solve that problem.
In general, mapping text (which is most likely the main use case) is quite unhelpful without proper line-splitting. But I'm about to fix this :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the following issue for it: #695
Co-authored-by: Sindre Sorhus <sindresorhus@gmail.com>
Co-authored-by: Sindre Sorhus <sindresorhus@gmail.com>
lib/stdio/generator.js
Outdated
import {setTimeout} from 'node:timers/promises'; | ||
|
||
/* | ||
* Generators can be used to transform/filter standard streams. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Without the *
prefixes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed in 295f63a
Ready for another review! 👍 |
Fixed 👍 |
Fixes #20 (second most upvoted issue). Also fixes #21 and fixes #121 (third most upvoted issue).
This allows transforming a child process' input and ouput by passing an async generator to the
stdin
/stdout
/stderr
/stdio
option.Using generators is much simpler for users than creating web or Node.js
Transform
streams (which are used under-the-hood). Unlike regular functions, it still enables init and flushing logic (before and after thefor await
loop, which can be particularly helpful (and necessary for issues like #121).IMHO There are many use cases for this. I have been needed this feature myself in production several times. For example, I've been needed to use
stdout: 'inherit'
but filtering secret values. This is now easy by using the{ stdout: [filterSecretTransform, 'inherit'] }
option. I think the fact that those are among our most upvoted issues is telling.This PR intentionally currently leaves out two points for the moment:
Uint8Array
. Especially objects. Basically for the underlying stream to useobjectMode: true
. This is the issue at Allow transform streams convert output to an array of any objects, instead of a single string #21.For a fuller description of the feature from a user standpoint, I have created the following documentation.