New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
await Promise with Timer behavior regress from previous version #21515
Comments
If you change the file which is forked ( function waitMs(ms) {
return new Promise(c => setTimeout(c, ms))
}
async function sendAck() {
let limit = 10;
let latestPromise;
for (let i = 0; i < limit; i++) {
try {
latestPromise = (latestPromise || waitMs(1000)).then(() => {
console.log(`Sent ack ${i} ${Date.now()}`);
return waitMs(1000);
});
} catch (error) {
console.log(error);
}
}
}
sendAck(); The issue still repro's. So I think this is more an issue of Output of Electron forking a NodeJS process:
Output of a NodeJS process forking a NodeJS process:
|
It may even be as specific as an issue with Promises in conjunction with timeouts. If I change function syncCodeThatTakesALongTime() {
for (let i = 0; i < 10000000; i++) {
Math.random() * Math.random() * Math.random() * Math.random();
Math.random() * Math.random() * Math.random() * Math.random();
Math.random() * Math.random() * Math.random() * Math.random();
Math.random() * Math.random() * Math.random() * Math.random();
Math.random() * Math.random() * Math.random() * Math.random();
}
}
function waitMs(ms) {
return new Promise((resolve, reject) => {
syncCodeThatTakesALongTime();
resolve();
})
}
async function sendAck() {
let limit = 10;
let latestPromise;
for (let i = 0; i < limit; i++) {
try {
latestPromise = (latestPromise || waitMs(1000)).then(() => {
console.log(`Sent ack ${i} ${Date.now()}`);
return waitMs(1000);
});
} catch (error) {
console.log(error);
}
}
}
sendAck(); |
One more interesting thing. If I rewrite function waitMs(ms) {
return new Promise((resolve, reject) => {
let id = setInterval(() => {
/**
* Clearing interval here causes the bug to reproduce.
*/
// clearInterval(id);
resolve(id);
/**
* Clearing interval here causes the bug to reproduce.
*/
// clearInterval(id);
}, ms);
});
}
async function sendAck() {
let limit = 10;
for (let i = 0; i < limit; i++) {
try {
let id = await waitMs(1000);
/**
* Clearing interval here does not cause the bug to reproduce.
*/
clearInterval(id);
console.log(`Sent ack ${i} ${Date.now()}`);
} catch (error) {
console.log(error);
}
}
}
sendAck(); |
Sorry, I know I keep adding additional info, but I'm just trying to give as many hints as I can. If you do a function waitMs(ms) {
return new Promise(async (resolve, reject) => {
setTimeout(resolve, ms);
});
}
async function sendAck() {
let limit = 5000;
for (let i = 0; i < limit; i++) {
try {
await waitMs(500);
console.log(`Sent ack ${i}`);
} catch (error) {
console.log(error);
}
}
}
setInterval(function() {
console.log("timer that keeps nodejs processing running");
}, 10);
sendAck(); Weird! |
@MRayermannMSFT no, i actually seriously appreciate all the info! i'll try to tackle this. For some context on our side: |
Hey @codebytere , do you have any thoughts on when y'all might be getting to this? Also, I tried disabling/enabling EDIT: after double checking, it doesn't look like Electron is letting me change the value of ELECTRON_RUN_AS_NODE. |
@codebytere Do you have any update to share? This issue is the only blocking issue to us upgrading to Electron 7, which will fix a lot of our own bugs, and we'd really like to do that. |
i can try to get to this soon - i see from above though that there are some potential workarounds, though. Do none of those work enough to unblock this? |
I don't have any breaking news on this, but to report in for completeness: I did a little poking around on this yesterday to track down when this broke, in the hopes that it might be something simple. Unfortunately, the breakpoint is exactly what one would've expected -- the repro exits prematurely in 3ddd917 (Dec 4 2018) but works fine in 83d951d (Nov 30 2018) -- that is, it broke when we upgraded to 70.0.3538.110 for Electron 5, most likely with the node bump in 0adbcfd |
Agreed! As long as we're using ELECTRON_RUN_AS_NODE, we don't even need child_process.fork(). $ ELECTRON_RUN_AS_NODE=1 electron awaitPromise.js Is enough to trigger the bug, leaving As a baseline, $ node awaitPromise.js works correctly. |
@ckerr Is there a way to disable ELECTRON_RUN_AS_NODE for a child process? I wasn't able to do it, but maybe I was just doing it wrong? |
@MRayermannMSFT no - not at present. It's hardcoded into our logic and not overridable. |
I tried locally reverting back our 4-2-x --> 5-1-x changes to So this afternoon has managed to cross off node_main.cc and child_process off of the suspect list. I'd be happier knowing what it is rather than what it isn't, but we can keep digging |
Extra fun: I tried patching a stock version of Node with Electron's patchset, and building that as a standalone version of Node, and testing the repro with that. That works just fine. So it's not triggered by just our patches, nor just by our node_main.cc, by themselves... |
The mystery deepens! Please let us know if there's anything we can do to help @ckerr , and thank you for the continued updates. 🙂 |
I'm having quite a severe problem with async/await in Electron (even without timers and I've tried versions 4 to 8 and they all have the same issue) I just added this code to globalShortcut.register("F6", async () => {
console.log("F6 is pressed");
console.time("anAsyncFunctionThatResolvesImmediately");
await anAsyncFunctionThatResolvesImmediately();
console.timeEnd("anAsyncFunctionThatResolvesImmediately");
});
async function anAsyncFunctionThatResolvesImmediately() {
return;
} This is the console output when I press F6 multiple times (it takes an extremely long time to finish)
Update: This problem seems to be gone when you're out of Fix: globalShortcut.register("F6", () => {
setTimeout(handlePress, 0) // To get out of this function's context
});
async function handlePress() {
console.log("F6 is pressed");
console.time("anAsyncFunctionThatResolvesImmediately");
await anAsyncFunctionThatResolvesImmediately();
console.timeEnd("anAsyncFunctionThatResolvesImmediately");
}
async function anAsyncFunctionThatResolvesImmediately() {
return;
} However, I'd be lying if I said I understood what's happening here :) Can someone please explain WTF is going on here? |
I'm not sure but it could be related with #19468 - it seems to freeze the whole Electron app. |
This also repro's in Electron 8.0.2. |
@ckerr any update on if the team has made progress on this bug? And for what it's worth, this repro's on Electron 9.0.0-beta.12. |
For anyone interested in a workaround, from my tests it seems that wrapping the This issue is really weird... |
@ckerr is there anything we can do to help y'all with this? Have you been thinking about this anymore since Feb? |
@MRayermannMSFT yes, I got blocked on this and have come back it; sorry for the silence. I may have found the source of the issue and will report back tomorrow or possibly later today after I've done a little more digging to test my theory. |
No worries! I am excited to learn what it is. This is obviously one of those extra special bugs. 😋 |
I'll write up the status of the bug hunt in the next comment -- and I still have some more digging to do there -- but @MRayermannMSFT first I want to get you unblocked in a way that doesn't rely on e.g. setImmediate. Here's a workaround that WfM in testing. The issue arises iff control flow returns to node at any point when there's not a pending timeout. This happens in your code when first timeout returns a new If you create a sibling Promise-based timeout that runs parallel to the real one but fires out of step with it, that ensures that there's always at least one pending timeout and that's enough to keep everything alive. I paraphrased your const waitMs = (msec) => new Promise((resolve) => setTimeout(resolve, msec));
const intervalMsec = 1000;
const numIterations = 10;
let iteration = 0;
function doWork() {
++iteration;
console.log(`doing task #${iteration}`);
}
async function sendAck() {
let workPromise;
for (let i = 0; i < numIterations; i++) {
try {
workPromise = (workPromise || waitMs(intervalMsec)).then(() => {
doWork();
return waitMs(intervalMsec);
});
} catch (error) {
console.log(error);
}
}
}
sendAck(); When I run this with
Here's that code with a sibling Promise added: const waitMs = (msec) => new Promise((resolve) => setTimeout(resolve, msec));
const intervalMsec = 1000;
const numIterations = 10;
let iteration = 0;
function doWork() {
++iteration;
console.log(`doing task #${iteration}`);
}
async function sendAck() {
// tracingPromise works around a bug introduced in Electron 5.
// It's here solely to keep workPromise alive. Can be removed when
// https://github.com/electron/electron/issues/21515 is fixed upstream
let tracingPromise;
let workPromise;
for (let i = 0; i < numIterations; i++) {
try {
// NB: tracingPromise must not fire at the same time as workPromise,
// which is why the initial interval is halved
tracingPromise = (tracingPromise || waitMs(intervalMsec/2)).then(() => {
return waitMs(intervalMsec);
});
workPromise = (workPromise || waitMs(intervalMsec)).then(() => {
doWork();
return waitMs(intervalMsec);
});
} catch (error) {
console.log(error);
}
}
}
sendAck(); Under the same test conditions, this makes it all the way to
|
The issue is arising in timer code that was introduced in Node.js 12. The Node.js function Local<Function> cb = env->timers_callback_function();
MaybeLocal<Value> ret;
Local<Value> arg = env->GetNow();
// This code will loop until all currently due timers will process. It is
// impossible for us to end up in an infinite loop due to how the JS-side
// is structured.
do {
TryCatchScope try_catch(env);
try_catch.SetVerbose(true);
ret = cb->Call(env->context(), process, 1, &arg);
} while (ret.IsEmpty() && env->can_call_into_js());
... // code omitted for clarity
// To allow for less JS-C++ boundary crossing, the value returned from JS
// serves a few purposes:
// 1. If it's 0, no more timers exist and the handle should be unrefed
// 2. If it's > 0, the value represents the next timer's expiry and there
// is at least one timer remaining that is refed.
// 3. If it's < 0, the absolute value represents the next timer's expiry
// and there are no timers that are refed.
int64_t expiry_ms =
ret.ToLocalChecked()->IntegerValue(env->context()).FromJust();
Where function processTimers(now) {
debug('process timer lists %d', now);
nextExpiry = Infinity;
let list;
let ranAtLeastOneList = false;
while (list = timerListQueue.peek()) {
if (list.expiry > now) {
nextExpiry = list.expiry;
return refCount > 0 ? nextExpiry : -nextExpiry;
}
if (ranAtLeastOneList)
runNextTicks();
else
ranAtLeastOneList = true;
listOnTimeout(list, now);
}
return 0;
} As mentioned in the previous comment, the issue arises when the last timeout has fired and there is currently no timeout but there is one that will be created inside a Promise So what's happening is:
That's the state of the bug. Next, I need to find out why this only happens in Electron-running-as-Node.js and not in Node.js itself. Two possible theories: (1) This is a bug in Node.js too, but some other handle reference somewhere keeps things alive long enough to work, or maybe (2) calling |
Here's the one-liner that fixes itdiff --git a/atom/app/node_main.cc b/atom/app/node_main.cc
index c4dde00f9..98dd52dee 100644
--- a/atom/app/node_main.cc
+++ b/atom/app/node_main.cc
@@ -52,6 +52,7 @@ int NodeMain(int argc, char* argv[]) {
// Initialize gin::IsolateHolder.
JavascriptEnvironment gin_env(loop);
+ gin_env.isolate()->SetMicrotasksPolicy(v8::MicrotasksPolicy::kExplicit);
// Explicitly register electron's builtin modules.
NodeBindings::RegisterBuiltinModules(); ExplanationFollowing where the last comment left off, here's how
The important part is that bool run_microtasks =
microtask_queue && microtask_queue->size() &&
!microtask_queue->HasMicrotasksSuppressions() &&
microtask_queue->microtasks_policy() == v8::MicrotasksPolicy::kAuto; This code block is slightly different in Node 12 than Electron 5 because the former uses v8 bool run_microtasks =
default_microtask_queue()->size() &&
!default_microtask_queue()->HasMicrotasksSuppressions() &&
handle_scope_implementer()->microtasks_policy() ==
v8::MicrotasksPolicy::kAuto;
explicit HandleScopeImplementer(Isolate* isolate)
: isolate_(isolate),
spare_(nullptr),
call_depth_(0),
microtasks_policy_(v8::MicrotasksPolicy::kAuto),
last_handle_before_deferred_block_(nullptr) {
} We override that in Electron 5's if (browser_env_ == BROWSER) {
// SetAutorunMicrotasks is no longer called in node::CreateEnvironment
// so instead call it here to match expected node behavior
context->GetIsolate()->SetMicrotasksPolicy(v8::MicrotasksPolicy::kExplicit);
} else {
// Node uses the deprecated SetAutorunMicrotasks(false) mode, we should
// switch to use the scoped policy to match blink's behavior.
context->GetIsolate()->SetMicrotasksPolicy(v8::MicrotasksPolicy::kScoped);
} For its part, Node 12 sets the value to void SetIsolateUpForNode(v8::Isolate* isolate) {
isolate->AddMessageListenerWithErrorLevel(
OnMessage,
Isolate::MessageErrorLevel::kMessageError |
Isolate::MessageErrorLevel::kMessageWarning);
isolate->SetAbortOnUncaughtExceptionCallback(ShouldAbortOnUncaughtException);
isolate->SetMicrotasksPolicy(MicrotasksPolicy::kExplicit);
isolate->SetFatalErrorHandler(OnFatalError);
isolate->SetAllowWasmCodeGenerationCallback(AllowWasmCodeGenerationCallback);
isolate->SetPromiseRejectCallback(task_queue::PromiseRejectCallback);
v8::CpuProfiler::UseDetailedSourcePositionsForProfiling(isolate);
}
|
Nice! I think I'm going to need to read through all of this 2 or 3 more times to fully understand things. Great write-up and thanks again for returning to this. |
I just tested against Electron 8.2.4. I can confirm that our issue no longer repros. Many thanks again to @ckerr and @codebytere! 🥰 |
Preflight Checklist
Issue Details
Expected Behavior
We have been using the piece of code to perform asynchronous operations.
It would pause the execution of the code path containing it for about 1 second and resume it then. This has been working well for a long time.
Actual Behavior
Since we upgrade to Electron 7, in forked node processes, only the first several timer callbacks are called on time, while the rest of them are called extremely late for unknown reason.
Suppose we call the above piece of code for 10 times in a loop. Only the first two promises are created and resolved correctly. In addition, the process exits for unknown reason. In our production code, since there are other things going on (not computationally intensive) which keeps the process alive, we are seeing the 3rd and the rest of the Timers being created and invoked unreasonably late (around 12s after the 2nd one resolves).
The output in the renderer console is
To Reproduce
https://github.com/JasonYeMSFT/electron-quick-start/tree/bug/await-Promise
To run the repro steps
Screenshots
Additional Information
Electron 4.2.5 behaves correctly for exactly the same code in this repo. The behavior doesn't reproduce in renderer process. Also it doesn't seem to be a nodejs problem because I can run the renderer.js directly with node on the same version and it behaves correctly.
I did some extra experiment and it turns out that
setInterval
is producing much more stable behaviors compared tosetTimeout
(update, I was wrong,setInterval
also has the weird behavior).The text was updated successfully, but these errors were encountered: