Orphaned child processes keep running when exiting lerna run with an error #2284

Closed
justinmchase opened this issue Sep 28, 2019 · 38 comments

Comments

@justinmchase

I'm running the command lerna run --stream --parallel start and if one of the apps has an error the others can be left running even as the lerna process exits.

Expected Behavior

I expect all child processes to be killed when lerna exits.

Current Behavior

Lerna orphans processes

Possible Solution

I'm not familiar enough with the code :(

Steps to Reproduce (for bugs)

  1. Create the default project
  2. Create two apps: one is a simple Node.js Express app; for the other, use exit 1 as the "start" script in package.json
  3. lerna run --stream --parallel start

The fact that the nodejs app starts fine but the other app exits is causing the node.exe process to be orphaned.

{
  "packages": [
    "apps/**"
  ],
  "version": "0.0.0"
}

Context

I just want to call npm start and have it start multiple apps in my mono repo. Then press Ctrl+C to stop them all, repeat.

Your Environment

I am running on Windows, and I have seen this behavior in both WSL and Git Bash. One of my apps just has docker-compose up as its script, which starts some containers such as my db. If Docker isn't running, that script crashes and triggers this problem. My app also runs ngrok, which becomes orphaned as well.

Executable Version
lerna --version 3.15.0
npm --version 6.10.2
yarn --version N/A
node --version v12.8.0
OS Version
Windows 10 Pro 1903
@evocateur evocateur added the os: windows Issues that can only be replicated on Windows label Oct 7, 2019
@evocateur
Member

Seems like we should be using signal-exit somewhere in @lerna/child-process to communicate signals from the parent process...
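
A minimal sketch of the idea above, assuming plain Node.js `child_process` rather than execa or the actual internals of @lerna/child-process: the parent remembers each child it spawns and relays SIGINT/SIGTERM to them. `spawnTracked` and `forwardSignals` are hypothetical names used only for illustration, not Lerna or signal-exit APIs.

```javascript
const { spawn } = require('node:child_process');

// Every child started through spawnTracked() is remembered here until it exits.
const children = new Set();

// Hypothetical helper: spawn a command and keep track of it.
function spawnTracked(command, args = [], options = {}) {
  const child = spawn(command, args, { stdio: 'inherit', ...options });
  children.add(child);
  const forget = () => children.delete(child);
  child.once('exit', forget);
  child.once('error', forget); // e.g. the command could not be started
  return child;
}

// Hypothetical helper: relay termination signals from the parent to all
// tracked children. Registering a handler replaces Node's default
// "exit on signal" behaviour, so the parent stays alive until its
// children have exited and the event loop drains.
function forwardSignals(signals = ['SIGINT', 'SIGTERM']) {
  for (const signal of signals) {
    process.on(signal, () => {
      for (const child of children) child.kill(signal);
    });
  }
}
```

Usage would look like `forwardSignals(); spawnTracked(process.execPath, ['server.js']);` where `server.js` is a placeholder. Note that `child.kill()` only reaches the direct child, and on Windows Node terminates the target forcefully instead of delivering a POSIX signal.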

@evocateur evocateur added bug and removed os: windows Issues that can only be replicated on Windows labels Oct 7, 2019
@evocateur
Member

Then again, execa should already be effectively doing this...

So I'm kinda stumped.

@justinmchase
Author

Do you know if detached can ever be true from the cli?
https://github.com/sindresorhus/execa/blob/c8dccf7de66c65f4b9b821ec00871fea386fb35f/index.js#L61

It seems that if it's not detached, then this should not be possible, or it's a lower-level Node.js issue.

@evocateur
Member

No, we never set detached for any invocation of execa (we're still on 1.0, though).

@justinmchase
Author

Are you able to reproduce it from the explanation or should I attach some code? I wonder if npm is itself spawning a detached process...

@evocateur
Member

I don't use docker or Windows, so no, I can't reproduce it.

@cajones314

cajones314 commented Jan 9, 2020

Just an FYI, we are having the same issue with Lerna 3.20. I don't know how to recreate it, but if things slow down I will try to troubleshoot.

@john681611

Same issue seen here

Executable Version
lerna --version 3.20.2
npm --version 6.13.4
yarn --version 1.22.4
node --version v12.16.1
OS Version
Elementary Os 5.1.2

@eyelidlessness

eyelidlessness commented May 5, 2020

I'm seeing this issue with a process that detects interrupt signals and requires confirmation to exit. The process continues to run until it completes, but lerna exits immediately. It seems like these signals should be sent to child processes for them to handle. (In my case I am running the command with a single package scope, no parallel commands, so there's no risk of some sort of partial interrupt.)

@justinmchase
Author

That matches what I was seeing too. I had code in there to handle process.on('SIGINT', ...), and it had a bug in it that seemed to cause processes which may still have an active event loop to hang open.
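
As a hedged illustration of how a child could handle SIGINT without hanging open, here is a minimal sketch for an HTTP server process. `installGracefulShutdown`, the `timeoutMs` default, and the injectable `exit` parameter are assumptions made for this example; they are not taken from the code discussed in this thread.

```javascript
// Hypothetical helper: close `server` on SIGINT/SIGTERM, then exit.
// `exit` is injectable only so the sketch can be exercised without
// actually ending the process.
function installGracefulShutdown(
  server,
  { timeoutMs = 5000, exit = (code) => process.exit(code) } = {}
) {
  let shuttingDown = false;

  function shutdown() {
    if (shuttingDown) return; // ignore repeated signals
    shuttingDown = true;

    // Safety net: if close() never completes (for example because of
    // lingering connections), exit anyway instead of hanging open.
    const timer = setTimeout(() => exit(1), timeoutMs);
    timer.unref(); // the timer alone must not keep the event loop alive

    server.close(() => {
      clearTimeout(timer);
      exit(0);
    });
  }

  process.on('SIGINT', shutdown);
  process.on('SIGTERM', shutdown);
  return shutdown;
}
```

With an existing `http.Server` instance named `server`, calling `installGracefulShutdown(server)` once at startup is enough. The timeout is the important part: a handler that never reaches `exit` is exactly what leaves a process running after its parent is gone.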

@justinmchase
Author

justinmchase commented May 9, 2020

I've dug into this pretty deeply and finally have a simple reproduction. The issue linked above has all of the steps, but there is one extra step for Lerna:

Follow the Steps here:
apollographql/apollo-server#4097

This includes uncommenting the line of code that calls await stop(). Once the server restarts, the child process becomes orphaned; then press Ctrl+C in your terminal to kill Lerna.

Observe in the browser that network requests continue successfully while the process should be shut down entirely.

Restarting Lerna with npm start causes everything to start up fine, but no network requests are received; a background orphaned process is receiving them instead. If you change the behavior of the code at all, the results of the network requests appear unchanged.

So the process chain is something like:

npm -> lerna -> npm -> tsc-watch -> node

The bottommost node process becomes orphaned when it doesn't shut down after receiving the SIGTERM signal from tsc-watch. Later, when lerna is killed, lerna doesn't ensure that all child processes are also shut down.
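
One way to shut down a whole chain like this, sketched under assumptions: on POSIX, spawn the child as a process-group leader (`detached: true`) and signal the group with a negative pid; on Windows, fall back to the `taskkill` command with `/T` (tree) and `/F` (force). `spawnGroup` and `killTree` are hypothetical names, and this is not what Lerna or execa currently do.

```javascript
const { spawn, spawnSync } = require('node:child_process');

const isWindows = process.platform === 'win32';

// Hypothetical helper: start a command so that, on POSIX, it leads its own
// process group and its descendants inherit that group by default.
function spawnGroup(command, args = []) {
  return spawn(command, args, { stdio: 'inherit', detached: !isWindows });
}

// Hypothetical helper: stop the child and everything below it.
// Returns true if the kill request was delivered.
function killTree(child, signal = 'SIGTERM') {
  if (child.pid === undefined) return false; // the spawn itself failed

  if (isWindows) {
    // taskkill /T ends the process tree, /F forces termination.
    const result = spawnSync('taskkill', ['/pid', String(child.pid), '/T', '/F']);
    return result.status === 0;
  }

  try {
    process.kill(-child.pid, signal); // negative pid targets the process group
    return true;
  } catch (err) {
    if (err.code === 'ESRCH') return false; // group already gone
    throw err;
  }
}
```

Two caveats apply to this approach. A detached child no longer receives the terminal's Ctrl+C on its own, so the parent has to call `killTree` from its signal handlers. A descendant that creates its own process group or session will still escape.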

@ManiacDC

We just ran into this and have figured out the conditions for reproducing:

  1. Running on Windows (I launched lerna from Git Bash, I'm not sure whether this matters)
  2. An npm script that loops. I can reproduce with 'tsc -w'
  3. The script must be executed via a sh or bash shell, either by changing NPM's script-shell OR by changing the script to something like: "bash -c 'tsc -w'". This does not occur with the cmd shell that npm defaults to on Windows.
  4. The script must be executed via lerna, like: lerna run tscw --stream --scope my-package

This is a major roadblock to our team's work as we have people running on multiple platforms and are trying to standardize our npm scripts to all run in bash.

@ManiacDC

This is definitely an issue with execa. I was able to reproduce this with both 1.0.0 and 4.0.2 in a standalone program. It may or may not be related to the issue I linked earlier.

@ManiacDC

I have submitted a new execa issue for this:
sindresorhus/execa#433

When this is fixed, this will likely require that lerna update to a newer version of execa.

@ManiacDC

@evocateur please take a look at sindresorhus/execa#433 when you get a chance. Looks like there are some options for this.

@frags51

frags51 commented May 18, 2021

Is there some workaround for this in the meanwhile, based on sindresorhus/execa#433?

@ManiacDC

Is there some workaround for this in the meanwhile, based on sindresorhus/execa#433?

Our workaround was to NOT change NPM's script-shell, leaving it at the platform default. Then we change incompatible scripts on an as-needed basis using "bash -c '<script here>'" and pray we don't run into issues.

@justinmchase
Author

In the latest version of lerna I've been having success using WSL and in my npm start script I'm doing something like:

{
  "scripts": {
    "start": "./start.sh"
  }
}

Then in the start script I'm using a trap to kill child processes:

#!/bin/bash
# start.sh
onexit() {
  for p in "${pids[@]}" ; do
    kill "$p";
  done
}

# Lerna sends a signal which the trap receives
trap onexit EXIT
pids=()

# Run as child processes
example &
pids+=($!)
# ...

# Wait for all child processes to exit
wait

@frags51

frags51 commented May 19, 2021

I see. Was hoping there was some solution without WSL/Bash :\

@Ivan-Parushev

In the latest version of lerna I've been having success using WSL and in my npm start script I'm doing something like:

{
  "scripts": {
    "start": "./start.sh"
  }
}

Then in the start script I'm using a trap to kill child processes:

#!/bin/bash
# start.sh
onexit() {
  for p in "${pids[@]}" ; do
    kill "$p";
  done
}

# Lerna sends a signal which the trap receives
trap onexit EXIT
pids=()

# Run as child processes
example &
pids+=($!)
# ...

# Wait for all child processes to exit
wait

Can you please share the full script? I don't understand how you call the "lerna run --scope=packageName --parallel start" command.

@justinmchase
Author

Oh sorry, in the root directory I have a package.json there, something like:

{
  "scripts": {
    "start": "lerna run --parallel start"
  },
  "dependencies": {
    "lerna": "..."
  }
}

Then the standard lerna package structure:

package.json
packages/example/package.json
packages/example/start.sh

Also I refined this a little so you don't need to accumulate the pids manually with this:

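#!/bin/bash
# Kill every background job started by this script.
function onclose()
{
  for pid in $(jobs -p)
  do
      kill -9 "${pid}" 2>/dev/null || true
      wait "${pid}" 2>/dev/null
  done
}
trap "onclose" INT TERM EXIT

examplea &
exampleb &
examplec &

# Block until the background jobs finish. Without this the script exits
# immediately and the EXIT trap kills the jobs right away.
wait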

@justinmchase
Author

@frags51 Was hoping there was some solution without WSL/Bash

Yeah, sorry. There is probably a similar solution for non-bash environments, where you defer to a scripting language of your choice, trap exit events, and then close child processes explicitly. That's the idea.
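
For anyone who wants the same trap-and-kill idea without bash, here is a minimal sketch in Node.js, assuming only the built-in `child_process` module. `runAll` and its command-list format are hypothetical and chosen for this example; this is not how Lerna itself runs scripts.

```javascript
const { spawn } = require('node:child_process');

// Hypothetical launcher: start every command, and as soon as one of them
// exits (or the launcher receives a signal) stop all the others.
// Resolves, once every child has exited, with the exit code of the first
// child that exited.
function runAll(commands) {
  return new Promise((resolve) => {
    if (commands.length === 0) return resolve(0);

    const running = new Set();
    let firstCode = null;

    const killAll = () => {
      for (const child of running) child.kill(); // SIGTERM by default
    };

    const finish = (child, code) => {
      if (!running.delete(child)) return; // already accounted for
      if (firstCode === null) firstCode = code ?? 1; // null means killed by a signal
      killAll();
      if (running.size === 0) resolve(firstCode);
    };

    for (const [command, ...args] of commands) {
      const child = spawn(command, args, { stdio: 'inherit' });
      running.add(child);
      child.once('exit', (code) => finish(child, code));
      child.once('error', () => finish(child, 1)); // e.g. command not found
    }

    // Rough equivalent of `trap ... INT TERM` in the bash version.
    process.once('SIGINT', killAll);
    process.once('SIGTERM', killAll);
  });
}
```

A package's "start" script could then point at a small file that calls `runAll([[process.execPath, 'a.js'], [process.execPath, 'b.js']])` and sets `process.exitCode` from the result; the file names are placeholders. As with the bash trap, only direct children are killed, so commands that spawn their own long-lived descendants need extra handling.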

Here is how you would attach to the exiting event in Powershell for example:
https://stackoverflow.com/questions/2436510/powershell-profile-on-exit-event

Also, maybe this isn't the advice you're looking for, but I basically gave up on non-bash scripting environments and decided to always use bash from now on, even on Windows, and it's an enormously simplifying move. Easier said than done in old code bases, but I try to refactor them as soon as possible to get away from it. If you're reluctant and you're telling yourself "but bash sucks!"... Yes, you're right, it does suck :) But they all suck, really badly. It doesn't matter. What sucks the least is one sucky shell script rather than more than one. So I just gave in and picked the one that was the most portable.

@frags51

frags51 commented Aug 10, 2021

@frags51 Was hoping there was some solution without WSL/Bash

Ah :/ Thanks for the suggestion though.

@magalhas

magalhas commented Sep 6, 2021

Is this getting solved? Facing the same issue using Windows + MINGW64

@justinbhopper

@magalhas This is likely not getting solved. Lerna is an abandoned project.

@meitix

meitix commented Sep 11, 2021

I saw that the PR fixing this issue has been merged but not released. When is the next release? BTW, the last release was 7 months ago :|

@yinzara

yinzara commented Oct 7, 2021

I saw that the PR fixing this issue has been merged but not released. When is the next release? BTW, the last release was 7 months ago :|

We tested this version with our repository and it caused another issue, with the process continually respawning.

Downgrading to Lerna 3.22.1 seems to fix this issue. For now anyone who wants to use Lerna in a Windows environment should probably lock their dependency to that version.

@mscottx88

Confirmed. Downgrading to 3.22.1 resolved this issue regarding the hanging child processes. Thanks!

@enchorb

enchorb commented Dec 12, 2021

Was stuck on this for 4 hours. Express server kept giving EADDRINUSE errors since Lerna wasn't terminating the Node server child processes...3.22.1 downgrade has resolved this
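
A small diagnostic sketch, assuming Node's built-in `net` module, that may help confirm whether an orphaned process is still holding a port before restarting. `isPortInUse` is a hypothetical helper and the default host is an assumption; the result depends on which address the orphaned server was bound to.

```javascript
const net = require('node:net');

// Hypothetical helper: resolve to true if something is already listening
// on `port` for the given host, false if the port could be bound.
function isPortInUse(port, host = '127.0.0.1') {
  return new Promise((resolve, reject) => {
    const probe = net.createServer();
    probe.once('error', (err) => {
      if (err.code === 'EADDRINUSE') resolve(true);
      else reject(err); // e.g. EACCES for privileged ports
    });
    probe.once('listening', () => {
      // The port was free: release it again before reporting.
      probe.close(() => resolve(false));
    });
    probe.listen(port, host);
  });
}
```

If `await isPortInUse(3000)` reports true right after Lerna has exited, a leftover child is the likely culprit; 3000 is only an example port.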

@abarke

abarke commented Jan 6, 2022

I saw that the PR fixing this issue has been merged but not released. When is the next release? BTW, the last release was 7 months ago :|

We tested this version with our repository and it caused another issue, with the process continually respawning.

Downgrading to Lerna 3.22.1 seems to fix this issue. For now anyone who wants to use Lerna in a Windows environment should probably lock their dependency to that version.

I'm using lerna run --parallel dev to run a local Node.js dev server and vite in Git Bash on Windows. The problem for me was that vite was never killed and continued to block the TCP port.

Reverting to 3.22.1 also fixed it for me. Shame this is not fixed. What are the alternatives to Lerna if it's no longer maintained?

@lgh06

lgh06 commented Jan 16, 2022

same issue here, windows 10, lerna 4.0.0, parallel.

@fahrulseptiana

any update? facing the issue on windows 11 with lerna 4.0.0

@YarekTyshchenko

YarekTyshchenko commented Mar 16, 2022

Yep, replicated with Windows 11 and lerna v4.0.0. Just with lerna run <task>

@justinmchase
Author

I'm guessing this repo has gone dark essentially in favor of the built-in "workspaces" feature in npm now:

https://docs.npmjs.com/cli/v8/using-npm/workspaces

@jonathanbyrne

Same here, I'm also having this issue with Windows 10 or 11 and Lerna v4.0.0.

@jgramling01

jgramling01 commented May 13, 2022

Same here, considering Lerna is changing hands, maybe we can revive this issue @nrwl ?

@JamesHenry
Member

Thanks all!

This should hopefully be fixed by @feryardiant's change, applied in #3156

@justinmchase
Author

We waited literally 3 years for that 1 line fix 🤣
