Hanging Chromium processes #752

Open
aymericbeaumet opened this issue Feb 12, 2021 · 11 comments

@aymericbeaumet

What versions are you running?

$ go list -m github.com/chromedp/chromedp
github.com/chromedp/chromedp v0.6.5
$ chromium --version
Chromium 89.0.4388.0
$ go version
go version go1.15.7 darwin/amd64

What did you do? Include clear steps.

I'm running this simple program:

package main

import (
	"context"

	"github.com/chromedp/chromedp"
)

func main() {
	ctx, cancel := chromedp.NewContext(context.Background())
	defer cancel()

	if err := chromedp.Run(
		ctx,
		chromedp.Navigate("https://github.com"),
	); err != nil {
		panic(err)
	}
}

What did you expect to see?

The Chromium processes should be killed when the Go process stops.

What did you see instead?

The number of Chromium processes grows after each time I run the Go program.

@mkalus

mkalus commented Feb 16, 2021

I can second this behavior. I have written a screenshot service based on chromedp (https://github.com/mkalus/goggler) and experience the same problem when running the Docker image.

Try the following:

docker run -d --rm -p8080:8080 --name goggler ronix/goggler
# before:
docker exec goggler ps -Af
# get image
curl -o /dev/null http://localhost:8080/?url=https://www.google.com/
# after
docker exec goggler ps -Af

The output of the last command will look something like this:

UID          PID    PPID  C STIME TTY          TIME CMD
root           1       0  0 13:58 ?        00:00:00 /opt/google/chrome/goggler
root          28       1  0 13:59 ?        00:00:00 [cat] <defunct>
root          29       1  0 13:59 ?        00:00:00 [cat] <defunct>
root          31       1  0 13:59 ?        00:00:00 [chrome] <defunct>
root          32       1  0 13:59 ?        00:00:00 [chrome] <defunct>
root          44       1  0 13:59 ?        00:00:00 [chrome] <defunct>
root          50       1  0 13:59 ?        00:00:00 [chrome] <defunct>
root          72       0  0 14:01 ?        00:00:00 ps -Af

I cannot get rid of the zombies without killing the parent process, which of course shuts down the container.

@aymericbeaumet
Author

@mkalus I've written a blog post explaining how to mitigate the issue: https://aymericbeaumet.com/prevent-chromedp-chromium-zombie-processes-from-stacking.

TLDR

package main

import (
	"context"
	"log"
	"os/exec"

	"github.com/chromedp/chromedp"
)

func main() {
	ctx, cancel := chromedp.NewContext(context.Background())
	defer func() {
		cancel()
		// Prevent Chromium processes from hanging
		if _, err := exec.Command("pkill", "-g", "0", "Chromium").Output(); err != nil {
			log.Println("[warn] Failed to kill Chromium processes", err)
		}
	}()

	// ...
}

@mkalus

mkalus commented Feb 17, 2021

I have read your blog post and tried your code, but in my case (within the Docker container), the zombie processes cannot be killed without killing the main process (PID 1). Moreover, simply killing every Chromium process would do harm, since multiple goroutines might have spawned Chromium processes and killing them all would lead to errors.

As a consequence, I am looking for another solution, hopefully one which can be done in the code.

@mkalus

mkalus commented Feb 17, 2021

Thanks to you @aymericbeaumet, I have had another look and found a solution described in https://github.com/chromedp/docker-headless-shell

I need to initialize my container using dumb-init or tini to get rid of zombie processes. Thanks for pushing me to think again ;-)
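
For readers wondering what an init such as dumb-init or tini does about the `<defunct>` entries shown earlier, here is a minimal sketch of the reaping step in Go, using only the standard library. It is not taken from this thread or from chromedp: `reapZombies` and `spawnAndReap` are hypothetical names, and the example assumes a Unix-like system with a `sleep` binary on the PATH.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
	"syscall"
	"time"
)

// reapZombies collects the exit status of every child that has already
// exited, without blocking, and returns how many children were reaped.
func reapZombies() int {
	reaped := 0
	for {
		var status syscall.WaitStatus
		pid, err := syscall.Wait4(-1, &status, syscall.WNOHANG, nil)
		if err == syscall.EINTR {
			continue // interrupted by a signal, try again
		}
		if err != nil || pid <= 0 {
			// pid == 0: children exist but none has exited yet.
			// ECHILD: this process has no children at all.
			return reaped
		}
		reaped++
	}
}

// spawnAndReap starts a short-lived child, deliberately never calls
// cmd.Wait on it, and polls reapZombies until the zombie is collected.
func spawnAndReap() (int, error) {
	cmd := exec.Command("sleep", "0")
	if err := cmd.Start(); err != nil {
		return 0, err
	}
	deadline := time.Now().Add(2 * time.Second)
	for time.Now().Before(deadline) {
		if n := reapZombies(); n > 0 {
			return n, nil
		}
		time.Sleep(10 * time.Millisecond)
	}
	return 0, errors.New("child was not reaped in time")
}

func main() {
	n, err := spawnAndReap()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("reaped children:", n)
}
```

One caveat with doing this inside the application itself: a catch-all `Wait4(-1, ...)` can collect the exit status of a child that another part of the program is about to `Wait` on, which then makes that `Wait` fail. That is a good reason to prefer a dedicated init process, as described above.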

@ghost

ghost commented Mar 21, 2021

Issue #774 should've been given as a comment here.

fabio42 pushed a commit to fabio42/homebrew-cask that referenced this issue Apr 30, 2021
This simple shell wrapper is provided to execute chromium from the command line.
The way it is defined, chromium will have the PID of `/bin/sh` as its PPID:

```
  501 77821  1088   0  6:24pm ttys028    0:01.53 -zsh
  501 78177 77821   0  6:24pm ttys028    0:00.01 /bin/sh /usr/local/bin/chromium
  501 78181 78177   0  6:24pm ttys028    0:01.83 /Applications/Chromium.app/Contents/MacOS/Chromium
```

I'm using chromedp to interact with `chromium/Chrome`, and this
behaviour triggers some issues. When the execution context is done in
the library, it triggers the termination of the shell process instead of the chromium process and
chromium is left as an orphan process on the system.

There is an open issue documenting this behaviour [here](chromedp/chromedp#752)

Using `exec` replaces the shell process with the `chromium` process, which
provides the expected behaviour when called from such libraries; the user
experience when calling `chromium` from the shell will not change.

```
  501 77821  1088   0  6:24pm ttys028    0:01.61 -zsh
  501 93092 77821   0  6:26pm ttys028    0:01.40 /Applications/Chromium.app/Contents/MacOS/Chromium
```

Thank you.
@fabio42
Contributor

fabio42 commented May 1, 2021

So I ran into the same issue today. I didn't really like the pkill approach for my own use case, so I started looking into the code, specifically at how chromedp discovers the browser's executable path.

While looking at this, I realized that on OSX, when you install chromium through brew, chromium's cask also deploys a small wrapper at /usr/local/bin/chromium. It is this wrapper that chromedp's findExecPath function discovers.
The thing is, the way the wrapper is written, it starts a shell with chromium as a child process. When the context is done, the shell is terminated, leaving an orphaned chromium process on the system.

I just submitted this PR to the brew project, which fixes this behaviour, and I hope it will be accepted, as it solves the issue at its source.

If you are running into this issue with the same setup, this is likely the root cause and you don't have to use pkill; the problem is in neither chromedp nor chromium.

Until then, I ended up implementing my own findExecPath and feeding the browser path in at context creation. I'm wondering what the project thinks about this and whether something like this should be implemented in the main library:

func findExecPath() string {
	var p []string
	switch runtime.GOOS {
	case "darwin":
		p = []string{
			// Mac
			"/Applications/Chromium.app/Contents/MacOS/Chromium",
			"/Applications/Google Chrome.app/Contents/MacOS/Google Chrome",
		}
	case "windows":
		p = []string{
			// Windows
			"chrome",
			"chrome.exe", // in case PATHEXT is misconfigured
			`C:\Program Files (x86)\Google\Chrome\Application\chrome.exe`,
			`C:\Program Files\Google\Chrome\Application\chrome.exe`,
			filepath.Join(os.Getenv("USERPROFILE"), `AppData\Local\Google\Chrome\Application\chrome.exe`),
		}
	default:
		p = []string{
			// Unix-like
			"headless_shell",
			"headless-shell",
			"chromium",
			"chromium-browser",
			"google-chrome",
			"google-chrome-stable",
			"google-chrome-beta",
			"google-chrome-unstable",
			"/usr/bin/google-chrome",
		}
	}

	for _, path := range p {
		found, err := exec.LookPath(path)
		if err == nil {
			return found
		}
	}
	// Fall back to something simple and sensible, to give a useful error
	// message.
	return "google-chrome"
}

I can provide a PR if that's something you would be interested in.
Thanks for this fantastic library !
Cheers
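
As a complement to the diagnosis above, the following is a small, hedged sketch of how one might check whether the path that `exec.LookPath` resolves for `chromium` is a script wrapper rather than the browser binary itself. The helpers `hasShebang` and `isScriptWrapper` are hypothetical names introduced for illustration; they are not part of chromedp, and a leading `#!` is only a heuristic for "this is a shell wrapper".

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
	"os/exec"
)

// hasShebang reports whether a file header starts with "#!", which marks an
// interpreted script rather than a native executable.
func hasShebang(header []byte) bool {
	return bytes.HasPrefix(header, []byte("#!"))
}

// isScriptWrapper reads the first two bytes of the file at path and reports
// whether it looks like a script. Files shorter than two bytes are not
// considered scripts.
func isScriptWrapper(path string) (bool, error) {
	f, err := os.Open(path)
	if err != nil {
		return false, err
	}
	defer f.Close()

	header := make([]byte, 2)
	if _, err := io.ReadFull(f, header); err != nil {
		if err == io.EOF || err == io.ErrUnexpectedEOF {
			return false, nil
		}
		return false, err
	}
	return hasShebang(header), nil
}

func main() {
	found, err := exec.LookPath("chromium")
	if err != nil {
		fmt.Println("chromium not found on PATH:", err)
		return
	}
	wrapper, err := isScriptWrapper(found)
	if err != nil {
		fmt.Println("could not inspect", found+":", err)
		return
	}
	fmt.Printf("%s is a script wrapper: %v\n", found, wrapper)
}
```

If the result is true, passing the real binary path explicitly (for example with chromedp.ExecPath, mentioned later in this thread) avoids launching the browser through the wrapper.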

@ZekeLu
Member

ZekeLu commented May 1, 2021

@fabio42 Good job! A PR is always welcome! Even if homebrew fixes its issue, this change will reduce the number of calls to exec.LookPath. The concern is that it may break some use cases (for example, a Windows user who only has chromium installed). But I think it's okay since a user can always specify the browser path with chromedp.ExecPath.

And please note that this is just one of the reasons that browser processes do not terminate. Its root cause is different from that of the zombies in a container.

Thank you!

fabio42 added a commit to fabio42/chromedp that referenced this issue May 3, 2021
This change provides a slightly more efficient way of
discovering the `chrome` binary path. It also solves
a discovery issue on `darwin` OS when `chromium` is
installed through `homebrew` chromedp#752.
@fabio42
Contributor

fabio42 commented May 3, 2021

Thank you for your feedback @ZekeLu. Just opened a PR #811 for this change.

I agree that the container issue does look different.
My understanding is that this behavior is expected: by default, the container runs the ENTRYPOINT/CMD as its init process (https://docs.docker.com/config/containers/multi-service_container/).
As @mkalus mentioned, the simplest way to address it is to use the --init option provided by Docker.

miccal added a commit to Homebrew/homebrew-cask that referenced this issue May 4, 2021
#104850)

* When invoked from command line use exec to replace shell


* Update chromium.rb

Co-authored-by: Fabrice Bessettes <fbessett@cisco.com>
Co-authored-by: Miccal Matthews <miccal.matthews@gmail.com>
ZekeLu pushed a commit that referenced this issue May 4, 2021
…811)

This change provides a slightly more efficient way of
discovering the `chrome` binary path. It also solves
a discovery issue on `darwin` OS when `chromium` is
installed through `homebrew` #752.
@gulien

gulien commented Aug 5, 2023

Hello,

Some users have reported memory leaks due to accumulating Chrome processes.

While I'm not entirely certain about the root cause, I'm hoping someone can shed some light on this.

Context: Linux container, Chrome v115.x, chromedp v0.9.1, amd64 platform.

  1. Zombies:

    We use tini as PID 1 to reap zombie processes. Furthermore, chromedp waits for the command to complete, so seeing zombie processes is strange.

    However, and please correct me if I'm mistaken, I believe there might be an exception when the context is done. In that case, we might not wait for the command to finish (that is, to be killed), leaving processes hanging indefinitely.

    For instance, adding cmd.Wait here could do the trick:

    select {
    case <-ctx.Done():
    	// TODO: do we care about this error in any scenario? if the
    	// user cancelled the context and killed chrome, this will most
    	// likely just be "signal: killed", which isn't interesting.
    	go cmd.Wait()

    	return nil, ctx.Err()
    case <-c.allocated: // for this browser's root context
    }

  2. Hanging Processes:

    This might correlate with the previous point. Some users have observed hanging processes that aren't zombies.
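
The concern in point 1 can be illustrated in isolation with a minimal, standard-library-only sketch. It is not chromedp's code: `killAndReap` is a hypothetical helper, and a long-running `sleep` stands in for the browser (assuming a Unix-like system with `sleep` on the PATH). Cancelling the context gets the child killed, but its exit status is only collected once `Wait` is called; until then the kernel keeps the process entry around as a zombie.

```go
package main

import (
	"context"
	"fmt"
	"os/exec"
)

// killAndReap starts a long-running child bound to a context, cancels the
// context so that os/exec kills the child, and then calls Wait so the
// child's exit status is collected. It reports whether the child was
// reaped and ended unsuccessfully, as expected for a killed process.
func killAndReap() (bool, error) {
	ctx, cancel := context.WithCancel(context.Background())
	cmd := exec.CommandContext(ctx, "sleep", "60")
	if err := cmd.Start(); err != nil {
		cancel()
		return false, err
	}

	// Cancelling the context makes os/exec kill the child process.
	cancel()

	// Without this call the dead child would stay in the process table
	// as a zombie for as long as the parent keeps running.
	waitErr := cmd.Wait()

	// ProcessState is only populated once Wait has collected the status.
	reaped := cmd.ProcessState != nil
	return reaped && waitErr != nil, nil
}

func main() {
	ok, err := killAndReap()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("child killed and reaped:", ok)
}
```

Whether the wait happens inline or in a goroutine, as in the snippet quoted above, is a design choice: a goroutine keeps the cancellation path from blocking, at the cost of not observing the wait error.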

On a side note, I'm questioning whether cmd.SysProcAttr.Pdeathsig = syscall.SIGKILL is sufficient in a Linux setting. For context, an (older) blog post: https://medium.com/@felixge/killing-a-child-process-and-all-of-its-children-in-go-54079af94773 suggests an alternative approach:

 cmd := exec.Command("/bin/sh", "-c", "watch date > date.txt")
 cmd.SysProcAttr = &syscall.SysProcAttr{Setpgid: true}
 // ...
 syscall.Kill(-cmd.Process.Pid, syscall.SIGKILL)

In both scenarios, chrome_crashpad and cat processes are the culprits.
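
For reference, the process-group approach quoted from the blog post can be expanded into a self-contained sketch. This is an illustration under assumptions, not something chromedp is known to do: `killProcessGroup` is a hypothetical helper, the shell command merely simulates a parent with a background child, and the code is specific to Unix-like systems.

```go
package main

import (
	"fmt"
	"os/exec"
	"syscall"
	"time"
)

// killProcessGroup starts a shell that spawns a background child, places the
// shell in its own process group, and then kills the whole group at once.
// It reports whether the shell was terminated by SIGKILL.
func killProcessGroup() (bool, error) {
	cmd := exec.Command("/bin/sh", "-c", "sleep 60 & sleep 60")
	// Setpgid puts the child in a new process group whose ID equals its PID.
	cmd.SysProcAttr = &syscall.SysProcAttr{Setpgid: true}
	if err := cmd.Start(); err != nil {
		return false, err
	}

	// Give the shell a moment to fork its background child.
	time.Sleep(100 * time.Millisecond)

	// A negative PID addresses every process in that process group.
	if err := syscall.Kill(-cmd.Process.Pid, syscall.SIGKILL); err != nil {
		return false, err
	}

	// Wait reaps the shell; the error is expected to be "signal: killed".
	_ = cmd.Wait()

	status, ok := cmd.ProcessState.Sys().(syscall.WaitStatus)
	return ok && status.Signaled() && status.Signal() == syscall.SIGKILL, nil
}

func main() {
	killed, err := killProcessGroup()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("shell killed by SIGKILL:", killed)
}
```

Note that only the direct child is reaped by `cmd.Wait` here; the background grandchild is killed too, but it is reaped by whatever process it gets reparented to, which is one more reason a proper init matters in a container.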

@ZekeLu
Member

ZekeLu commented Aug 8, 2023

Hi @gulien, as of now, starting and closing browser instances frequently has some known issues:

It also consumes more resources compared to opening and closing a tab in an existing browser instance.

So, for now, I would recommend using a single browser instance. chromedp.NewContext shows how to use a single browser instance for multiple tasks.

@gulien

gulien commented Aug 10, 2023

Thanks @ZekeLu! The recommended way used to be the other way around lol
