Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Return proper exit code for TERM signal, JRuby #1558

Closed
perlun opened this issue Apr 9, 2018 · 14 comments
Closed

Return proper exit code for TERM signal, JRuby #1558

perlun opened this issue Apr 9, 2018 · 14 comments
Labels

Comments

@perlun
Copy link
Contributor

perlun commented Apr 9, 2018

With #1337 merged and running a recent version of JRuby, I'm seeing rather strange behavior re. the process exit codes. All of the examples below are running on macOS.

I am using a trivial config.ru to illustrate the problem (curl http://localhost:9292 will trigger the termination):

run Proc.new { |env|
  Process.kill('TERM', Process.pid)

  ['200', {'Content-Type' => 'text/html'}, ['Terminating\'d']]
}

Ruby 2.5.1 (correct behavior)

On MRI (2.5.1), it works fine:

$ bundle exec rackup
Puma starting in single mode...
* Version 3.11.3 (ruby 2.5.1-p57), codename: Love Song
* Min threads: 0, max threads: 16
* Environment: development
* Listening on tcp://localhost:9292
Use Ctrl-C to stop
- Gracefully stopping, waiting for requests to finish
::1 - - [09/Apr/2018:08:06:18 +0300] "GET / HTTP/1.1" 200 - 0.0009
=== puma shutdown: 2018-04-09 08:06:18 +0300 ===
- Goodbye!
Terminated: 15
$ echo $?
143

143 = 128 + 15, which is the correct value.

JRuby 9.1.16.0 (incorrect behavior)

However, unlike what @dekellum was experiencing in #1337, I am seeing incorrect behavior on JRuby:

$ java -version
java version "1.8.0_152"
Java(TM) SE Runtime Environment (build 1.8.0_152-b16)
Java HotSpot(TM) 64-Bit Server VM (build 25.152-b16, mixed mode)
$ bundle exec rackup
Puma starting in single mode...
* Version 3.11.3 (jruby 9.1.16.0 - ruby 2.3.3), codename: Love Song
* Min threads: 0, max threads: 16
* Environment: development
* Listening on tcp://localhost:9292
Use Ctrl-C to stop
- Gracefully stopping, waiting for requests to finish
0:0:0:0:0:0:0:1 - - [09/Apr/2018:08:14:53 +0300] "GET / HTTP/1.1" 200 - 0.0127
- Gracefully stopping, waiting for requests to finish
=== puma shutdown: 2018-04-09 08:14:53 +0300 ===
- Goodbye!
=== puma shutdown: 2018-04-09 08:14:53 +0300 ===
- Goodbye!
SignalException: SIGTERM
  block in setup_signals at /Users/plundberg/.rvm/gems/jruby-9.1.16.0/gems/puma-3.11.3-java/lib/puma/launcher.rb:397
$ echo $?
1

Exit code 1 indicates "failure", and does not seem right in this case.

It's the same on Java 9:

$ echo $JAVA_OPTS
--add-opens java.base/java.util.zip=ALL-UNNAMED --add-opens java.base/java.security.cert=ALL-UNNAMED --add-opens java.base/java.security=ALL-UNNAMED --add-opens java.base/java.io=ALL-UNNAMED
$ java -version
java version "9.0.4"
Java(TM) SE Runtime Environment (build 9.0.4+11)
Java HotSpot(TM) 64-Bit Server VM (build 9.0.4+11, mixed mode)
$ bundle exec rackup
Puma starting in single mode...
* Version 3.11.3 (jruby 9.1.16.0 - ruby 2.3.3), codename: Love Song
* Min threads: 0, max threads: 16
* Environment: development
* Listening on tcp://localhost:9292
Use Ctrl-C to stop
- Gracefully stopping, waiting for requests to finish
0:0:0:0:0:0:0:1 - - [09/Apr/2018:08:15:54 +0300] "GET / HTTP/1.1" 200 - 0.0145
- Gracefully stopping, waiting for requests to finish
=== puma shutdown: 2018-04-09 08:15:54 +0300 ===
- Goodbye!
=== puma shutdown: 2018-04-09 08:15:54 +0300 ===
- Goodbye!
WARNING: An illegal reflective access operation has occurred
WARNING: Illegal reflective access by jnr.posix.JavaLibCHelper$ReflectiveAccess to method sun.nio.ch.SelChImpl.getFD()
WARNING: Please consider reporting this to the maintainers of jnr.posix.JavaLibCHelper$ReflectiveAccess
WARNING: Use --illegal-access=warn to enable warnings of further illegal reflective access operations
WARNING: All illegal access operations will be denied in a future release
SignalException: SIGTERM
  block in setup_signals at /Users/plundberg/.rvm/gems/jruby-9.1.16.0/gems/puma-3.11.3-java/lib/puma/launcher.rb:397
$ echo $?
1

Am I doing something wrong, and if so, what? 😄 Also, if you feel this is a JRuby/JVM issue let me know, but it's strange since it seemed like we had things working at around the time #1337 was merged...

@dekellum
Copy link
Contributor

Thanks for reporting this and the cc. I am also able to reproduce the issue with an external (systemd oriented) test harness, on branch and the versions in the Gemfile.lock shown:

https://github.com/dekellum/lionfish/blob/puma-jruby/Gemfile.lock

The issue anecdotally seems more common on jruby-9.1.16, but at least with my test harness, the problem is intermittent: only sometimes do I see the same stack trace and the error code 1 (which in my case, gets logged as a failure in systemd.)

I can reproduce the issue on jruby 9.1.14 and 9.1.12 as well (though it seems less frequent). My test harness is with puma 3.10.0, so this isn't a very recent puma regression. I'm not sure how or if I missed this issue in my prior testing around #1337. I guess another possible change is JDK version? I see you are using Java 9. I'm now testing on a recent version of Open JDK 8:

openjdk version "1.8.0_162"
OpenJDK Runtime Environment (build 1.8.0_162-b12)
OpenJDK 64-Bit Server VM (build 25.162-b12, mixed mode)

If you want to pursue this, I would suggest attempting to reduce it to a simple non-puma test ruby script that exits by explicitly raised SignalException:

raise SignalException, "SIGTERM"

If you can get that to fail in the same way on jruby, then you could report the issue there.

@dekellum
Copy link
Contributor

Sadly, proper exit code isn't tested on jruby, presumably due to lack of fork support. Couldn't this be done instead with a thread in the jruby case? cc: @shayonj

https://github.com/puma/puma/blame/c87bbffdb037e379844fae76a775f15e6c273110/test/test_integration.rb#L242

@perlun
Copy link
Contributor Author

perlun commented Apr 11, 2018

The issue anecdotally seems more common on jruby-9.1.16, but at least with my test harness, the problem is intermittent: only sometimes do I see the same stack trace and the error code 1 (which in my case, gets logged as a failure in systemd.)

That's actually exactly what I'm seeing as well - intermittent CI failures in my specs! Interesting. With the rackup use case mentioned above though, it fails every single time for me (I tested 5 times now, and the exit code is 1 every single time.)

I can reproduce the issue on jruby 9.1.14 and 9.1.12 as well (though it seems less frequent). My test harness is with puma 3.10.0, so this isn't a very recent puma regression. I'm not sure how or if I missed this issue in my prior testing around #1337. I guess another possible change is JDK version? I see you are using Java 9. I'm now testing on a recent version of Open JDK 8:

Actually, as mentioned above, I did test it with both OpenJDK 8 and 9. The "five times" mentioned now is with Java 8.

If you want to pursue this, I would suggest attempting to reduce it to a simple non-puma test ruby script that exits by explicitly raised SignalException:

raise SignalException, "SIGTERM"

If you can get that to fail in the same way on jruby, then you could report the issue there.

Interestingly enough, it does indeed fail for me consistently. Even with a 1.7 JRuby, it does.

If you manage to add tests for it here, that's great - I think I'll file a JRuby issue about this as well, since it does indeed seem to be an upstream issue. Thanks for your feedback thus far!

@nateberkopec
Copy link
Member

@perlun Am I correct in understanding that this was fixed by jruby/jruby#5134?

@perlun
Copy link
Contributor Author

perlun commented May 18, 2018

@nateberkopec Thanks for asking. I think it might not be, since I've unfortunately still have seen intermittent CI failures. Will try to revisit this next week to see how it behaves w/ latest JRuby and keep you posted.

@anup1710
Copy link

@perlun - Do we have any plan to fix this in near future ?

@perlun
Copy link
Contributor Author

perlun commented May 16, 2019

@anup1710 No longer using JRuby or Puma unfortunately (switched jobs) so for me personally, don't know of any plans to fix this. Is the issue causing problems for you currently?

@anup1710
Copy link

Yes, I saw this in jruby 9.2.6 with puma 3.12.0 in Windows machine.

@perlun
Copy link
Contributor Author

perlun commented May 29, 2019

@nateberkopec See above - the issue still seems to remain, unfortunately.

@nateberkopec
Copy link
Member

I think this is failing on CI here: https://travis-ci.org/puma/puma/jobs/584683964#L808

Local output for me:

$ bin/puma test/rackup/hello.ru 
Puma starting in single mode...
* Version 4.1.1 (jruby 9.2.7.0 - ruby 2.5.3), codename: Fourth and One
* Min threads: 0, max threads: 16
* Environment: development
* Listening on tcp://[::]:9292
Use Ctrl-C to stop
- Gracefully stopping, waiting for requests to finish
- Gracefully stopping, waiting for requests to finish
=== puma shutdown: 2019-09-14 12:07:16 +0200 ===
- Goodbye!
=== puma shutdown: 2019-09-14 12:07:16 +0200 ===
- Goodbye!
$ echo $?
0

@nateberkopec
Copy link
Member

A clue - that double-log at the end. It looks like something is calling stop_blocked twice. However, when I add a puts caller to stop_blocked:

def stop_blocked
      puts caller
      log "- Gracefully stopping, waiting for requests to finish"
      @control.stop(true) if @control
      @server.stop(true) if @server
    end

A race condition somewhere changes, and the shutdown text prints once and the correct exit code (143) is returned.

@nateberkopec
Copy link
Member

I'm guessing that the server status gets set to :stop, and the server run thread and the sigterm handler race each other.

@nateberkopec
Copy link
Member

Actually, I think there's a problem with the test that's unrelated, but it passes, and yet Jruby is actually exiting with status 0 for me locally.

@nateberkopec
Copy link
Member

OK, I fixed + discovered some unrelated JRuby stuff along the way, but I also discovered a 0 exit code from SIGTERM is the JVM's fault, not ours or JRuby's. See jruby/jruby#5224. Closing as invalid/wontfix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants