logstash v1.5.4 misleading error message when heapsize is too small #3817

Closed
TinLe opened this issue Aug 28, 2015 · 20 comments

@TinLe commented Aug 28, 2015

If Logstash is started with too small a heap, it barfs with a misleading error message. Fortunately, I had changed just one value, so I knew exactly what broke.

I had LS_HEAP_SIZE=2000m and lowered it to 1000m. When Logstash started, the log contained:

Exception in thread "|worker" Exception in thread "<elasticsearch" java.lang.UnsupportedOperationException
at java.lang.Thread.stop(Thread.java:869)
at org.jruby.RubyThread.exceptionRaised(RubyThread.java:1221)
at org.jruby.internal.runtime.RubyRunnable.run(RubyRunnable.java:112)
at java.lang.Thread.run(Thread.java:745)
java.lang.UnsupportedOperationException
at java.lang.Thread.stop(Thread.java:869)
at org.jruby.RubyThread.exceptionRaised(RubyThread.java:1221)
at org.jruby.internal.runtime.RubyRunnable.run(RubyRunnable.java:112)
at java.lang.Thread.run(Thread.java:745)

Changing LS_HEAP_SIZE to 1500m allows it to run again, so evidently I lowered it too much.
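For reference, the heap is set via the LS_HEAP_SIZE environment variable when launching Logstash, e.g. (the config path here is illustrative):

LS_HEAP_SIZE=1500m bin/logstash -f /etc/logstash/es1.conf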

@suyograo added the bug label Aug 28, 2015
@suyograo (Contributor)

The error message is so bizarre. Not sure why it does not throw an OOM exception.

Related #3400

@TinLe (Author) commented Aug 28, 2015

The config is pretty simple too.

input {
  elasticsearch {
    hosts => [ "lva1-app12321", "lva1-app12330", "lva1-app12340", "lva1-app12344" ]
    port => "9200"
    index => "es1_alias"
    size => 500
    scroll => "1m"
    scan => true
  }
}

filter {
  mutate {
    remove_field => ["command"]
    rename => ["host", "logstash_node"]
    gsub => [
      "http_client_ip", "-", "127.0.0.1"
    ]
  }

  date { match => [ "timestamp", "EEE MMM dd hh:mm:ss zzz YYYY", "ISO8601", "UNIX_MS" ] }

  metrics {
    meter => [ "type" ]
    add_tag => "metric"
    clear_interval => 3600  # clear metrics every 60m
    flush_interval => 60    # flush metrics every 1m
  }
}

output {
  if "metric" in [tags] {
    stdout { codec => json }
  } else {
    elasticsearch {
      host => "localhost"
      port => "9200"
      protocol => "http"
      index => "es1-%{+YYYY.MM.dd}"
    }
  }
}

@suyograo (Contributor)

@TinLe it depends on the size of your events, because the default bulk size from LS to ES is 5K! That may work when your logs are 300 bytes a pop (about 1.5 MB per bulk request), but it will quickly blow up when they're bigger. A defensive step we are taking is to reduce the default from 5K to 500 in the next release.

@TinLe You can try going back to 1 GB heap, but reduce the flush_size to 1K or something. See https://www.elastic.co/guide/en/logstash/current/plugins-outputs-elasticsearch.html#plugins-outputs-elasticsearch-flush_size
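For example, a minimal sketch of that change against the output block above (1000 is just a starting point to tune from):

output {
  elasticsearch {
    host => "localhost"
    port => "9200"
    protocol => "http"
    index => "es1-%{+YYYY.MM.dd}"
    flush_size => 1000  # cap each bulk request at 1,000 events instead of the 5,000 default
  }
}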

@TinLe (Author) commented Aug 28, 2015

@suyograo Yes, I am still getting that exception in the log, and then it seems to stop consuming, although the Logstash process is still around.

I tried a 1GB heap, no good. Reduced flush_size to 1K, no good. Increased heap to 2GB, still the exception. Now running with a 2GB heap AND flush_size reduced to 100. So far it has been stable for the last hour or so.

@suyograo (Contributor)

@TinLe interesting, what's the average size of your events/docs?

@TinLe (Author) commented Aug 28, 2015

Approx 4KB.

@suyograo (Contributor)

@TinLe another point that @talevy pointed out: you are using the elasticsearch input -- we just fixed a bug in that plugin to reduce its memory usage: logstash-plugins/logstash-input-elasticsearch#21

We'll have a version with that patch published soon. It should help.

@suyograo (Contributor)

@TinLe would you be willing to test this patch? logstash-input-elasticsearch v1.0.1 has been released with the memory consumption fix.

bin/plugin update logstash-input-elasticsearch should do it.

@TinLe (Author) commented Aug 29, 2015

Yep. That seems to have fixed it. I tested with 2GB and a flush_size of 1000; it worked well for 10 minutes. Then I tried again with 1.5GB and a flush_size of 1000, and it is still running. Memory usage seems to be lower too.

:-)

@suyograo (Contributor)

yay! nice work @talevy ! 🏆

@suyograo (Contributor)

thanks @TinLe for trying it out!

@guyboertje (Contributor)

While the issue is solved for @TinLe, there is still the unsatisfactory error message for the conditions described above.

@talevy (Contributor) commented Sep 9, 2015

@guyboertje do you have any ideas on how to catch this situation within jruby and handle it appropriately? I agree, we should catch this and supply users with a better message.

@suyograo (Contributor)

+100 on providing better feedback to users when OOM happens.

@guyboertje (Contributor)

This stackoverflow question and this JRuby issue point to the problem.

With JRuby 1.7.22 and above this particular error should not occur.

But that raises the next question: what is the exception raised in the thread that causes it to stop?

Probably java.lang.OutOfMemoryError in that thread, but which JVM memory region is out of memory? We will only know if we rescue and log it (will that even work if we are OOM?).

However, according to this JRuby issue, a bare rescue => e in JRuby does not rescue java.lang.OutOfMemoryError, and the Java docs indicate that it should not be caught. So it propagates up to the thread.
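To illustrate (a minimal JRuby-only sketch; whether the logging call would even succeed under real heap exhaustion is exactly the open question above):

require 'java'  # JRuby only

begin
  strings = []
  loop { strings << ("x" * 10_000_000) }  # deliberately exhaust the heap
rescue => e
  # Never reached for an OOM: a bare rescue only matches StandardError.
  puts "plain rescue caught #{e.class}"
rescue Java::JavaLang::OutOfMemoryError => e
  # JRuby does let us name the Java error class explicitly.
  warn "caught OOM: #{e.message}"
end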

[This stackoverflow question](http://stackoverflow.com/questions/2679330/catching-java-lang-outofmemoryerror), and this answer in particular, provides more insight.

Epic research even if I say so myself!

@jstangroome

@guyboertje

With JRuby 1.7.22 and above this particular error should not occur.

Are we likely to see an LS 1.5.5 release updated from JRuby 1.7.20 to 1.7.22, or will we need to move to LS 2.0?

@guyboertje (Contributor)

@jstangroome - LS v1.5.4 does have JRuby 1.7.22. I was (badly) stating my surprise that the original poster got this UnsupportedOperationException in LS 1.5.4 (1.7.22). I need to create an OOM-generating input, codec, filter, and output and test how we respond to those conditions.

We cannot guarantee that we will back-port to 1.5.5 any improvements made in 2.0 in this regard.

@ph see this line from the JRuby specs,
and this line and line 1485.
Because the second link shows that the raise is done on the main thread, and that what is raised could be a Ruby Exception class or a wrapped Java Error object (i.e. not a subclass of Ruby's Exception but an IRubyObject), maybe we should rescue Object => e.

WDYT?
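For concreteness, a hypothetical sketch of that suggestion (run_guarded and the simulated error are mine, not Logstash code):

require 'java'  # JRuby only

# Guard a worker's body: rescue Object => e matches Ruby exceptions AND
# wrapped Java Error objects, which a plain rescue (StandardError only) misses.
def run_guarded(name)
  yield
rescue Object => e
  warn "#{name} died with #{e.class}: #{e}"
end

# Demo: raise a wrapped Java Error inside the guarded block.
run_guarded("filterworker") do
  raise Java::JavaLang::OutOfMemoryError.new("simulated")
end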

@guyboertje (Contributor)

@jstangroome - correction to my comment above: you correctly point out here that LS 1.5.4 has JRuby 1.7.20.

@suyograo (Contributor)

@jstangroome we will be shipping 1.5.5 with JRuby 1.7.22 and other fixes soon.

@guyboertje (Contributor)

I think we have this covered now.
