StackTraceSimplifier.simplify causes the JVM to hang on "Connection reset by peer" #2735

Closed
zaykl opened this issue Aug 5, 2014 · 1 comment

zaykl commented Aug 5, 2014

I found that our JVM process hung for a few seconds when we hit the "Connection reset by peer" error.

I checked the jstack log and found 28 threads blocked in e.getStackTrace().
See below:

"New I/O worker #263" prio=10 tid=0x00002aaab07aa000 nid=0x6f8c runnable [0x0000000054550000]
  java.lang.Thread.State: RUNNABLE 
    at java.lang.Throwable.getStackTraceElement(Native Method)
    at java.lang.Throwable.getOurStackTrace(Throwable.java:591)
    - locked <0x000000078982b018> (a java.io.IOException)
    at java.lang.Throwable.getStackTrace(Throwable.java:582)
    at org.jboss.netty.util.internal.StackTraceSimplifier.simplify(StackTraceSimplifier.java:56)
    at org.jboss.netty.channel.DefaultExceptionEvent.<init>(DefaultExceptionEvent.java:42)
    at org.jboss.netty.channel.Channels.fireExceptionCaught(Channels.java:525)
    at org.jboss.netty.channel.socket.nio.NioWorker.read(NioWorker.java:74)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.process(AbstractNioWorker.java:108)
    at org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:318)
    at org.jboss.netty.channel.socket.nio.AbstractNioWorker.run(AbstractNioWorker.java:89)
    at org.jboss.netty.channel.socket.nio.NioWorker.run(NioWorker.java:178)
    at org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108)
    at org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:662)

Even though the state is RUNNABLE, the thread is actually blocked when I check it with gdb.

public final class StackTraceSimplifier {
    public static void simplify(Throwable e) {
        ...
        StackTraceElement[] trace = e.getStackTrace();
        ...
    }
}
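
For context, the kind of work a trace filter has to do can be sketched as follows. This is a minimal sketch, not Netty's actual StackTraceSimplifier source: the class name TraceFilterSketch, the method dropFrames, and the prefix-based rule are assumptions made for illustration. The point is that any filter of this shape has to call getStackTrace() first, which makes the JVM build a StackTraceElement for every frame (the getStackTraceElement native call visible in the jstack output above).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration; NOT Netty's StackTraceSimplifier.
final class TraceFilterSketch {
    private TraceFilterSketch() {}

    // Removes every frame whose class name starts with classPrefix.
    static void dropFrames(Throwable e, String classPrefix) {
        // getStackTrace() materializes one StackTraceElement per frame
        // and hands back a copy of the resulting array.
        StackTraceElement[] trace = e.getStackTrace();

        List<StackTraceElement> kept = new ArrayList<StackTraceElement>(trace.length);
        for (StackTraceElement element : trace) {
            if (!element.getClassName().startsWith(classPrefix)) {
                kept.add(element);
            }
        }

        // Write the shortened trace back onto the same throwable.
        e.setStackTrace(kept.toArray(new StackTraceElement[0]));
    }
}
```

Note that the cost is paid per exception, whether or not anyone ever prints the trace.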

I don't think filtering the stack trace is a good idea. When the network is unstable, many threads fire a DefaultExceptionEvent at the same time, and each of them calls getStackTrace(). Because getStackTrace() is expensive, this can make the JVM process hang.

public DefaultExceptionEvent(Channel channel, Throwable cause) {
    if (channel == null) {
        throw new NullPointerException("channel");
    }
    if (cause == null) {
        throw new NullPointerException("cause");
    }
    this.channel = channel;
    this.cause = cause;
    StackTraceSimplifier.simplify(cause);
}

I suggest removing the StackTraceSimplifier.simplify(cause) call; it is not useful.
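
One possible shape for the constructor without that call is sketched below. The class name PlainExceptionEvent is hypothetical, and Object stands in for Netty's Channel type so that the snippet compiles on its own; it is not the actual Netty change.

```java
// Hypothetical stand-in for illustration; Object replaces Netty's Channel.
final class PlainExceptionEvent {
    private final Object channel;
    private final Throwable cause;

    PlainExceptionEvent(Object channel, Throwable cause) {
        if (channel == null) {
            throw new NullPointerException("channel");
        }
        if (cause == null) {
            throw new NullPointerException("cause");
        }
        this.channel = channel;
        this.cause = cause;
        // The stack trace of 'cause' is deliberately not touched here.
    }

    Object getChannel() {
        return channel;
    }

    Throwable getCause() {
        return cause;
    }
}
```

With this shape, constructing the event never calls getStackTrace(); the trace is only requested if a handler later prints or inspects the cause.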

See the gdb stack:

Thread 337 (Thread 28540):
#0  0x0000003f25a0aee9 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
No symbol table info available.
#1  0x00002b1dccf9c1fe in os::PlatformEvent::park() () from /home/jws/local/jdk1.6.0_31/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#2  0x00002b1dccf8c0a2 in ObjectMonitor::EnterI(Thread*) () from /home/jws/local/jdk1.6.0_31/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#3  0x00002b1dccf8bb32 in ObjectMonitor::enter(Thread*) () from /home/jws/local/jdk1.6.0_31/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#4  0x00002b1dcd071f4b in ObjectSynchronizer::slow_enter(Handle, BasicLock*, Thread*) () from /home/jws/local/jdk1.6.0_31/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#5  0x00002b1dcd071e44 in ObjectSynchronizer::fast_enter(Handle, BasicLock*, bool, Thread*) () from /home/jws/local/jdk1.6.0_31/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#6  0x00002b1dccd54d00 in instanceRefKlass::acquire_pending_list_lock(BasicLock*) () from /home/jws/local/jdk1.6.0_31/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#7  0x00002b1dcd0eb953 in VM_GC_Operation::doit_prologue() () from /home/jws/local/jdk1.6.0_31/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#8  0x00002b1dcd0f950b in VMThread::execute(VM_Operation*) () from /home/jws/local/jdk1.6.0_31/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#9  0x00002b1dccfafcd5 in ParallelScavengeHeap::permanent_mem_allocate(unsigned long) () from /home/jws/local/jdk1.6.0_31/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#10 0x00002b1dccc381f7 in CollectedHeap::common_permanent_mem_allocate_noinit(unsigned long, Thread*) () from /home/jws/local/jdk1.6.0_31/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#11 0x00002b1dcd0bf072 in typeArrayKlass::allocate_permanent(int, Thread*) () from /home/jws/local/jdk1.6.0_31/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#12 0x00002b1dccf900b0 in oopFactory::new_permanent_charArray(int, Thread*) () from /home/jws/local/jdk1.6.0_31/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#13 0x00002b1dccd6bf1b in java_lang_String::basic_create(int, bool, bool, Thread*) () from /home/jws/local/jdk1.6.0_31/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#14 0x00002b1dccd6bff1 in java_lang_String::basic_create_from_unicode(unsigned short*, int, bool, Thread*) () from /home/jws/local/jdk1.6.0_31/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#15 0x00002b1dccd6c131 in java_lang_String::create_tenured_from_unicode(unsigned short*, int, Thread*) () from /home/jws/local/jdk1.6.0_31/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#16 0x00002b1dcd0716e2 in StringTable::basic_add(int, Handle, unsigned short*, int, unsigned int, Thread*) () from /home/jws/local/jdk1.6.0_31/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#17 0x00002b1dcd071911 in StringTable::intern(Handle, unsigned short*, int, Thread*) () from /home/jws/local/jdk1.6.0_31/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#18 0x00002b1dcd071976 in StringTable::intern(symbolOopDesc*, Thread*) () from /home/jws/local/jdk1.6.0_31/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#19 0x00002b1dccd72993 in java_lang_StackTraceElement::create(methodHandle, int, Thread*) () from /home/jws/local/jdk1.6.0_31/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#20 0x00002b1dccd726cf in java_lang_Throwable::get_stack_trace_element(oopDesc*, int, Thread*) () from /home/jws/local/jdk1.6.0_31/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#21 0x00002b1dccdd7da4 in JVM_GetStackTraceElement () from /home/jws/local/jdk1.6.0_31/jre/lib/amd64/server/libjvm.so
No symbol table info available.
#22 0x00002aaaaabe24f9 in Java_java_lang_Throwable_getStackTraceElement () from /home/jws/local/jdk1.6.
trustin added this to the 3.9.3.Final milestone Aug 6, 2014
trustin self-assigned this Aug 6, 2014

trustin commented Aug 6, 2014

I agree it can increase the load when many failures occur at once. Also, I sometimes find that people are confused by the simplified stack trace.

trustin added a commit that referenced this issue Aug 6, 2014
Related issue: #2735

Motivation:

When an application is under load and experiencing a lot of failures, the
instantiation of DefaultExceptionEvent takes a noticeable amount of time
because of StackTraceSimplifier.

Also, StackTraceSimplifier sometimes confuses people because it
partially hides the execution path.

Modifications:

Remove the use of StackTraceSimplifier

Result:

JVM spends much less time at Throwable.getStackTrace()
trustin closed this as completed Aug 6, 2014