Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Virtual thread pinning using Conscrypt with Socket APIs #1170

Open
carterkozak opened this issue Sep 29, 2023 · 5 comments
Open

Virtual thread pinning using Conscrypt with Socket APIs #1170

carterkozak opened this issue Sep 29, 2023 · 5 comments

Comments

@carterkozak
Copy link
Contributor

carterkozak commented Sep 29, 2023

Background

Conscrypt offers excellent performance, which is helpful now more than ever given the TLS performance regressions on x86_64 in JDK-18+. Unfortunately the Conscrypt Socket stream implementations synchronize over Socket I/O pinning virtual threads to carriers as described in jep-444. This prevents Conscrypt users from taking advantage of virtual threads (in fact introduces the deadlocks as all carriers become pinned).

Snippet from a unit test where enabling virtual threads resulted in a deadlock:

    java.base/sun.nio.ch.NioSocketImpl.read(NioSocketImpl.java:346)
    java.base/sun.nio.ch.NioSocketImpl$1.read(NioSocketImpl.java:796)
    java.base/java.net.Socket$SocketInputStream.read(Socket.java:1099)
    org.conscrypt.ConscryptEngineSocket$SSLInputStream.readFromSocket(ConscryptEngineSocket.java:920)
    org.conscrypt.ConscryptEngineSocket$SSLInputStream.processDataFromSocket(ConscryptEngineSocket.java:884)
    org.conscrypt.ConscryptEngineSocket$SSLInputStream.readUntilDataAvailable(ConscryptEngineSocket.java:799)
    org.conscrypt.ConscryptEngineSocket$SSLInputStream.read(ConscryptEngineSocket.java:772) <== monitors:1
    org.apache.hc.core5.http.impl.io.SessionInputBufferImpl.fillBuffer(SessionInputBufferImpl.java:149)
    org.apache.hc.core5.http.impl.io.SessionInputBufferImpl.readLine(SessionInputBufferImpl.java:280)
    org.apache.hc.core5.http.impl.io.ChunkedInputStream.getChunkSize(ChunkedInputStream.java:261)

Proposed Solution

We should be able to retain existing behavior by replacing Object monitor locks with ReentrantLocks, which do not pin virtual threads. Such a change would need to be performance tested, and may impact observability (e.g. deadlock detection, thread dump metadata).

Does this seem reasonable? Would you accept such a contribution if I were to implement it?

Thanks!

@prbprbprb
Copy link
Collaborator

Certainly sounds reasonable and we certainly use ReadWriteLocks for synchronisation elsewhere.

My minor concern is that I don't have a feel for the performance of a ReentrantLock (or ReadWriteLock) plus try/finally versus synchronized. But (a) seems like it's a blocker for virtual threads if deadlocks are possible and (b) a lot of the code paths are either rare (in which case we don't care) or guarding expensive method calls (in which case we won't notice).

That plus I don't think we have good benchmarks for this, and our existing benchmarks are rather bit-rotted. :/

@prbprbprb
Copy link
Collaborator

I'm still trying to understand why the stack trace in the first comment is part of a deadlock, @carterkozak

The lock in that case is the readLock field in the input stream, and one thread is holding it but is blocked in a read() system call so any other readers will be blocked, but this thread can make progress either if the read() returns (either with data or due to a timeout), or if another thread closes the underlying SSLSocket, e.g. in the case where you have a socket pool with a watchdog thread. Closing the socket shouldn't need to acquire readLock.

So I'm wondering if what you're seeing is actually a concurrency bug which gets exposed by thread pinning rather than a deadlock situation? Any chance you can get a full thread dump if it's reproducible for you?

None of which invalidates the conversation about Reentrant or ReadWrite locks over synchronized...

Also, as I'm sure you're aware, there are no guarantees about read boundaries over SSLSocket. The way it's currently implemented there is a very high probability of a 1:1 mapping between writes and subsequent reads, but even with fixed sized records there's a non-zero chance that a read might return a partial record, and if you have multiple reader threads you're going to have to deal with two threads each getting part of a record.

@carterkozak
Copy link
Contributor Author

Thanks for the replies, @prbprbprb. Using a ReentrantReadWriteLock sounds reasonable to me, I should have time to implement that change on Monday.

I'm still trying to understand why the stack trace in the first comment is part of a deadlock

It's not a classic deadlock, but rather a loom carrier thread exhaustion deadlock:
My client (using conscrypt) is sending requests to a webserver in the same JVM. Both the threads used by the client, and the server are loom virtual threads.
To simplify things, let's assume I have a single carrier thread (the platform thread which mounts and executes virtual threads):

Without Conscrypt:
1: Client (on a virtual thread) begins to send a request.
2: Client reaches a blocking I/O operation which cannot complete immediately because bytes haven't been consumed by the server, so the client virtual thread is unmounted.
3: Server (in the same jvm) virtual thread is mounted, begins to consume data from the socket until available data has been consumed. Now the server virtual thread can be unmounted, and the carrier thread can execute the client virtual thread to send more data (or signal the end of the request).

With Conscrypt:
1: Client (on a virtual thread) begins to send a request.
1a: org.conscrypt.ConscryptEngineSocket$SSLInputStream.read(ConscryptEngineSocket.java:772) <== monitors:1 pins the virtual thread to the current carrier. Within this block, the virtual thread cannot be unmounted
2: Client reaches a blocking I/O operation which cannot complete immediately because bytes haven't been consumed by the server. The thread blocks until the server has consumed data.
3: The server (in the same jvm) attempts to start virtual threads to read the request, however the single carrier thread is currently pinned to the client request, so no virtual threads can currently make progress.

This deadlock case is certainly a bit contrived (in practice it occurred within a JMH benchmark using a client and server in the same jvm). However, in cases that don't result in a deadlock, the inability to unmount and switch virtual threads on carriers has a substantial impact on application performance (I don't have the benchmark results available right now, but the difference was in the 10-20% range in synthetic testing between pinned virtual threads with an increased carrier pool size to avoid deadlocks, and virtual threads with this patch -- I haven't benchmarked to compare regular platform threads before/after this patch yet).

@carterkozak
Copy link
Contributor Author

I can put together a minimal reproducer of this case on Monday if it’s helpful

@prbprbprb
Copy link
Collaborator

I can put together a minimal reproducer of this case on Monday if it’s helpful

Nah, your explanation makes total sense, thanks! As you say, it's somewhat contrived but Conscrypt should behave better under those circumstances.

My main concern was that pinning was exposing some latent concurrency bug in Conscrypt, but it looks like it's "just" thread starvation due to suboptimal locking.

carterkozak added a commit to carterkozak/conscrypt that referenced this issue Oct 23, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants