New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Received a Maximum active streams violated for this endpoint.
from client side
#3858
Comments
Reported by @techno 🙇♂️ |
Leaving notes on recent findings (I haven't forgotten about this issue 😄 )
Unfortunately, I'm still trying to find a way to consistently reproduce this behavior in unit/integration tests at the moment. Will post more updates if I find anything |
It seems like this was fixed with netty/netty#11644 and included in |
Additional note: Rather than trying to fix the issue for users, maybe we can start with providing metrics s.t. users can understand why the exception is occurring. Possibly we may be able to record metrics on Another alternative may be to add a callback s.t. users can be notified whenever HTTP/2 settings are changed. (i.e. something like |
(Update) We received a reproducible code via a private message. I'm currently guessing that the timing for decrementing current streams isn't synchronized between e.g.
Will post more details after analyzing |
Demonstration..: It turns out that I was wrong about the cause in the above comment. It's not related to decorator or 2 different clients with the same host. The exception occurs when requests are made in a callback of another request like...: // exception
client.get("/resource").aggregate().thenRun(() -> concurrentReqToSameHost());
// but exception is not thrown if `concurrentReqToSameHost()` is called in async callback;
client.get("/resource").aggregate().thenRunAsync(() -> concurrentReqToSameHost(), ctx.eventLoop()); More on this in the demo code. |
Thanks for the detailed analysis! I think I understand why the exception occurs for the first case consistently. The callback in So I believe the flow is as follows:
I've also verified that the test code passes for the branch #3908 |
Good to hear that, thanks for the nice work!! Just 1 more question..:
|
If this bug is blocking development, then I think the above workaround is fine (assuming that the additional cost for rescheduling the task via I would also like to note though that the I'm unsure how |
Thanks again. I am thinking about giving a different |
**Background** Currently, armeria maintains a state `HttpResponseDecoder#unfinishedResponses` to check how many in-flight requests are being processed for a connection. Armeria uses this value to check if all connections occupy too many concurrent streams, and creates a new connection if necessary. On the other hand, netty maintains it's own state to check how many in-flight requests are being processed for a connection. (`DefaultHttp2Connection.DefaultEndpoint#numActiveStreams`) Netty checks this value before creating a stream, and throws a `Http2Exception$StreamException` if `MAX_CONCURRENT_STREAMS` is unavailable. **Problem Statement** Currently, when a `WriteTimeoutException` is triggered, armeria decrements `unfinishedResponses` and removes the response. (A `WriteTimeoutException` is thrown when a request header isn't written within a predefined `writeTimeoutMillis`) However, netty may not be aware that armeria has failed the response. Consequently, netty's `numActiveStreams` is greater than armeria's `unfinishedResponses`. This may cause a violation of `MAX_CONCURRENT_STREAMS` for additional requests on the connection. **Motivation** Netty always calls `Http2ResponseDecoder.onStreamClosed` before decrementing `numActiveStreams`. If we want `numActiveStreams` to be in sync with `unfinishedResponses`, I propose that we modify the timing of decrementing `unfinishedResponses` to `Http2ResponseDecoder.onStreamClosed`. In detail, when a `WriteTimeoutException` is scheduled https://github.com/line/armeria/blob/117a21e17ec9e30b0c3c2d74d16fdde3cab62434/core/src/main/java/com/linecorp/armeria/client/HttpRequestSubscriber.java#L171-L173 the response is closed. https://github.com/line/armeria/blob/117a21e17ec9e30b0c3c2d74d16fdde3cab62434/core/src/main/java/com/linecorp/armeria/client/HttpRequestSubscriber.java#L318 Consequently, after the stream processes the `close` event, `whenComplete` is triggered. https://github.com/line/armeria/blob/117a21e17ec9e30b0c3c2d74d16fdde3cab62434/core/src/main/java/com/linecorp/armeria/client/Http2ResponseDecoder.java#L83-L90 And the response is removed (and `unfinishedResponses` is decremented) https://github.com/line/armeria/blob/117a21e17ec9e30b0c3c2d74d16fdde3cab62434/core/src/main/java/com/linecorp/armeria/client/Http2ResponseDecoder.java#L101 However, as far as netty is concerned, the request may have been written and may still be processing. **Misc** Reproduced `maxConcurrentStreams` when `WriteTimeoutException` occurs at 225a684 **Modifications** - Remove the `removeResponse` call from `Http2ResponseDecoder. onWrapperCompleted`, and rely on `onStreamClosed` to remove the response/decrement `unfinishedResponses` - When receiving callbacks for `onHeadersRead`, `onDataRead`, `onRstStreamRead`, also check if `resWrapper` had been closed. This preserves behavior since `res` was previously removed on `WriteTimeoutException`, resulting in `res == null`. *Update* I realized that if we simply don't process values when headers/data/rst are received, then we might not send a `GoAway` and close the connection when `disconnectWhenFinished = true` due to df43379. I've verified this behavior from test cases added in 8018da1 I've modified further such that: - Only remove responses when `onStreamClosed` is called. - Remove calls to `channel().close();` if `shouldSendGoAway()` is true for `onDataRead`, `onHeadersRead` since `onStreamClosed` will handle this instead. - Remove `onStreamClosed` to try to close the `ResponseWrapper` only if the underlying `delegate` is open. d1183d8 There is a slight change of behavior, where a `GoAway` may be triggered from `onRstStream` as well. Let me know if this change shouldn't be made 🙏 Result: - Closes #3858
armeria version: 1.8.0
verified client-side isn't changing
MAX_CONCURRENT_STREAMS
The text was updated successfully, but these errors were encountered: