Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade http4s, http4s-servlet #118

Merged
merged 9 commits into from Jul 3, 2023

Conversation

sam-tombury
Copy link

Similar to #111, adding the lifecycle listener earlier to avoid hangs in the lifecycle.isStopping() branch

Sam Bryan added 2 commits June 20, 2023 12:33
Fixes cancellation to match CE3.5 semantics
@sam-tombury
Copy link
Author

@rossabaker - hopefully this is helpful, it just tweaks #111 to register the listener earlier (otherwise the isStopping() branch never fired the callback, which might have been causing the flakiness in CI). Happy to add in the listener cleanup (in case the lifecycle is re-used) as well if you'd like

@sam-tombury
Copy link
Author

Sorry - looks like there were some overly aggressive import changes which should be fixed now

@rossabaker
Copy link
Member

Thank you!!! I'll try to run this a few times tonight, since it was an intermittent failure, and I could only manifest it on CI.

I think it would be ideal to clean up the listener, but we didn't before and never had a reported leak, and at this point, I just want to get a cats-effect-3.5-compilant release out ASAP.

@rossabaker
Copy link
Member

Nuts. Still got failures in the 2.13 branches. The Scala 3.2 failure is because we need a Scala 3.3 upgrade. I'll merge that in here.

@sam-tombury
Copy link
Author

Looking at this again, I can't see how the test failures observed here would be caused here (rather than in http4s-servlet). The changes here only impact the cancellation of the server resource acquisition/finalisation, which shouldn't be impacting the async timeouts.

There are a couple of places where I think this might be creeping in, but I don't understand why this test is passing whilst this is consistently failing. I'll see if I can open a PR there to try and solve those

@sam-tombury
Copy link
Author

Did another dig through - the test failure observed here is consistent with what would happen if the service handler wasn't being effectively canceled (and thus backpressuring) in the race here.
Clutching at straws, but 8432dad tries explicitly switching to the new (cancelable) version of IO.never in case there's any weird caching causing problems in CI

@rossabaker
Copy link
Member

Yeah, we started pulling in a copy of AsyncHttpServlet on the debug branch. It's been frustrating, because it doesn't fail locally, doesn't fail anything on http4s-servlet, but fails intermittently here. Pulling in whatever servlet code here to shadow the jar could help find it assuming the problem is in the other module. Once identified, we can go back and fix it there.

@sam-tombury
Copy link
Author

How important is it to set the timeout on the async context here? Trying a bunch of things, it appears replacing that with standard CE timeoutTo (runs here, commit here if you can see them) solves the flakiness.

I suspect it might have something to do with the Jetty version (since everything seems stable on Jetty 10) - my best guess is it might be to do with this (possibly with this follow-up which appears to only be in Jetty 10)

@rossabaker
Copy link
Member

Nice sleuthing. In theory, it's better to call the servlet's async handler, because other listeners could be added, perhaps in a filter or an extension to the servlet. In practice, I bet nobody is or ever has. I'll stare at those Jetty diffs a bit, but I think your alternate solution may end up being the winner here.

@rossabaker
Copy link
Member

On the debug2 branch, based on this code, I do see a println in the AsyncTimeoutHandler, but the failure correlates consistently with the presence of this exception:

java.lang.IllegalStateException: s=HANDLING rs=EXPIRING os=COMPLETED is=IDLE awp=false se=false i=false al=2
	at org.eclipse.jetty.server.HttpChannelState.complete(HttpChannelState.java:711)
	at org.eclipse.jetty.server.AsyncContextState.complete(AsyncContextState.java:72)
	at org.http4s.servlet.AsyncHttp4sServlet.$anonfun$service$4(servlet.scala:73)
	at delay @ org.http4s.servlet.AsyncHttp4sServlet.$anonfun$service$3(servlet.scala:73)

@rossabaker
Copy link
Member

If we don't do anything with the async context ourselves, we get a default timeout of 30 seconds rather than reading the timeout as we always have. Just because we don't install a listener doesn't mean the clock isn't ticking. To go with the approach in @sgjbryan's commit, which I like, we'd have to set the timeout to 0.

f06f455 (when it publishes) attempts to stick to the servlet mechanism for a timeout, but renders the timeout response within the listener. It's not as simple, but might be better integrated with weird servlet shit people might do.

@rossabaker
Copy link
Member

Builds have passed twice. Restarting now. If they're still green after this comment, I'm ready to declare victory. We'll get one more chance when we un-RC the servlet version.

@sam-tombury
Copy link
Author

Grand, thanks so much for looking into this! http4s/http4s-servlet#221 looks good to me, and running the request completion in the timeout listener seems to be the correct thing to do.
I tried getting the test flake to reproduce locally with evalOn in various places, but (rather frustratingly) I didn't have any luck.

@rossabaker
Copy link
Member

This is working on the released versions. I'm going to release it. Thanks for your help @sgjbryan!

@rossabaker rossabaker merged commit 1e0092f into http4s:series/0.23 Jul 3, 2023
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants