Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Potential incompatibility with google/cronet-transport-for-okhttp when GQL batching is enabled #5847

Open
duzinkie opened this issue Apr 23, 2024 · 3 comments

Comments

@duzinkie
Copy link

duzinkie commented Apr 23, 2024

Version

3.7.3

Summary

I'm researching moving our GQL clients to use cronet HTTP engine (to leverage http3/quic) and since we rely heavily on OkHttp, decided to use https://github.com/google/cronet-transport-for-okhttp and it's interceptor to minimize the changes to the application I'm working on.

(apologies if the description below sounds too vague/simplistic, I do not know the domain very well and might not have the right vocabulary)

While working on this transition, I have noticed, that, when using GQL batching, along with the interceptor from the package I just mentioned the app would very often (though not always) "hang" when sending requests, which I was able to trace up to some point, but am stuck. I've managed to see that:

I am able to circumvent the issue by providing an okhttp interceptor that reads the entire request and sets the content length to the length of what it just read before handing over to cronet (it's implementation is simple so I'll skip it)

This behavior occurs far less frequent when the app is being debugged and breakpoints are present at various stages, suggesting it might be some sort of deadlock/race condition, but I'm unable to pinpoint that exactly. CronetInterceptor does call .allowDirectExecutor(); which is documented with this warning

Warning: This option makes it easy to accidentally block the network thread. It should not be used if your callbacks perform disk I/O, acquire locks, or call into other code you don't carefully control and audit.

but as mentioned before, I don't know the domain well enough to say if this is significant.

Lastly, I couldn't find any similar issues, or anything regarding usage of apollo and cronet (although I'm aware of the option of implementing HttpEngine directly), but still figured you might be interested to know that there's a potential incompatibility here. I've reported a twin issue in the https://github.com/google/cronet-transport-for-okhttp project (google/cronet-transport-for-okhttp#36).

Of course I understand it can also be the case that it's neither apollo nor cronet-transport but our own integration with those that causes the problem, or that this doesn't align with your own roadmap, feel free to resolve as you see fit.

Let me know if you'd like me to share any more information, and thank you for any assistance you can provide.

Steps to reproduce the behavior

unfortunately my project is not open source. I could attempt reproducing in a smaller project if anyone investigating this issue cannot find anything obvious.

Logs

I don't have any logs and/or stacktraces atm.

@BoD
Copy link
Contributor

BoD commented Apr 24, 2024

Thanks for reporting this!

That's interesting. If there is an attempt somewhere down the line to read the body multiple times, having a quick look back at BatchingHttpInterceptor, I don't immediately see why this would cause the hang, but not 100% sure. I am not familiar with Cronet unfortunately so I couldn't tell why this happens either. The issue being not systematic doesn't help of course :)

Maybe it would make more sense to implement HttpEngine directly without going through the OkHttp adapter? That would at least eliminate one possible source of the issue - however, I don't know how easy that would be and don't want to send you to a wild goose chase.

I was also wondering if Batching is really useful in your case? The main reason to using it is to avoid the overhead of making an HTTP call per query, but maybe when using HTTP2 or HTTP3/Quic the advantage of doing so is slim? And I'm also curious if other than the issue with batching, using Cronet works well otherwise?

In any case, if you want to pursue this, indeed a small reproducer project would definitely help.

@duzinkie
Copy link
Author

Maybe it would make more sense to implement HttpEngine directly without going through the OkHttp adapter?
I was also wondering if Batching is really useful in your case?

Indeed, we're considering both. Using okhttp for now appears to be the "path of least resistance" in our use case and I'd imagine other apps could also the same choice and potentially end up in the same spot so at least this issue can act as "heads"-up for them.

In any case, if you want to pursue this, indeed a small reproducer project would definitely help.

I'll give that a try, but it might take a while as I'm traveling soon.

@duzinkie
Copy link
Author

I'll give that a try, but it might take a while as I'm traveling soon.

I've not been able to reproduce this particular issue in an example project (suggesting maybe it's cause by something else in our networking stack). However, I was able to observe and reproduce some issue with cronet request lifecycles when using the same libraries - I've described those in #5885

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants