Crash : SIGSEGV /SEGV_MAPERR #1131

Open
ArchanaPrabhu opened this issue Apr 26, 2023 · 22 comments

Comments

@ArchanaPrabhu

App crashes with the below exception.
Using okhttp version : 3.12.13
Crash happening only on Android 13 devices while doing network call.

Any pointers on why this is happening only on specific devices?
Please let me know if any additional details are required to debug this further.


pid: 0, tid: 3268 >>> com.example.app <<<

backtrace:
#00 pc 0x0000000000038600 /apex/com.android.conscrypt/lib64/libssl.so (bssl::ssl_cert_dup(bssl::CERT*)+68)
#1 pc 0x000000000003f984 /apex/com.android.conscrypt/lib64/libssl.so (SSL_new+484)
#2 pc 0x000000000002212c /apex/com.android.conscrypt/lib64/libjavacrypto.so (NativeCrypto_SSL_new(_JNIEnv*, _jclass*, long, _jobject*)+24)
#3 pc 0x0000000000461554 /apex/com.android.art/lib64/libart.so (art_quick_generic_jni_trampoline+148)
#4 pc 0x0000000000209a9c /apex/com.android.art/lib64/libart.so (nterp_helper+1948)
#5 pc 0x0000000000024644 /apex/com.android.conscrypt/javalib/conscrypt.jar (com.android.org.conscrypt.NativeSsl.newInstance+12)
#6 pc 0x0000000000209334 /apex/com.android.art/lib64/libart.so (nterp_helper+52)
#7 pc 0x000000000001983c /apex/com.android.conscrypt/javalib/conscrypt.jar (com.android.org.conscrypt.ConscryptEngine.newSsl)
#8 pc 0x0000000000209334 /apex/com.android.art/lib64/libart.so (nterp_helper+52)
#9 pc 0x000000000001b0e6 /apex/com.android.conscrypt/javalib/conscrypt.jar (com.android.org.conscrypt.ConscryptEngine.<init>+94)
#10 pc 0x000000000020a254 /apex/com.android.art/lib64/libart.so (nterp_helper+3924)
#11 pc 0x0000000000018822 /apex/com.android.conscrypt/javalib/conscrypt.jar (com.android.org.conscrypt.ConscryptEngineSocket.newEngine+54)
#12 pc 0x0000000000209334 /apex/com.android.art/lib64/libart.so (nterp_helper+52)
#13 pc 0x0000000000018d68 /apex/com.android.conscrypt/javalib/conscrypt.jar (com.android.org.conscrypt.ConscryptEngineSocket.<init>+52)
#14 pc 0x000000000020a958 /apex/com.android.art/lib64/libart.so (nterp_helper+5720)
#15 pc 0x0000000000021814 /apex/com.android.conscrypt/javalib/conscrypt.jar (com.android.org.conscrypt.Java8EngineSocket.<init>)
#16 pc 0x000000000020a958 /apex/com.android.art/lib64/libart.so (nterp_helper+5720)
#17 pc 0x00000000000360ec /apex/com.android.conscrypt/javalib/conscrypt.jar (com.android.org.conscrypt.Platform.createEngineSocket+16)
#18 pc 0x0000000000209334 /apex/com.android.art/lib64/libart.so (nterp_helper+52)
#19 pc 0x0000000000031c8c /apex/com.android.conscrypt/javalib/conscrypt.jar (com.android.org.conscrypt.OpenSSLSocketFactoryImpl.createSocket+84)
#20 pc 0x00000000026cc1c4 /data/app/~~x5omOPFUGuwkf_7eG32MIw==/com.example.app-_KOMTFChZ1oamat1_QhM_g==/oat/arm64/base.odex (okhttp3.internal.connection.RealConnection.connectTls+164)
#21 pc 0x00000000026cd3f8 /data/app/~~x5omOPFUGuwkf_7eG32MIw==/com.example.app-_KOMTFChZ1oamat1_QhM_g==/oat/arm64/base.odex (okhttp3.internal.connection.RealConnection.establishProtocol+440)
#22 pc 0x00000000026cdedc /data/app/~~x5omOPFUGuwkf_7eG32MIw==/com.example.app-_KOMTFChZ1oamat1_QhM_g==/oat/arm64/base.odex (okhttp3.internal.connection.RealConnection.connect+1884)
#23 pc 0x00000000025bae44 /data/app/~~x5omOPFUGuwkf_7eG32MIw==/com.example.app-_KOMTFChZ1oamat1_QhM_g==/oat/arm64/base.odex (okhttp3.internal.connection.StreamAllocation.findConnection+1812)
#24 pc 0x00000000025bb44c /data/app/~~x5omOPFUGuwkf_7eG32MIw==/com.example.app-_KOMTFChZ1oamat1_QhM_g==/oat/arm64/base.odex (okhttp3.internal.connection.StreamAllocation.findHealthyConnection+92)
#25 pc 0x00000000025bbbf8 /data/app/~~x5omOPFUGuwkf_7eG32MIw==/com.example.app-_KOMTFChZ1oamat1_QhM_g==/oat/arm64/base.odex (okhttp3.internal.connection.StreamAllocation.newStream+280)
#26 pc 0x00000000026cb940 /data/app/~~x5omOPFUGuwkf_7eG32MIw==/com.example.app-_KOMTFChZ1oamat1_QhM_g==/oat/arm64/base.odex (okhttp3.internal.connection.ConnectInterceptor.intercept+224)
#27 pc 0x00000000026d2c48 /data/app/~~x5omOPFUGuwkf_7eG32MIw==/com.example.app-_KOMTFChZ1oamat1_QhM_g==/oat/arm64/base.odex (okhttp3.internal.http.RealInterceptorChain.proceed+1544)
#28 pc 0x00000000026d2618 /data/app/~~x5omOPFUGuwkf_7eG32MIw==/com.example.app-_KOMTFChZ1oamat1_QhM_g==/oat/arm64/base.odex (okhttp3.internal.http.RealInterceptorChain.proceed+104)
#29 pc 0x00000000026cb03c /data/app/~~x5omOPFUGuwkf_7eG32MIw==/com.example.app-_KOMTFChZ1oamat1_QhM_g==/oat/arm64/base.odex (okhttp3.internal.cache.CacheInterceptor.intercept+1468)
#30 pc 0x00000000026d2c48 /data/app/~~x5omOPFUGuwkf_7eG32MIw==/com.example.app-_KOMTFChZ1oamat1_QhM_g==/oat/arm64/base.odex (okhttp3.internal.http.RealInterceptorChain.proceed+1544)
#31 pc 0x00000000026d2618 /data/app/~~x5omOPFUGuwkf_7eG32MIw==/com.example.app-_KOMTFChZ1oamat1_QhM_g==/oat/arm64/base.odex (okhttp3.internal.http.RealInterceptorChain.proceed+104)
#32 pc 0x00000000026d0820 /data/app/~~x5omOPFUGuwkf_7eG32MIw==/com.example.app-_KOMTFChZ1oamat1_QhM_g==/oat/arm64/base.odex (okhttp3.internal.http.BridgeInterceptor.intercept+4288)
#33 pc 0x00000000026d2c48 /data/app/~~x5omOPFUGuwkf_7eG32MIw==/com.example.app-_KOMTFChZ1oamat1_QhM_g==/oat/arm64/base.odex (okhttp3.internal.http.RealInterceptorChain.proceed+1544)
#34 pc 0x00000000026d4e5c /data/app/~~x5omOPFUGuwkf_7eG32MIw==/com.example.app-_KOMTFChZ1oamat1_QhM_g==/oat/arm64/base.odex (okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept+700)
#35 pc 0x00000000026d2c48 /data/app/~~x5omOPFUGuwkf_7eG32MIw==/com.example.app-_KOMTFChZ1oamat1_QhM_g==/oat/arm64/base.odex (okhttp3.internal.http.RealInterceptorChain.proceed+1544)
#36 pc 0x00000000026c97b8 /data/app/~~x5omOPFUGuwkf_7eG32MIw==/com.example.app-_KOMTFChZ1oamat1_QhM_g==/oat/arm64/base.odex (okhttp3.RealCall.getResponseWithInterceptorChain+3528)
#37 pc 0x00000000026c6ea0 /data/app/~~x5omOPFUGuwkf_7eG32MIw==/com.example.app-_KOMTFChZ1oamat1_QhM_g==/oat/arm64/base.odex (okhttp3.RealCall$AsyncCall.execute+128)
#38 pc 0x00000000025ad87c /data/app/~~x5omOPFUGuwkf_7eG32MIw==/com.example.app-_KOMTFChZ1oamat1_QhM_g==/oat/arm64/base.odex (okhttp3.internal.NamedRunnable.run+124)
#39 pc 0x0000000000588960 /data/misc/apexdata/com.android.art/dalvik-cache/arm64/boot.oat (java.util.concurrent.ThreadPoolExecutor.runWorker+976)
#40 pc 0x0000000000585b48 /data/misc/apexdata/com.android.art/dalvik-cache/arm64/boot.oat (java.util.concurrent.ThreadPoolExecutor$Worker.run+72)
#41 pc 0x00000000003fe840 /data/misc/apexdata/com.android.art/dalvik-cache/arm64/boot.oat (java.lang.Thread.run+80)
#42 pc 0x0000000000457b6c /apex/com.android.art/lib64/libart.so (art_quick_invoke_stub+556)
#43 pc 0x0000000000484e54 /apex/com.android.art/lib64/libart.so (art::ArtMethod::Invoke(art::Thread*, unsigned int*, unsigned int, art::JValue*, char const*)+156)
#44 pc 0x0000000000484b20 /apex/com.android.art/lib64/libart.so (art::JValue art::InvokeVirtualOrInterfaceWithJValues<art::ArtMethod*>(art::ScopedObjectAccessAlreadyRunnable const&, _jobject*, art::ArtMethod*, jvalue const*)+400)
#45 pc 0x00000000005ce334 /apex/com.android.art/lib64/libart.so (art::Thread::CreateCallback(void*)+1684)
#46 pc 0x00000000000b6668 /apex/com.android.runtime/lib64/bionic/libc.so (__pthread_start(void*)+208)
#47 pc 0x00000000000532cc /apex/com.android.runtime/lib64/bionic/libc.so (__start_thread+64)

@prbprbprb
Collaborator

The most likely cause, based on where and how it's crashing, is some kind of native heap corruption, either in the app, okhttp, or the version of Conscrypt shipping as a Mainline module.

That module is identical across Android 11 through 14, so if you're only seeing crashes on 13 then again it points to heap corruption as the native allocator changed between 12 and 13.

If you can consistently reproduce this, and are willing to share the code, then the best way forward is to open an Android bug at https://issuetracker.google.com/issues/new?component=190923&template=841312 and then we can try and debug it.

@sourav234698

is this repo accepting contributions or not??

@prbprbprb
Collaborator

> is this repo accepting contributions or not??

Sure :)

@prbprbprb
Collaborator

Closing this issue here though as it is now on the Android issue tracker.

@ArchanaPrabhu
Author

This issue is happening on both Android 12 and 13 devices, so it does not seem to be related to changes in the native heap allocator.
Since this is a segmentation fault, there are three possibilities. The crash could happen while:

  1. accessing an out-of-bound memory location
  2. accessing invalid memory
  3. writing to a read-only memory.

Do you have any idea if the library is getting into some bad state during a network call?

Where can we see the conscrypt releases? This issue started happening from March 2023. Can we debug this better?

I am not able to get a repro locally though.

@prbprbprb

@prbprbprb
Collaborator

> Since this is a segmentation fault, there can be one of the 3 possibilities : The crash could happen while
> accessing an out-of-bound memory location
> accessing invalid memory

Right, but the root cause of that could be any kind of heap corruption. bssl::ssl_cert_dup() is following the native pointers in various objects, so if they get corrupted then it can easily be trying to access invalid memory locations.

Typically that happens if there is a concurrency bug between threads, or a native pointer gets re-used after its memory has been freed.

The same platform version of Conscrypt runs on Android 11 through 14, so the fact you're only seeing crashes on 12 and 13 is unexpected.

The fact that the issue started in March makes me think some other component is corrupting the heap, as we didn't ship any Conscrypt changes in February or March.

However, the fact it's consistently crashing in SSL_new() makes me think it may be a Conscrypt bug after all.

> Can we debug this better?

Without any kind of repro steps it's going to be very difficult.

prbprbprb reopened this Jul 19, 2023
@ArchanaPrabhu
Author

Thanks for responding.
Is there any way to change the conscrypt version or upgrade to the latest and ship it?

Is there any other way we could debug this? Any thoughts?

There is one change that is specific to Android 13. There was a change in the garbage collection algorithm. Could this be causing it?

But we also observed huge volumes of this crash in Android 12 as well.

@ArchanaPrabhu
Author

@prbprbprb
One new observation is that the total duration of garbage collection (art.gc.gc-time, which we fetch via Debug.getRuntimeStat()) is very high just before the crash happens.

Could this crash be a manifestation of OOMs in native heap?
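
For anyone wanting to collect the same signal, here is a minimal sketch of sampling the GC-time delta around a network call. GcTimeProbe and its statReader parameter are hypothetical names, not part of any library. On Android the reader would presumably be Debug::getRuntimeStat from android.os.Debug; it is injected here as a function so the sketch also runs on a plain JVM.

```java
import java.util.function.Function;

// Hypothetical helper: tracks how much GC time accumulated between samples.
final class GcTimeProbe {
    // Stat name mentioned in the comment above; the value is reported in milliseconds.
    static final String GC_TIME_STAT = "art.gc.gc-time";

    private final Function<String, String> statReader;
    private long lastGcTimeMs;

    // On Android, pass Debug::getRuntimeStat (assumption: android.os.Debug is available).
    GcTimeProbe(Function<String, String> statReader) {
        this.statReader = statReader;
        this.lastGcTimeMs = read();
    }

    // Returns the GC time accumulated since the previous sample, in milliseconds.
    long sampleDeltaMs() {
        long now = read();
        long delta = now - lastGcTimeMs;
        lastGcTimeMs = now;
        return delta;
    }

    // Treats a missing or unparsable stat as 0 rather than failing the caller.
    private long read() {
        String raw = statReader.apply(GC_TIME_STAT);
        if (raw == null) {
            return 0L;
        }
        try {
            return Long.parseLong(raw.trim());
        } catch (NumberFormatException e) {
            return 0L;
        }
    }
}
```

Calling sampleDeltaMs() just before each request and attaching the value to crash reports would show whether heavy GC activity really precedes the failures.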

@prbprbprb
Collaborator

Thanks for the update! I believe the userfaultfd GC is active on (at least some) Android 12 devices now, so what you're seeing does suggest it's related to low memory conditions and aggressive GCing.

What's interesting is that ssl_cert_dup() is consistently failing when copying a certificate from an SSLSessionContext object which might be shared across many TLS connections. And what I missed until just now is that AbstractSessionContext manages its own native pointer. The pattern we use everywhere else in Conscrypt is to wrap the pointer in a NativeRef subclass, and then when calling JNI code we pass in the NativeRef object as an Object (as well as the pointer) in order to prevent premature finalization while running in native code. In this case, it's actually the AbstractSessionContext that gets passed in, which ought to be sufficient to keep the context object alive and prevent its finalizer from running. Maybe I'm missing something there though... I shall ask some ART folk.
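
The wrapper pattern described above can be sketched roughly as follows. This is an illustration only, not Conscrypt's actual code: FakeNative, NativeRefSketch, and CtxRef are hypothetical names, and the "native heap" is simulated with a set of live addresses so the example runs on a plain JVM.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for native code: tracks which "addresses" are still allocated.
final class FakeNative {
    private static final AtomicLong NEXT = new AtomicLong(1);
    private static final Set<Long> LIVE = ConcurrentHashMap.newKeySet();

    static long alloc() {
        long address = NEXT.getAndIncrement();
        LIVE.add(address);
        return address;
    }

    static void free(long address) {
        LIVE.remove(address);
    }

    // Mirrors the shape of a JNI entry point that takes the raw pointer plus its holder.
    // In real JNI the holder argument is a local reference that keeps the holder alive
    // during the call; in this pure-Java simulation it only illustrates the signature.
    static long sslNew(long ctxAddress, Object holder) {
        if (!LIVE.contains(ctxAddress)) {
            throw new IllegalStateException("use after free: " + ctxAddress);
        }
        return alloc();
    }
}

// Owns exactly one native address and frees it when the wrapper is finalized.
abstract class NativeRefSketch {
    final long address;

    NativeRefSketch(long address) {
        if (address == 0) {
            throw new NullPointerException("address == 0");
        }
        this.address = address;
    }

    abstract void doFree(long address);

    @Override
    @SuppressWarnings({"deprecation", "removal"})
    protected void finalize() throws Throwable {
        try {
            doFree(address);
        } finally {
            super.finalize();
        }
    }
}

// Wrapper for a simulated SSL_CTX allocation.
final class CtxRef extends NativeRefSketch {
    CtxRef() {
        super(FakeNative.alloc());
    }

    @Override
    void doFree(long address) {
        FakeNative.free(address);
    }
}
```

A caller would write FakeNative.sslNew(ref.address, ref), passing both the pointer and the wrapper. If the pointer were freed first, the simulated call fails loudly, whereas real native code would read freed memory.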

Another possibility is a good old-fashioned concurrency bug exacerbated by the device being slowed down by GCs. When a TLS connection is established the certificate is set up in a convoluted way... The native TLS code calls back into Java to select a certificate, and that callback calls further JNI code to set the certificate on the native SSL object with not much (if any) locking. However as far as I can see this code path never modifies the certificate data on the SSL_CTX object, so I can't really see a scenario where one thread is updating SSL_CTX->cert while another is copying it. @davidben ?

@prbprbprb
Collaborator

Chatted to ART and it doesn't seem to be related to userfaultfd GC, or the stack trace would be different. But the finalizer for AbstractSessionContext is definitely suspicious.

prbprbprb added a commit to prbprbprb/conscrypt that referenced this issue Aug 4, 2023
We don't have a definitive root cause for google#1131 but it seems like either use-after-free (e.g. finalizer ordering) or concurrency issue, so:
1. Make the native pointer private and move all accesses into AbstractSessionContext
2. Zero it out on finalisation
3. Add locking. Note we only need a read lock for the sslNew() path as this is thread safe and doesn't modify the native SSL_CTX.
prbprbprb added a commit to prbprbprb/conscrypt that referenced this issue Aug 4, 2023
We don't have a definitive root cause for google#1131 but it seems like either use-after-free (e.g. finalizer ordering) or concurrency issue, so:
1. Make the native pointer private and move all accesses into AbstractSessionContext
2. Zero it out on finalisation
3. Add locking. Note we only need a read lock for the sslNew() path as this is thread safe and doesn't modify the native SSL_CTX aside from atomic refcounts.

The above change is broadly equivalent to turning the native pointer into a NativeRef, which would mean its finalizer shouldn't run until after the AbstractSessionContext object is unreachable, but (currently) NativeRefs don't zero out the native address on finalization.
prbprbprb added a commit that referenced this issue Aug 6, 2023
We don't have a definitive root cause for #1131 but it seems like either use-after-free (e.g. finalizer ordering) or concurrency issue, so:
1. Make the native pointer private and move all accesses into AbstractSessionContext
2. Zero it out on finalisation
3. Add locking. Note we only need a read lock for the sslNew() path as this is thread safe and doesn't modify the native SSL_CTX aside from atomic refcounts.

The above change is broadly equivalent to turning the native pointer into a NativeRef, which would mean its finalizer shouldn't run until after the AbstractSessionContext object is unreachable, but (currently) NativeRefs don't zero out the native address on finalization.
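
As a rough illustration of the three steps in the commit message, here is a minimal sketch. It is not the merged code: SessionContextSketch, fakeSslNew, and fakeSslCtxFree are hypothetical, and the native calls are simulated. The exception message is the one seen later in this thread.

```java
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import javax.net.ssl.SSLException;

// Hypothetical sketch of the three steps; native calls are simulated.
class SessionContextSketch {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    // Step 1: the pointer is private and never handed out to other classes.
    private long sslCtxNativePointer;

    SessionContextSketch(long nativePointer) {
        this.sslCtxNativePointer = nativePointer;
    }

    // Step 3 (read side): a read lock is enough because creating an SSL object
    // does not modify the context, so many threads may do this concurrently.
    long newSsl() throws SSLException {
        lock.readLock().lock();
        try {
            if (sslCtxNativePointer == 0) {
                throw new SSLException("Invalid session context");
            }
            return fakeSslNew(sslCtxNativePointer);
        } finally {
            lock.readLock().unlock();
        }
    }

    // Steps 2 and 3 (write side): free under the write lock, then zero the pointer
    // so a later caller gets an exception instead of touching freed memory.
    void freeNative() {
        lock.writeLock().lock();
        try {
            if (sslCtxNativePointer != 0) {
                fakeSslCtxFree(sslCtxNativePointer);
                sslCtxNativePointer = 0;
            }
        } finally {
            lock.writeLock().unlock();
        }
    }

    @Override
    @SuppressWarnings({"deprecation", "removal"})
    protected void finalize() throws Throwable {
        try {
            freeNative();
        } finally {
            super.finalize();
        }
    }

    // Simulated native calls (assumptions, not real JNI bindings).
    private static long fakeSslNew(long ctx) {
        return ctx + 1;
    }

    private static void fakeSslCtxFree(long ctx) {
        // Nothing to release in the simulation.
    }
}
```

The design trade-off is that a use-after-free turns into a catchable Java exception rather than a SIGSEGV, at the cost of taking a lock on every new connection.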
@ArchanaPrabhu
Author

ArchanaPrabhu commented Aug 7, 2023

Thanks for merging a possible fix for the above issue. @prbprbprb
When can we expect this to reflect in the OTA updates so that I can track if this change has fixed the crash?

Assuming the crash is caused by a concurrency issue in AbstractSessionContext, I am curious why it would be specific to Android 12 and 13 devices. Could it be related to the userfaultfd GC algorithm in some way, or to high memory usage?

@ArchanaPrabhu
Author

@prbprbprb Gentle ping on the above query

@prbprbprb
Collaborator

Oh, sorry, I missed the previous comment!

#1154 (and also #1157 and #1164) are planned to go out in the November Mainline build; that is, they'll start reaching devices at the start of November and should be fully rolled out by the end of that month. Non-Mainline devices (e.g. Android Go) won't get the fix then, but will get it the next time their vendor sends an OTA. The fixes apart from #1164 are already in AOSP for them, and #1164 should land in AOSP today.

For the second part of your question (root cause), I'm frankly not sure because we haven't managed to reproduce the issues. However the finalizers in question all had latent bugs and so I'm moderately confident that the fixes will help. At the very least they should prevent native crashes, although it's possible that if there are other concurrency issues we missed then you may still see NullPointerExceptions.

I suspect these bugs have been causing crashes forever, just at a frequency low enough that nobody noticed and then recent ART changes (e.g. GC patterns) meant we started seeing them more often.

The long term fix here is to use Cleaners, which are less error-prone, rather than finalizers to free up native resources, but that isn't simple so long as we still support OpenJDK 8 and Android API levels < 33.
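
For readers unfamiliar with the Cleaner approach, here is a minimal sketch using java.lang.ref.Cleaner (Java 9+). CleanedContext and its FREED queue are hypothetical, and the native free is simulated by recording the address; this is not how Conscrypt is implemented today.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical owner of a native resource that is released through a Cleaner.
final class CleanedContext implements AutoCloseable {
    private static final Cleaner CLEANER = Cleaner.create();

    // Records simulated frees so the behaviour can be observed.
    static final ConcurrentLinkedQueue<Long> FREED = new ConcurrentLinkedQueue<>();

    // The cleanup state must not reference the CleanedContext itself,
    // otherwise the owner could never become unreachable.
    private static final class State implements Runnable {
        private final long address;

        State(long address) {
            this.address = address;
        }

        @Override
        public void run() {
            // Stand-in for a native free call on this address.
            FREED.add(address);
        }
    }

    private final Cleaner.Cleanable cleanable;

    CleanedContext(long address) {
        this.cleanable = CLEANER.register(this, new State(address));
    }

    // Explicit release; Cleanable.clean() runs the action at most once,
    // so a later automatic cleanup cannot free the address a second time.
    @Override
    public void close() {
        cleanable.clean();
    }
}
```

Unlike a finalizer, the cleanup action cannot see or resurrect the owning object, which rules out a class of ordering bugs.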

@ArchanaPrabhu
Author

Since we are closely tracking this fix, could you please help me in monitoring the rollout of the November mainline build? How could we find the devices / the dates on which the mainline build is released? @prbprbprb

@prbprbprb
Collaborator

I'm not sure there's a public source of that information but I'll try and find out. Very approximately though, the first few weeks are taken up with "canary" rollouts to detect issues, then there's a progressive rollout to 50%, 99% and the last percent only get updated towards the very end of the month.

@ArchanaPrabhu
Author

ArchanaPrabhu commented Nov 27, 2023

Hi @prbprbprb,

Can we consider that the mainline build with the fix was merged and is available on at least the Android 13 phones? The crash does not seem to be showing a downward trend in Play console at least.

The article below mentions updates to only two Mainline components:

https://source.android.com/docs/security/bulletin/2023-11-01

Does that mean conscrypt lib update was not included? Would be great if you could share any resource around mainline build release timeline/ notes.

Thanks

@prbprbprb
Collaborator

Ah, that note is a bit confusing. You're linking the release notes for the November Security Bulletin, which goes out as an OTA update because it needs to be able to update components anywhere in the Android platform. But what the release notes are saying is that the fixes for those two CVEs are going out as part of a Mainline update, rather than with the security bulletin OTA[1]. There are no security fixes for Conscrypt in the November bulletin, so it isn't mentioned.

Meanwhile, it appears that the November Mainline train is still in its canary phase due to the US Thanksgiving holidays, which means it is on less than 2% of Mainline devices (maybe even less than that), which I wasn't expecting... It looks to me like it's supposed to ramp up to 99% by the end of this week, so if you don't hear any more from me by Friday then please ping the issue again.

[1] It's not really feasible for OTAs to update Mainline modules, or for Mainline updates to update non-Mainline components.

@ArchanaPrabhu
Author

Hi @prbprbprb,

Thanks a lot for the detailed explanation to my queries.

Could you please confirm whether the mainline build rollout is at 100% now? Is there any way we could check whether devices have received this update? (Any document?)

One update: The native crash is translating to a Java crash now (which we could catch with a try-catch). Please let me know if there is a fix for this that you are aware of or any possible cause.

Exception java.lang.RuntimeException: javax.net.ssl.SSLException: Invalid session context
  at com.android.org.conscrypt.ConscryptEngine.newSsl (ConscryptEngine.java:208)
  at com.android.org.conscrypt.ConscryptEngine.<init> (ConscryptEngine.java:199)
  at com.android.org.conscrypt.ConscryptEngineSocket.newEngine (ConscryptEngineSocket.java:117)
  at com.android.org.conscrypt.ConscryptEngineSocket.<init> (ConscryptEngineSocket.java:104)
  at com.android.org.conscrypt.Java8EngineSocket.<init> (Java8EngineSocket.java:62)
  at com.android.org.conscrypt.Platform.createEngineSocket (Platform.java:334)
  at com.android.org.conscrypt.OpenSSLSocketFactoryImpl.createSocket (OpenSSLSocketFactoryImpl.java:163)
  at okhttp3.internal.connection.RealConnection.connectTls (RealConnection.kt)
  at okhttp3.internal.connection.RealConnection.establishProtocol (RealConnection.kt)
  at okhttp3.internal.connection.RealConnection.connect (RealConnection.kt)
  at okhttp3.internal.connection.ExchangeFinder.findConnection (ExchangeFinder.kt)
  at okhttp3.internal.connection.ExchangeFinder.findHealthyConnection (ExchangeFinder.kt)
  at okhttp3.internal.connection.ExchangeFinder.find (ExchangeFinder.kt)
  at okhttp3.internal.connection.RealCall.initExchange$okhttp (RealCall.kt)
  at okhttp3.internal.connection.ConnectInterceptor.intercept (ConnectInterceptor.kt)
  at okhttp3.internal.http.RealInterceptorChain.proceed (RealInterceptorChain.kt)
  at okhttp3.internal.cache.CacheInterceptor.intercept (CacheInterceptor.kt)
  at okhttp3.internal.http.RealInterceptorChain.proceed (RealInterceptorChain.kt)
  at okhttp3.internal.http.BridgeInterceptor.intercept (BridgeInterceptor.kt)
  at okhttp3.internal.http.RealInterceptorChain.proceed (RealInterceptorChain.kt)
  at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept (RetryAndFollowUpInterceptor.kt)
  at okhttp3.internal.http.RealInterceptorChain.proceed (RealInterceptorChain.kt)
  at okhttp3.internal.connection.RealCall.getResponseWithInterceptorChain$okhttp (RealCall.kt)
  at okhttp3.internal.connection.RealCall$AsyncCall.run (RealCall.kt)
  at java.util.concurrent.ThreadPoolExecutor.runWorker (ThreadPoolExecutor.java:1145)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run (ThreadPoolExecutor.java:644)
  at java.lang.Thread.run (Thread.java:1012)
Caused by javax.net.ssl.SSLException: Invalid session context
  at com.android.org.conscrypt.AbstractSessionContext.newSsl (AbstractSessionContext.java:216)
  at com.android.org.conscrypt.NativeSsl.newInstance (NativeSsl.java:80)
  at com.android.org.conscrypt.ConscryptEngine.newSsl (ConscryptEngine.java:206)

Thanks
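
For reference, one possible shape for the try-catch mentioned above is sketched below. It is a mitigation, not a fix for the root cause. SslFailureGuard is a hypothetical helper; with OkHttp, an application interceptor wrapping chain.proceed(...) could presumably apply the same logic. It rethrows the underlying SSLException (an IOException subclass) so the failure takes the normal I/O error path instead of crashing the thread.

```java
import java.io.IOException;
import java.util.concurrent.Callable;
import javax.net.ssl.SSLException;

// Hypothetical helper that converts the wrapped SSLException into a checked failure.
final class SslFailureGuard {
    private SslFailureGuard() {}

    // Runs the call; a RuntimeException caused by an SSLException is unwrapped
    // and rethrown as that SSLException. Other failures pass through unchanged,
    // except non-IO checked exceptions, which are wrapped in an IOException.
    static <T> T call(Callable<T> networkCall) throws IOException {
        try {
            return networkCall.call();
        } catch (RuntimeException e) {
            if (e.getCause() instanceof SSLException) {
                throw (SSLException) e.getCause();
            }
            throw e;
        } catch (IOException e) {
            throw e;
        } catch (Exception e) {
            throw new IOException(e);
        }
    }
}
```

Unrelated RuntimeExceptions are deliberately rethrown untouched so that genuine programming errors still surface.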

@ArchanaPrabhu
Author

@prbprbprb Gentle reminder on the above query.

@ArchanaPrabhu
Author

Hi @prbprbprb Can you please help with this?

@ArchanaPrabhu
Author

@prbprbprb Gentle reminder on the above query.

@prbprbprb
Collaborator

> @prbprbprb Gentle reminder on the above query.

Sorry, got lost in the Christmas backlog!

On the plus side, we did indeed fix the code path causing the native crashes, and now we have a Java stack trace to work with.

On the minus side, this situation shouldn't be possible... The root cause exception occurs because a socket factory is trying to create a new SSL session for a new socket but the native pointer to its SSL session context is 0.

Every SSLContext has a reference to a ClientSessionContext object, which has a pointer to a native SSL_CTX struct. The ClientSessionContext is created from the SSLContext constructor and the native struct is created from the ClientSessionContext constructor, and there is no code path which allows the pointer to be zero without throwing an exception.

=> There is no way to create an SSLContext with a native pointer of 0

(if the code creating the SSL_CTX throws then you can have a ClientSessionContext with a 0 pointer, which will eventually get finalised but this will just be a no-op)

Since #1154 the native pointer is never shared outside the class and all accesses are synchronized.

=> There is no way for it to become 0 due to concurrency bugs

=> The only way for the native pointer to become 0 is through finalisation.

The ClientSessionContext is widely shared. It is created by the SSLContext which passes it to every SSLSocketFactory it creates and thence it gets passed to every SSLSocket inside the socket's SSLParameters.

=> As the crash happens during socket creation there should be no way the ClientSessionContext can have been finalised because the SSLContext and SSLSocketFactory still exist and have references to it

There's probably a flaw in my reasoning but I'm failing to see it. :/

The crash is too consistent for an ART bug, and so far this is the only report of it that I'm aware of... Is it possible your app is doing anything unusual with reflection around SSLContext or SSLParameters? Or catching and ignoring OOM exceptions?
