NPE during StackTraceRecovery exception creation #3031

Closed
gildor opened this issue Nov 18, 2021 · 4 comments

Comments

@gildor
Contributor

gildor commented Nov 18, 2021

We are seeing crashes in production caused by StackTraceRecoveryKt. They are not very common (a few cases per day).
We don't have any reproduction steps.

Fatal Exception: kotlinx.coroutines.CoroutinesInternalError: Fatal exception in coroutines machinery for CancellableContinuation(DispatchedContinuation[Dispatchers.IO, Continuation at kotlinx.coroutines.flow.SharedFlowImpl.collect(SharedFlow.kt:348)@37836a93]){Cancelled}@2e1f5ad0. Please read KDoc to 'handleFatalException' method and report this incident to maintainers
       at kotlinx.coroutines.DispatchedTask.handleFatalException(DispatchedTask.kt:144)
       at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:115)
       at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:571)
       at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:750)
       at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:678)
       at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:665)
Caused by java.lang.NullPointerException: Attempt to invoke virtual method 'java.lang.Object java.lang.StackTraceElement[].clone()' on a null object reference
       at kotlinx.coroutines.internal.StackTraceRecoveryKt.createFinalException(StackTraceRecovery.kt:107)
       at kotlinx.coroutines.internal.StackTraceRecoveryKt.recoverFromStackFrame(StackTraceRecovery.kt:78)
       at kotlinx.coroutines.internal.StackTraceRecoveryKt.access$recoverFromStackFrame(StackTraceRecovery.kt:1)
       at kotlinx.coroutines.CancellableContinuationImpl.getExceptionalResult$kotlinx_coroutines_core(CancellableContinuationImpl.kt:636)
       at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:91)
       at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:571)
       at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:750)
       at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:678)
       at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:665)
@qwwdfsad
Member

Hi, could you please elaborate on which devices this crash reproduces on?

From what I can see from the stacktrace, it looks like an Android issue: there are no nulls on the code path of createFinalException and NPE is triggered by the internals of Throwable.getStackTrace.

A similar issue was faced in akaita/RxJava2Debug#2, where with the same symptoms it was concluded that it's an Android problem.
We've applied (#1866) the known workaround to all our exceptions that do not fill the stacktrace, but we cannot do so for the exceptions out of our direct control.

I suspect you have an exception class in your app (or in one of its transitive dependencies) that overrides fillInStackTrace to return this without calling super, and such an exception crashes the app on Android < 6 when it's passed through the stacktrace recovery machinery.
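
For illustration, the suspected pattern can be sketched in Java as follows. The class names `StacklessException` and `PatchedStacklessException` are hypothetical and are not taken from RxJava or kotlinx.coroutines. The first class skips `super.fillInStackTrace()` entirely; the second also assigns an empty stack trace explicitly, which is an assumption about what this kind of workaround looks like, since the thread does not spell it out. On a desktop JVM both variants simply report an empty stack trace; the NPE described here is reported only on old Android versions.

```java
public class StacklessDemo {

    /** Hypothetical exception with the suspected pattern: no call to super. */
    static class StacklessException extends RuntimeException {
        StacklessException(String message) {
            super(message);
        }

        @Override
        public synchronized Throwable fillInStackTrace() {
            // Skips stack capture. Per the thread, getStackTrace() on such an
            // exception can then throw an NPE on old Android versions.
            return this;
        }
    }

    /** Hypothetical variant with an assumed workaround applied. */
    static class PatchedStacklessException extends RuntimeException {
        PatchedStacklessException(String message) {
            super(message);
        }

        @Override
        public synchronized Throwable fillInStackTrace() {
            // Assumption: explicitly assigning an empty trace gives
            // getStackTrace() a non-null array to clone.
            setStackTrace(new StackTraceElement[0]);
            return this;
        }
    }

    public static void main(String[] args) {
        System.out.println(new StacklessException("a").getStackTrace().length);
        System.out.println(new PatchedStacklessException("b").getStackTrace().length);
    }
}
```

Both variants avoid the cost of capturing a stack trace; the difference is only in whether the internal trace is left untouched or set to an empty array.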

The very best we can do is to catch the NPE and bail out here, but I'm not sure whether it's a workaround worth applying to the library, as it can hide other problems.

@gildor
Contributor Author

gildor commented Nov 22, 2021

@qwwdfsad It is reproducible on Android 11. I will check our logs to give more details about the devices.

@gildor
Contributor Author

gildor commented Dec 3, 2021

Sorry, you're right: it crashes only on Android 5 or 6. I confused it with another issue.

> I suspect you have an exception class in your app (or in any of the transitive dependencies) that has fillInStackTrace overridden to return this without a call to super, and such exception crashes the app on Android < 6 when it's passed through stacktrace recovery machinery.

It looks like we don't have such code, but it does seem to exist in RxJava. I'm not sure that it causes the issue, though; I just found it.

> The very best we can do is to catch NPE and bail-out here, but I'm not sure whether it's a workaround worth applying to the library, as it can hide other problems

I think it's a reasonable solution, at least for a feature like StackTraceRecovery: a failed stack trace recovery is IMO better than a crash.

@qwwdfsad
Member

qwwdfsad commented Dec 6, 2021

I would suggest contributing the fix to RxJava itself.

I've investigated whether we can easily work around this problem, and apparently we cannot: we would either have to pay for an additional getStackTrace call on the exception-copying path (which implies allocating and copying an array with an average size of a hundred frames), or wrap each call to stackTrace in try {} catch {}, expect it to throw an NPE, and have a separate, semantically different code path for that case. I would like to avoid both of these solutions as long as a much easier fix exists.
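
As a rough illustration of the second option (catching the NPE and taking a separate code path), here is a minimal Java sketch. The helper `stackTraceOrEmpty` and the class `SafeStackTrace` are hypothetical names invented for this example. This is not code from kotlinx.coroutines, and the thread indicates that the maintainers preferred not to adopt this approach in the library.

```java
public class SafeStackTrace {

    // Shared empty array used when no stack trace can be obtained.
    private static final StackTraceElement[] EMPTY = new StackTraceElement[0];

    /**
     * Hypothetical helper: returns the throwable's stack trace, or an empty
     * array if getStackTrace() itself throws a NullPointerException.
     */
    static StackTraceElement[] stackTraceOrEmpty(Throwable t) {
        try {
            return t.getStackTrace();
        } catch (NullPointerException e) {
            // Separate code path: treat the exception as having no frames
            // instead of letting the NPE propagate.
            return EMPTY;
        }
    }

    public static void main(String[] args) {
        StackTraceElement[] frames =
                stackTraceOrEmpty(new IllegalStateException("boom"));
        System.out.println("frames captured: " + frames.length);
    }
}
```

The trade-off mentioned above is visible even in this small sketch: every caller now has to handle the "no frames" case, and a genuine bug that happens to surface as an NPE in the same place would be silently swallowed.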
