Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash with kotlinx.coroutines.JobCancellationException #2170

Closed
frist008 opened this issue Jul 28, 2020 · 11 comments
Closed

Crash with kotlinx.coroutines.JobCancellationException #2170

frist008 opened this issue Jul 28, 2020 · 11 comments

Comments

@frist008
Copy link

Hello.
Migrated my project from RxJava to Kotlin coroutines.
My project has up to 2000-3000 online per day and 20_000 per month.

I started getting all sorts of errors on Crashlitics with JobCancellationException.

Full log:
Fatal Exception: kotlinx.coroutines.JobCancellationException: Job was cancelled.

However, there are 30-50 other streams. I'm worried that I don't have a Stacktrace where the error occurred? and I cannot fix the problem.

Furthermore. I cannot reproduce it myself. This error occurs with real users, but not me. For real users, the error does not repeat itself for a month. The sample for all users is very small - 0.23%. And the chance of repetition is even less.

The error header can be quite different on Crashlitics:

AndroidLeakFixes.java line 400
leakcanary.AndroidLeakFixes$Companion$backgroundExecutor$1$thread$1.run

bn.java line 1
bn.a

bn.java line 1
bn.b

bf.java line 1
bf.b

rw.java line 2
rw.run

zj.java line 2
zj.a

bt.java line 1
bt.a

DpmTcmClient.java line 149
com.qti.tcmclient.DpmTcmClient$TcmReceiver.run

GlideExecutor.java line 393
com.bumptech.glide.load.engine.executor.GlideExecutor$DefaultThreadFactory$1.run

You may also have noticed that I have listed 2 sides of the library that are thrown with JobCancellationException.

It's funny, but I don't use coroutines explicitly in them. It's just magic.

implementation "org.jetbrains.kotlin:kotlin-stdlib-jdk8:1.3.72"
implementation "org.jetbrains.kotlinx:kotlinx-coroutines-core:1.3.8"
implementation "org.jetbrains.kotlinx:kotlinx-coroutines-android:1.3.8"
implementation "org.jetbrains.kotlinx:kotlinx-coroutines-play-services:1.3.8"

I am using a wrapper to run coroutines.

fun CoroutineScope.launchSafe(
    context: CoroutineContext = EmptyCoroutineContext,
    start: CoroutineStart = CoroutineStart.DEFAULT,
    launchSafe: LaunchSafe = object : LaunchSafe {
        override fun onCancellation(e: CancellationException) {
            Timber.v(e)
        }

        override fun onFailure(t: Throwable) {
            Timber.e(t)
        }
    },
    onStart: (suspend CoroutineScope.() -> Unit)? = null,
    block: suspend CoroutineScope.() -> Unit
) = launch(context, start) {
    try {
        onStart?.invoke(this)
        block()
    } catch (e: CancellationException) {
        try {
            launchSafe.onCancellation(e)
        } catch (t: Throwable) {
            Timber.e(t)
        }
    } catch (t: Throwable) {
        try {
            launchSafe.onFailure(t)
        } catch (t: Throwable) {
            Timber.e(t)
        }
    }
}

interface LaunchSafe {

    @WorkerThread
    fun onCancellation(e: CancellationException)

    @WorkerThread
    fun onFailure(t: Throwable)
}
abstract class BaseViewModel : ViewModel(), CoroutineScope, LaunchSafe {

    override var coroutineContext =
        Dispatchers.Default + SupervisorJob() + CoroutineName("ViewModel") + SafeCoroutineExceptionHandler()

    val errorLiveEvent = LiveEvent<Throwable>()

    override fun onFailure(t: Throwable) {
        errorLiveEvent.postValue(t)
    }

    override fun onCancellation(e: CancellationException) {
        // none
    }

    override fun onCleared() {
        coroutineContext.cancel()
        super.onCleared()
    }
}

@Suppress("FunctionName", "NOTHING_TO_INLINE")
inline fun SafeCoroutineExceptionHandler() = CoroutineExceptionHandler { _, throwable ->
    if (BuildConfig.DEBUG) {
        Thread.currentThread()
            .uncaughtExceptionHandler
            ?.uncaughtException(Thread.currentThread(), throwable)
    } else {
        Timber.e(throwable, "Error ExceptionHandler")
    }
}
@frist008
Copy link
Author

All logs usually contain Unsafe.park.

glide-source-thread-3
       at java.lang.Object.wait(Object.java)
       at java.lang.Thread.parkFor$(Thread.java:2161)
       at sun.misc.Unsafe.park(Unsafe.java:358)
       at java.util.concurrent.locks.LockSupport.park(LockSupport.java:190)
       at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2059)
       at java.util.concurrent.PriorityBlockingQueue.take(PriorityBlockingQueue.java:548)
       at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1087)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1147)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:636)
       at java.lang.Thread.run(Thread.java:784)
       at com.bumptech.glide.load.engine.executor.GlideExecutor$DefaultThreadFactory$1.run(GlideExecutor.java:393)
sun.misc.Unsafe.park (Unsafe.java)
java.util.concurrent.locks.LockSupport.park (LockSupport.java:190)
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await (AbstractQueuedSynchronizer.java:2067)
java.util.concurrent.PriorityBlockingQueue.take (PriorityBlockingQueue.java:548)
bn.a (bn.java:1)
bn.run (bn.java:2)

@qwwdfsad
Copy link
Member

qwwdfsad commented Aug 3, 2020

Hi,

I'm worried that I don't have a Stacktrace where the error occurred

Please try to enable debug mode: https://github.com/Kotlin/kotlinx.coroutines/blob/master/docs/debugging.md#debug-mode to have stacktraces in JobCancellationException

It's hard to tell what exactly is wrong without actual stacktraces, but as a wild guess, please ensure that you do not invoke channel.offer from non-suspending methods as it can unexpectedly throw CancellationException (#974).
Also, it is worth to add SafeCoroutineExceptionHandler to your launchSafe to handle exceptions from coroutine children as well

@torresmi
Copy link

torresmi commented Aug 3, 2020

My team encountered this as well and unfortunately, we haven't identified the root cause.

But we did make the following changes in a release and haven't seen it since. Maybe these will be helpful:

  1. Replaced our Observable.asFlow() usages to Observable.toFlowable(BackPressureStrategy.LATEST).asFlow(), like this comment from ChildCancelledException when using Rx Observable.asFlow() extension within combine() #2104.
  2. Replaced our Flow.asObservable() usages to Flow.asFlowable().toObservable()
  3. A lot of our exceptions involved OkHttp usage. So we updated from v3.14.9 to v4.7.2.
  4. Updated Coroutines from v1.3.7 to v1.3.8

Other things to note:

  • We weren't using channel.offer explicitly in our source code.
  • Crashlytics reports would often point to lines of library code that were existing issues for us before we introduced coroutines. Those issues did not have stack traces that lead back to our source code. After using coroutines more, we would get the same reports but started happening much more often and the stacktrace became Fatal Exception: kotlinx.coroutines.JobCancellationException: Job was cancelled.

@frist008
Copy link
Author

frist008 commented Aug 3, 2020

Hi,

I'm worried that I don't have a Stacktrace where the error occurred

Please try to enable debug mode: https://github.com/Kotlin/kotlinx.coroutines/blob/master/docs/debugging.md#debug-mode to have stacktraces in JobCancellationException

It's hard to tell what exactly is wrong without actual stacktraces, but as a wild guess, please ensure that you do not invoke channel.offer from non-suspending methods as it can unexpectedly throw CancellationException (#974).
Also, it is worth to add SafeCoroutineExceptionHandler to your launchSafe to handle exceptions from coroutine children as well

We weren't using channel.offer or RX explicitly in our source code.

I already add SafeCoroutineExceptionHandler in coroutineContext for BaseViewModel : ViewModel(), CoroutineScope.
If I add it again to launchSafe, won't it be a duplicate?

@frist008
Copy link
Author

frist008 commented Aug 3, 2020

I am not using experimental or deprecated functions in the project at all.

I have set in Application
System.setProperty (DEBUG_PROPERTY_NAME, DEBUG_PROPERTY_VALUE_ON)

I think it will be necessary to wait a few days. Before I get any results from users in beta testing.

@qwwdfsad
Copy link
Member

qwwdfsad commented Aug 4, 2020

Replaced our Observable.asFlow() usages to Observable.toFlowable(BackPressureStrategy.LATEST).asFlow()

Thanks, that's a great place to start looking!

I already add SafeCoroutineExceptionHandler in coroutineContext for BaseViewModel : ViewModel(), CoroutineScope.
If I add it again to launchSafe, won't it be a duplicate?

If the scope of BaseViewModel is used for launchSafe, then it's indeed not necessary

@frist008
Copy link
Author

frist008 commented Aug 14, 2020

I found a bug. It consisted in the fact that I used in one place in my code the suspend function in the catch (e: CancellationException) block.

Helped me:

class Application : Application {

    override fun onCreate() {
        super.onCreate()

        System.setProperty(DEBUG_PROPERTY_NAME, DEBUG_PROPERTY_VALUE_ON)
    }
}

I continue to observe. If no new issues are found by the end of the month, I will close the discussion.

@frist008
Copy link
Author

Perhaps this problem is somehow related to a Google bug.
https://issuetracker.google.com/issues/164317567

@frist008
Copy link
Author

After switching to Kotlin 1.4, no new error messages appeared. Thanks to all.

@mhernand40
Copy link

@frist008 have you by any chance switched back to Observable.asFlow() now that you have switched to Kotlin 1.4? How about Flow.asObservable()?

@frist008
Copy link
Author

frist008 commented Oct 9, 2020

@frist008 have you by any chance switched back to Observable.asFlow() now that you have switched to Kotlin 1.4? How about Flow.asObservable()?

We weren't using RX explicitly in our source code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants