Coroutine 1.6.0 ANR on Android 11 #3113

Closed
Tolriq opened this issue Dec 29, 2021 · 5 comments

@Tolriq

Tolriq commented Dec 29, 2021

It took me a while to figure out, but the 1.6.0 change that makes delay use the main dispatcher seems to have a side effect on Android 11 (and only Android 11).

I have not built a repro yet, but I am posting now in case there is something obvious.

In an activity's onCreate I have a simple runBlocking (so on the main UI thread) around a call that is nearly instant.

That call uses a worker pool on Dispatchers.IO and at some point runs a simple loop:

private suspend fun isLoaded(): Boolean {
    return loaded ?: run {
        do {
            delay(1)
        } while (loaded == null)
        loaded
    } ?: false
}

This works well on the first start, but if the app is killed by the OS, or you swipe the app away from the drawer (without using the kill option), subsequent starts ANR with the trace below.

ANR:

 at sun.misc.Unsafe.park (Native method)
  at java.util.concurrent.locks.LockSupport.parkNanos (LockSupport.java:230)
  at kotlinx.coroutines.BlockingCoroutine.joinBlocking (BlockingCoroutine.java:88)
  at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking (BuildersKt__Builders.kt:59)
  at kotlinx.coroutines.BuildersKt.runBlocking (Builders.kt:1)
  at kotlinx.coroutines.BuildersKt__BuildersKt.runBlocking$default (BuildersKt__Builders.kt:38)
  at kotlinx.coroutines.BuildersKt.runBlocking$default (Builders.kt:1)
  at xxxxx.onCreate (xxxx.kt:127)
  at android.app.Activity.performCreate (Activity.java:8215)
  at android.app.Activity.performCreate (Activity.java:8199)
  at android.app.Instrumentation.callActivityOnCreate (Instrumentation.java:1309)
  at android.app.ActivityThread.performLaunchActivity (ActivityThread.java:3824)
  at android.app.ActivityThread.handleLaunchActivity (ActivityThread.java:4027)
  at android.app.servertransaction.LaunchActivityItem.execute (LaunchActivityItem.java:85)
  at android.app.servertransaction.TransactionExecutor.executeCallbacks (TransactionExecutor.java:135)
  at android.app.servertransaction.TransactionExecutor.execute (TransactionExecutor.java:95)
  at android.app.ActivityThread$H.handleMessage (ActivityThread.java:2336)
  at android.os.Handler.dispatchMessage (Handler.java:106)
  at android.os.Looper.loop (Looper.java:247)
  at android.app.ActivityThread.main (ActivityThread.java:8676)
  at java.lang.reflect.Method.invoke (Native method)
  at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run (RuntimeInit.java:602)
  at com.android.internal.os.ZygoteInit.main (ZygoteInit.java:1130)

Since this is specific to one version of Android, I'm not sure anything can be done at the coroutines level. I'll try increasing the delay to see what happens, but I'm reporting this in case delay(1) inside runBlocking on the main thread can trigger issues in other, unseen cases.

@Tolriq
Author

Tolriq commented Dec 29, 2021

OK, so after some tests, I still don't know why my original code with the worker pool only ANRs on Android 11.

But the following code will cause issues on Android.

In an activity's onCreate(), do a simple:

runBlocking {
    withContext(Dispatchers.IO) {
        var i = 0
        do {
            delay(1)
        } while (i++ < 15)
    }
}

The app will ANR because the delay never completes. This is probably logical: the main dispatcher is used for the delay, but the main thread is blocked in runBlocking, waiting for that same delay.

This sounds like a major change that can affect many users, since the side effect is easy to miss when just reading the changelog.

I don't know if there is something that could be detected on the coroutines side to avoid this deadlock. Ping @qwwdfsad

Calling System.setProperty("kotlinx.coroutines.main.delay", "false") early in app init works around the issue.
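
For reference, a minimal sketch of that workaround as a standalone helper. The function name is hypothetical; only the property name and value come from the comment above. On Android it would presumably need to run before any coroutine code touches the main dispatcher, for example at the very start of Application.onCreate(), which is an assumption and not something verified here.

```kotlin
// Hypothetical helper: the name is illustrative, only the property key and
// value are taken from the report above.
fun disableMainDispatcherDelay() {
    // Assumption: must be set before kotlinx.coroutines reads the property,
    // so call this as early as possible during app start-up.
    System.setProperty("kotlinx.coroutines.main.delay", "false")
}

fun main() {
    disableMainDispatcherDelay()
    println(System.getProperty("kotlinx.coroutines.main.delay"))
}
```

Note that this turns the 1.6.0 behaviour off for the whole process, so anything that relies on delay running on the main dispatcher is affected as well.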

@JustinBis

Nice find. It makes sense that your code example would cause a deadlock if delay is waiting on the main thread:

runBlocking { // Blocks the main thread
  withContext(Dispatchers.IO) { // won't return until the inner work is done
    delay(1) // will deadlock if using the main thread, since the main thread is blocked by this work
  }
}

I'm not sure if this is a bug in the coroutines library as much as it is yet another reason to not use runBlocking in production code.
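
The mechanism can be reproduced without Android or kotlinx.coroutines. The sketch below is only an analogy under stated assumptions: a single-thread executor stands in for the main looper, a CountDownLatch wait stands in for runBlocking, and a scheduled callback stands in for delay(1). All names are made up for illustration, and the blocked wait gives up after 500 ms so the demo terminates instead of hanging.

```kotlin
import java.util.concurrent.Callable
import java.util.concurrent.CountDownLatch
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit

// Returns true if the blocked "main" task is resumed by the timer in time.
// timerOnMain = true models the 1.6.0 behaviour described in this issue:
// the resume callback has to run on the (blocked) main thread.
fun blockedMainSeesTimer(timerOnMain: Boolean): Boolean {
    val main = Executors.newSingleThreadExecutor()            // stand-in for the main looper
    val timer = Executors.newSingleThreadScheduledExecutor()  // stand-in for a timer thread
    try {
        val resumed = CountDownLatch(1)
        val result = main.submit(Callable {
            // Schedule "resume after 1 ms", roughly what delay(1) asks for.
            timer.schedule(Runnable {
                if (timerOnMain) {
                    main.execute { resumed.countDown() } // queued behind the blocked task
                } else {
                    resumed.countDown()                  // runs directly on the timer thread
                }
            }, 1L, TimeUnit.MILLISECONDS)
            // Block the main thread, like runBlocking; time out instead of hanging forever.
            resumed.await(500, TimeUnit.MILLISECONDS)
        })
        return result.get()
    } finally {
        main.shutdownNow()
        timer.shutdownNow()
    }
}

fun main() {
    println(blockedMainSeesTimer(timerOnMain = false))
    println(blockedMainSeesTimer(timerOnMain = true))
}
```

With the callback on a separate thread the wait completes. With the callback posted back to the blocked thread it can only time out, which on a real main thread shows up as an ANR.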

qwwdfsad added a commit that referenced this issue Jan 11, 2022
The approach from 1.6.0 has proven itself as unstable and multiple hard-to-understand bugs have been reported:

* JavaFx timer doesn't really work outside the main thread
* The frequent initialization pattern "runBlocking { doSomethingThatMayCallDelay() }" used on the main thread during startup now silently deadlocks
* The latter issue was reported both by Android and internal JB Compose users
* The provided workaround with system property completely switches off the desired behaviour that e.g. Compose may rely on, potentially introducing new sources of invalid behaviour

The original benefits do not outweigh these pitfalls, so the decision is to revert these changes in the minor release

Fixes #3113
Fixes #3106

@raghav2945

We (Audiomack) recently changed how we initialize some of our native classes and started using coroutines. However, instead of reducing the number of ANRs, we now see new ANRs in the coroutine calls, specific to Android 11. Does this sound similar to what you are describing here? Code for reference:

init {
    CoroutineScope(Dispatchers.Main).launch {
        initiateCastContext()
    }
    CoroutineScope(Dispatchers.IO).launch {
        initExoPlayer()
    }
}

@raghav2945

@qwwdfsad, is this problem resolved? If not, is there a workaround? Please share any information that would help us keep the number of ANRs under control.

@qwwdfsad
Member

qwwdfsad commented Apr 4, 2022

Should be fixed in 1.6.1
