TestTimeSource/currentTime does not increase after switching to a non-delay skipping Dispatcher #4045

hfhbd · 2024-02-18T21:18:27Z

Describe the bug

The time source or the underlaying currentTime is not updated after switching the context to a non delay skipping dispatcher.

Our use-case: Ktor
We want to change Ktors public test framework testApplication from runBlocking to runTest, including its delay skipping behavior in tests, to align with the common test behavior of the well-known coroutines-test api.
But if the test contains a request to an external real server, Ktor executes the test on Dispatchers.Default/IO and uses a real-time clock to not mess up http connections, eg TLS handshakes.
After the request, the virtual time should be used again for user tests, but the virtual time isn't increased after the switch. This is confusing because in fact the time was increased.
It also does not work with custom user plugins testing with which could be part of the user tests, resulting into different times, see timeSource test as an example.

Provide a Reproducer

@Test
fun clock() = runTest {
  assertEquals(0, currentTime)
  delay(1.seconds)
  assertEquals(1000, currentTime)
        
  withContext(Dispatchers.IO) {
    delay(2.seconds)
  }
        
  assertEquals(3000, currentTime) // fails: expected 3000, actual 1000
  delay(1.seconds)
  assertEquals(4000, currentTime)
}

@Test
fun timeSource() = runTest {
  val timeSource = testScheduler.timeSource
  val now = timeSource.markNow()
  assertEquals(0.seconds, now.elapsedNow())
  delay(1.seconds)
  assertEquals(1.seconds, now.elapsedNow())

  val time = measureTime {
  withContext(Dispatchers.IO) {
    delay(2.seconds)
    }
  }
  assertTrue(time in (2.seconds)..(2.1.seconds))

  assertEquals(1.seconds + time, now.elapsedNow()) // fails: expected 3s, actual 1s
  delay(1.seconds)
  assertEquals(2.seconds + time, now.elapsedNow())
}

The text was updated successfully, but these errors were encountered:

dkhalanskyjb · 2024-02-19T10:23:55Z

It's impossible to support the test clock() in a conceptually clear manner.

Imagine we have this function:

suspend fun loadPage() = withContext(Dispatchers.IO) {
  withTimeout(2.seconds) {
    getPageFromServer()
  }
}

and this test:

fun testConnectingToNonresponsiveServer() = runTest {
  assertEquals(0, currentTime)
  assertFailsWith<TimeoutCancellationException> {
    loadPage()
  }
  assertEquals(2000, currentTime)
}

The test works. Then you change the internal implementation (that the test shouldn't know about) to something like

suspend fun loadPage() = suspendCancellableCoroutine { cont ->
  FancyJavaHttpClient
    .getPageFromServer()
    .timingOutAfter(2_000)
    .onSuccess { cont.resume(it) }
    .onError { cont.resumeWithException(it) }
    .onTimeout { cont.resumeWithException(TimeoutCancellationException() }
}

From the point of view of the test framework, these implementations should behave the same way: since you haven't mocked anything to allow the test framework to intrusively depend on your implementation, the loadPage function should be a black box that (in this case) throws a timeout cancellation exception after two seconds. The implementation didn't change, but because now the delay is performed using not delay but some other source of real-world time, the virtual time still won't advance. The observable behavior didn't change, but the test failed.

What is less impossible is to support the timeSource() test. Basically, we can measure how long it took for a process to complete and then advance the virtual time by how much has actually passed. This way, the functions that the test framework knows nothing about that use the actual real-world time will behave consistently in tests regardless of how they were implemented.

This solution is also not without its downsides and strange moments, though. For example:

fun test() = runTest {
  launch(Dispatchers.IO) {
    delay(1.seconds)
    println("Coroutine 1 completed")
  }
  launch {
    delay(100.seconds)
    println("Coroutine 2 completed")
  }
  launch {
    withContext(Dispatchers.IO) {
      delay(100.milliseconds)
    }
    delay(5.seconds)
    println("Coroutine 2 completed")
  }
}

This will print 2, 3, 1. Since the test framework has no way of knowing what's going on in coroutine 1 (as discussed above), it can't predict that its code would execute faster than that in coroutine 2 if all delays were real. Therefore, it has no choice but to skip the delay in coroutine 2: what if coroutine 1 depends on the result of coroutine 2 internally somehow?

This could be mitigated with #3179: when we introduce the option to avoid delay-skipping, you'd be able to write the timeSource test like this:

@Test
fun timeSource() = runTest {
  val timeSource = testScheduler.timeSource
  val now = timeSource.markNow()
  assertEquals(0.seconds, now.elapsedNow())
  delay(1.seconds)
  assertEquals(1.seconds, now.elapsedNow())

  val time = measureTime {
    withContext(NoDelaySkipping) {
      yourBlackBoxThatFinishesInTwoSeconds()
    }
  }
  assertTrue(time in (2.seconds)..(2.1.seconds))

  assertEquals(1.seconds + time, now.elapsedNow()) // fails: expected 3s, actual 1s
  delay(1.seconds)
  assertEquals(2.seconds + time, now.elapsedNow())
}

NoDelaySkipping would instruct the test framework to treat the delays as real-time ones during this block's execution.

hfhbd · 2024-02-21T19:10:00Z

Thank you for the detailed response and the example, I got it.

I think, the solution NoDelaySkipping sounds perfect, if it will be available in coroutines-core to opt-out in production code and not be part of coroutines-test.
Feel free to close this issue.

dkhalanskyjb · 2024-02-22T10:27:31Z

if it will be available in coroutines-core to opt-out in production code and not be part of coroutines-test.

That wasn't the plan. The context element can't magically signal all test schedulers, "Hey, a coroutine is using me at the moment! Don't advance time while it's doing it!" Coroutine context elements are completely passive entities. Therefore, the test framework must itself see the context element to make it count. If there's a withContext(NoDelaySkipping) deep inside your production code running on arbitrary dispatchers, the test framework can't see that. Black box, see?

Also, the recommended overall approach to making code testable is to support injecting new context elements.

hfhbd added the bug label Feb 18, 2024

dkhalanskyjb added test and removed bug labels Feb 19, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

TestTimeSource/currentTime does not increase after switching to a non-delay skipping Dispatcher #4045

TestTimeSource/currentTime does not increase after switching to a non-delay skipping Dispatcher #4045

hfhbd commented Feb 18, 2024

dkhalanskyjb commented Feb 19, 2024 •

edited

hfhbd commented Feb 21, 2024

dkhalanskyjb commented Feb 22, 2024

TestTimeSource/currentTime does not increase after switching to a non-delay skipping Dispatcher #4045

TestTimeSource/currentTime does not increase after switching to a non-delay skipping Dispatcher #4045

Comments

hfhbd commented Feb 18, 2024

dkhalanskyjb commented Feb 19, 2024 • edited

hfhbd commented Feb 21, 2024

dkhalanskyjb commented Feb 22, 2024

dkhalanskyjb commented Feb 19, 2024 •

edited