check_new_key timed out! & set_max_wait #68

Closed
Licenser opened this issue Jun 27, 2022 · 9 comments

@Licenser
Contributor

Hi,

With 0.7, serial_test started to constantly panic our tests with `check_new_key timed out!`. This threw us off a bit, as it's not documented what that message means, but digging through the code it seems that having many (and partially slow) tests using serial_test causes this, since waiting tests now panic after 60s by default. set_max_wait is a way to resolve this, but its usage isn't really documented either, and it seems the only way to reliably enforce it is to call it in every serial test? Is there a better way? Could the timeout be disabled via a feature flag, for example, or passed in through another manner?

@palfrey
Owner

palfrey commented Jun 27, 2022

Something like that could be done. Can you link to the build logs where this fails?

@palfrey palfrey self-assigned this Jun 27, 2022
@Licenser
Contributor Author

Heya,
wow thanks for the quick reply!

I can't provide a build log since in CI it's a non-issue (our suspicion is that the low core count in CI mitigates the problem by just not spawning many tests at the same time) but https://github.com/tremor-rs/tremor-runtime/tree/b5428f01408c12a551d9df7bc6c6338449f85a38 is the commit that triggers it when run with enough cores.

https://github.com/tremor-rs/tremor-runtime/tree/serial-test is the branch with added set_max_wait

The timeout seems to hit an interesting edge case where more cores make the tests look slower, since they all started earlier, which clashes with the way waiting / timeouts are implemented (measured from spawn, not from when the last test finished).
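The effect described above can be illustrated with a minimal sketch (this is not serial_test's actual code; `wait_for_lock` and its behaviour are hypothetical stand-ins for the poll loop): the deadline is fixed at the moment waiting begins, so a waiter can time out even though the resource would have been free shortly afterwards.

```rust
use std::sync::Mutex;
use std::thread;
use std::time::{Duration, Instant};

// Hypothetical miniature of a poll-based serial lock: the waiter's
// deadline is measured from when it *starts waiting*, not from when the
// previous test released the lock.
fn wait_for_lock(lock: &Mutex<()>, max_wait: Duration) -> bool {
    let deadline = Instant::now() + max_wait;
    loop {
        if let Ok(guard) = lock.try_lock() {
            drop(guard); // acquired: the "test" could now run
            return true;
        }
        if Instant::now() >= deadline {
            return false; // the "timed out!" panic path
        }
        thread::sleep(Duration::from_millis(5));
    }
}

fn main() {
    let lock = Mutex::new(());
    let guard = lock.lock().unwrap(); // a slow "test" currently holds the lock
    // A waiter with a 100 ms budget measured from *now* times out, even
    // though the lock would be released as soon as the holder finishes.
    let acquired = wait_for_lock(&lock, Duration::from_millis(100));
    println!("acquired = {acquired}"); // prints "acquired = false"
    drop(guard);
}
```

With many queued tests, the tail of the queue spends its whole budget just waiting its turn, which is why more cores (and therefore more simultaneously spawned waiters) make the problem worse.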

I'm not sure about the best solution; I've had three thoughts:

  1. make the timeout an attribute, like #[serial(feature, timeout="600")]; that would put it right where it's needed and would even get rid of the mutex, as it could be test specific (and differ between tests)
  2. feature flags for different timeouts (a bit silly, but something like timeout-60, timeout-120, timeout-240 ...); feels odd, but would be very convenient in our case, which is bad reasoning 😂
  3. change the waiting from a poll loop to some kind of serialisation, so waiting isn't measured "since the start" but "since the last successful test". I could see this being done with a Vec where all tests register a channel, and when the current test finishes it sends a message to the next test to put it into waiting mode. A lot more work, but it feels like it would reflect a timeout best: it becomes the timeout for this test, not the timeout for all tests.

Personally I'd love a combination of 1 & 3. Since 3 is a bit of work, I can offer to take a stab at it and see if it's doable, if you're open to a PR like that?
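Idea 3 above can be sketched in pure std Rust (all names here are hypothetical, not serial_test's API): queued "tests" form a chain of channels, each one blocks until its predecessor hands over, and the hand-off message carries the predecessor's finish time, so any timeout check can be made against "time since the last test finished" instead of "time since spawn".

```rust
use std::sync::mpsc;
use std::thread;
use std::time::{Duration, Instant};

// Hypothetical sketch of the chained-waiting idea: test i+1 is woken by
// test i, and the reported wait is measured from test i's completion.
fn chain_tests(tests: Vec<Box<dyn FnOnce() + Send>>) -> Vec<Duration> {
    let n = tests.len();
    let mut senders = Vec::with_capacity(n);
    let mut receivers = Vec::with_capacity(n);
    for _ in 0..n {
        let (tx, rx) = mpsc::channel::<Instant>();
        senders.push(tx);
        receivers.push(rx);
    }
    senders[0].send(Instant::now()).unwrap(); // the first test starts right away
    let mut handles = Vec::new();
    for (i, (test, rx)) in tests.into_iter().zip(receivers).enumerate() {
        let next = senders.get(i + 1).cloned();
        handles.push(thread::spawn(move || {
            let predecessor_done = rx.recv().unwrap();
            // This is what a per-test timeout would be checked against:
            // the wait *since the previous test finished*, which stays
            // small no matter how long the whole queue takes.
            let waited = predecessor_done.elapsed();
            test();
            if let Some(tx) = next {
                let _ = tx.send(Instant::now()); // hand off to the next test
            }
            waited
        }));
    }
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    // Three 100 ms tests: the queue takes ~300 ms overall, but each
    // per-test wait is only the hand-off latency.
    let tests: Vec<Box<dyn FnOnce() + Send>> = (0..3)
        .map(|_| {
            Box::new(|| thread::sleep(Duration::from_millis(100)))
                as Box<dyn FnOnce() + Send>
        })
        .collect();
    for (i, waited) in chain_tests(tests).into_iter().enumerate() {
        println!("test {i} waited {waited:?} since the previous test finished");
    }
}
```

Under a from-spawn deadline of, say, 200 ms, the third test in this example would have timed out; with the chained hand-off its measured wait is only milliseconds.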

@palfrey
Owner

palfrey commented Jun 28, 2022

I was idly considering that the easy option for feature flags would just be a "disable timeouts entirely" flag. It's one of those features where, provided your tests (and this library) are all working fine, it's not really needed TBH.

However, yeah, the whole thing is a bit half-baked right now. I'm not fond of 2, but if you had patches for either just 1 or 1+3, or a "disable timeouts" flag, I'd be interested in those!

@Licenser
Contributor Author

👍 Perfect. Yeah, 2 is terrible. I'll stagger a few PRs (so the wonderful world of distractions doesn't interrupt something useful): first a feature flag, then attributes, and then see if chaining them is sensible :)

@palfrey
Owner

palfrey commented Aug 6, 2022

1 is now done with #71 - is there more you need here, or can I close this out?

@Licenser
Contributor Author

Licenser commented Aug 8, 2022

I think we're good :). It would be nice to have waiting chained instead of measured "from start", but I've not had a good idea how to do that yet. I could put that in a new ticket to track it and close this?

@palfrey
Owner

palfrey commented Aug 8, 2022

> I could put that in a new ticket to track it and close this?

Sounds good to me.

@Licenser
Contributor Author

Licenser commented Aug 8, 2022

👍 Will open a new issue for the chained timeouts.
