check_new_key timed out! & set_max_wait #68

Closed
Licenser opened this issue Jun 27, 2022 · 9 comments

@Licenser
Contributor

Hi,

With 0.7, serial_test started to constantly panic our tests with `check_new_key timed out!`. This threw us off a bit, as it's not documented what that message means, but digging through the code it seems that having many (and partially slow) tests using serial_test causes this, since waiting tests now panic after 60s by default. set_max_wait is a way to resolve this, but its usage isn't really documented either, and it seems the only way to reliably enforce it is to call it in every serial test? Is there a better way? Could the timeout be disabled via a feature flag, for example, or passed in through another manner?

@palfrey
Owner

palfrey commented Jun 27, 2022

Something like that could be done. Can you link to the build logs where this fails?

@palfrey palfrey self-assigned this Jun 27, 2022
@Licenser
Contributor Author

Heya,
wow thanks for the quick reply!

I can't provide a build log since in CI it's a non-issue (our suspicion is that the low core count in CI mitigates the problem by just not spawning many tests at the same time) but https://github.com/tremor-rs/tremor-runtime/tree/b5428f01408c12a551d9df7bc6c6338449f85a38 is the commit that triggers it when run with enough cores.

https://github.com/tremor-rs/tremor-runtime/tree/serial-test is the branch with added set_max_wait

The timeout seems to hit an interesting edge case where more cores make the tests look slower, since they all started earlier, which clashes with the way waiting / timeouts are implemented (measured from spawn, not from when the last test finished).
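The effect described above can be illustrated with a minimal sketch (this is not serial_test's actual code; `wait_for_lock` and its behaviour are hypothetical stand-ins for the poll loop): the deadline is fixed at the moment waiting begins, so a waiter can time out even though the resource would have been free shortly afterwards.

```rust
use std::sync::Mutex;
use std::thread;
use std::time::{Duration, Instant};

// Hypothetical miniature of a poll-based serial lock: the waiter's
// deadline is measured from when it *starts waiting*, not from when the
// previous test released the lock.
fn wait_for_lock(lock: &Mutex<()>, max_wait: Duration) -> bool {
    let deadline = Instant::now() + max_wait;
    loop {
        if let Ok(guard) = lock.try_lock() {
            drop(guard); // acquired: the "test" could now run
            return true;
        }
        if Instant::now() >= deadline {
            return false; // the "timed out!" panic path
        }
        thread::sleep(Duration::from_millis(5));
    }
}

fn main() {
    let lock = Mutex::new(());
    let guard = lock.lock().unwrap(); // a slow "test" currently holds the lock
    // A waiter with a 100 ms budget measured from *now* times out, even
    // though the lock would be released as soon as the holder finishes.
    let acquired = wait_for_lock(&lock, Duration::from_millis(100));
    println!("acquired = {acquired}"); // prints "acquired = false"
    drop(guard);
}
```

With many queued tests, the tail of the queue spends its whole budget just waiting its turn, which is why more cores (and therefore more simultaneously spawned waiters) make the problem worse.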

I'm not sure about the best solution; I've had three thoughts:

  1. make the timeout an attribute, like #[serial(feature, timeout="600")]; that would put it right where it's needed and would even get rid of the mutex, as it could be test specific (and differ between tests)
  2. feature flags for different timeouts (a bit silly, but something like timeout-60, timeout-120, timeout-240 ...); feels odd, but would be very convenient in our case, which is bad reasoning 😂
  3. change the waiting from a poll loop to some kind of serialisation, so waiting isn't measured "since the start" but "since the last successful test". I could see this being done with a Vec where all tests register a channel, and when the current test finishes it sends a message to the next test to put it into waiting mode. A lot more work, but it feels like it would reflect a timeout best: it becomes the timeout for this test, not the timeout for all tests.

Personally I'd love a combination of 1 & 3. Since 3 is a bit of work, I can offer to take a stab at it and see if it's doable, if you're open to a PR like that?
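Idea 3 above can be sketched in pure std Rust (all names here are hypothetical, not serial_test's API): queued "tests" form a chain of channels, each one blocks until its predecessor hands over, and the hand-off message carries the predecessor's finish time, so any timeout check can be made against "time since the last test finished" instead of "time since spawn".

```rust
use std::sync::mpsc;
use std::thread;
use std::time::{Duration, Instant};

// Hypothetical sketch of the chained-waiting idea: test i+1 is woken by
// test i, and the reported wait is measured from test i's completion.
fn chain_tests(tests: Vec<Box<dyn FnOnce() + Send>>) -> Vec<Duration> {
    let n = tests.len();
    let mut senders = Vec::with_capacity(n);
    let mut receivers = Vec::with_capacity(n);
    for _ in 0..n {
        let (tx, rx) = mpsc::channel::<Instant>();
        senders.push(tx);
        receivers.push(rx);
    }
    senders[0].send(Instant::now()).unwrap(); // the first test starts right away
    let mut handles = Vec::new();
    for (i, (test, rx)) in tests.into_iter().zip(receivers).enumerate() {
        let next = senders.get(i + 1).cloned();
        handles.push(thread::spawn(move || {
            let predecessor_done = rx.recv().unwrap();
            // This is what a per-test timeout would be checked against:
            // the wait *since the previous test finished*, which stays
            // small no matter how long the whole queue takes.
            let waited = predecessor_done.elapsed();
            test();
            if let Some(tx) = next {
                let _ = tx.send(Instant::now()); // hand off to the next test
            }
            waited
        }));
    }
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    // Three 100 ms tests: the queue takes ~300 ms overall, but each
    // per-test wait is only the hand-off latency.
    let tests: Vec<Box<dyn FnOnce() + Send>> = (0..3)
        .map(|_| {
            Box::new(|| thread::sleep(Duration::from_millis(100)))
                as Box<dyn FnOnce() + Send>
        })
        .collect();
    for (i, waited) in chain_tests(tests).into_iter().enumerate() {
        println!("test {i} waited {waited:?} since the previous test finished");
    }
}
```

Under a from-spawn deadline of, say, 200 ms, the third test in this example would have timed out; with the chained hand-off its measured wait is only milliseconds.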

@palfrey
Owner

palfrey commented Jun 28, 2022

I was idly considering that the easy option for feature flags would just be a "disable timeouts entirely" flag. It's one of those features where, provided your tests (and this library) are all working fine, it's not really needed TBH.

However, yeah, the whole thing is a bit half-baked right now. I'm not fond of 2, but if you had patches for either just 1 or 1+3, or a "disable timeouts" flag, I'd be interested in those!

@Licenser
Contributor Author

👍 Perfect. Yeah, 2 is terrible. I'll stagger a few PRs (so the wonderful world of distractions doesn't interrupt something useful): first a feature flag, then attributes, and then see if chaining them is sensible :)

@palfrey
Owner

palfrey commented Aug 6, 2022

1 is now done with #71 - is there more you need here, or can I close this out?

@Licenser
Contributor Author

Licenser commented Aug 8, 2022

I think we're good :). It would be nice to have waiting chained instead of measured "from start", but I've not had a good idea how to do that yet. I could put that in a new ticket to track it and close this?

@palfrey
Owner

palfrey commented Aug 8, 2022

> I could put that in a new ticket to track it and close this?

Sounds good to me.

@Licenser
Contributor Author

Licenser commented Aug 8, 2022

👍 Will open a new issue for the chained timeouts.
