EventEngine Test Suite: Timers #27496
Watch out for backwards references here... there are build systems that will likely choke on this. We may be better off with the following layout:
eventengine-tests:
specific-event-engine-test:
Thanks for the heads up about linker backreference issues. I don't think it's an issue in this case, since implementation-specific test suite targets will always have a one-way dependency on the
std::atomic<int> call_count{0};
std::atomic<int> failed_timer_count{0};
absl::BitGen bitgen;
for (int i = 0; i < thread_count; ++i) {
s/timer_count/thread_count through here...
We probably want 100 threads creating 100 timers, for 10000 timers scheduled total. (i.e. 100 threads each run this loop).
This will help catch any threading issues with scheduling new timers too.
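A rough sketch of that shape, for illustration only: many threads each schedule a batch of timers, and every callback checks that it was not run before its deadline. `ToyTimerEngine`, `StressResult` and `RunSchedulingStress` are hypothetical names invented for this sketch; the toy engine is a single-worker stand-in, not the EventEngine API or any gRPC implementation.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <map>
#include <mutex>
#include <random>
#include <thread>
#include <utility>
#include <vector>

using Clock = std::chrono::steady_clock;

// Hypothetical stand-in for an EventEngine (not gRPC's implementation):
// a single worker thread drains a deadline-ordered queue.
class ToyTimerEngine {
 public:
  ToyTimerEngine() : worker_([this] { Loop(); }) {}
  // Fires every pending timer at its deadline, then stops the worker.
  ~ToyTimerEngine() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      shutdown_ = true;
    }
    cv_.notify_one();
    worker_.join();
  }
  // Safe to call concurrently from many threads.
  void RunAt(Clock::time_point when, std::function<void()> cb) {
    {
      std::lock_guard<std::mutex> lock(mu_);
      timers_.emplace(when, std::move(cb));
    }
    cv_.notify_one();
  }

 private:
  void Loop() {
    std::unique_lock<std::mutex> lock(mu_);
    while (true) {
      if (timers_.empty()) {
        if (shutdown_) return;
        cv_.wait(lock);
      } else if (timers_.begin()->first <= Clock::now()) {
        auto cb = std::move(timers_.begin()->second);
        timers_.erase(timers_.begin());
        lock.unlock();
        cb();  // run the callback without holding the lock
        lock.lock();
      } else {
        const Clock::time_point next = timers_.begin()->first;
        cv_.wait_until(lock, next);
      }
    }
  }
  std::mutex mu_;
  std::condition_variable cv_;
  std::multimap<Clock::time_point, std::function<void()>> timers_;
  bool shutdown_ = false;
  std::thread worker_;  // declared last so the other members exist first
};

struct StressResult {
  int calls;  // callbacks that ran
  int early;  // callbacks that ran before their deadline
};

// Each of thread_count threads schedules timers_per_thread timers.
StressResult RunSchedulingStress(int thread_count, int timers_per_thread) {
  assert(thread_count > 0 && timers_per_thread > 0);
  std::atomic<int> call_count{0};
  std::atomic<int> early_count{0};
  {
    ToyTimerEngine engine;
    std::vector<std::thread> threads;
    for (int t = 0; t < thread_count; ++t) {
      threads.emplace_back([&engine, &call_count, &early_count, t,
                            timers_per_thread] {
        std::mt19937 rng(t);
        std::uniform_int_distribution<int> delay_ms(0, 20);
        for (int i = 0; i < timers_per_thread; ++i) {
          const auto when =
              Clock::now() + std::chrono::milliseconds(delay_ms(rng));
          engine.RunAt(when, [&call_count, &early_count, when] {
            if (Clock::now() < when) ++early_count;
            ++call_count;
          });
        }
      });
    }
    for (auto& th : threads) th.join();
  }  // the engine destructor waits for all pending timers to fire
  return {call_count.load(), early_count.load()};
}
```

Calling `RunSchedulingStress(100, 100)` gives the 10000-timer configuration suggested above. Because the schedulers only join and the engine drains on destruction, the sketch needs no wall-clock timeout of its own.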
I've added the 10k RunAt calls (split across 100 threads). As a result, I decided to delete the SimpleEventEngine example implementation rather than improve it, because TSAN was not able to deal with 10k threads in that naive implementation. The LibuvEventEngine PR will exercise this suite, and at the very least, the SimpleEventEngine implementation was helpful to verify the suite up to this point.
The simple impl is not thread safe, so needs to be improved or deleted.
This reverts commit eabd318.
++count;
cv_.Signal();
});
engine->RunAt(absl::Now(), [&]() {
This could be a source of flakiness: if we get pre-empted on line 88 for 1 second or more then this test will fail.
Noted on line 76. I considered removing this test, since it does require some timeout hackery which is inherently brittle, and could take a lot of time if it's too conservatively configured. We're already testing the timer's scheduling accuracy in StressTestTimersNotCalledBeforeScheduled, and ordering is not guaranteed if both timers come due at the same time (presuming some delay in clock evaluation). I'm thinking again maybe we remove this. WDYT?
ASSERT_FALSE(signaled_);
}

TEST_F(EventEngineTimerTest, TimersRespectScheduleOrdering) {
This would make a very good stress test too: have 100 threads do this for 100 different vectors (and 100 different mutexes), assert all vectors are ordered at the end.
(this could be later work)
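One way that stress variant could look, as a sketch under stated assumptions: each thread owns one vector and one mutex, schedules its timers with strictly increasing deadlines, and the vectors are checked for sortedness once everything has fired. `ToyTimerEngine` and `RunOrderingStress` are hypothetical names; the toy engine is a single-worker stand-in, not the EventEngine API. Scheduling in increasing deadline order means a timer can never come due before an earlier one from the same thread has been queued, so the check does not depend on a wall-clock timeout.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <functional>
#include <map>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

using Clock = std::chrono::steady_clock;

// Hypothetical stand-in for an EventEngine (not gRPC's implementation):
// a single worker thread runs callbacks in deadline order.
class ToyTimerEngine {
 public:
  ToyTimerEngine() : worker_([this] { Loop(); }) {}
  // Fires every pending timer at its deadline, then stops the worker.
  ~ToyTimerEngine() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      shutdown_ = true;
    }
    cv_.notify_one();
    worker_.join();
  }
  void RunAt(Clock::time_point when, std::function<void()> cb) {
    {
      std::lock_guard<std::mutex> lock(mu_);
      timers_.emplace(when, std::move(cb));
    }
    cv_.notify_one();
  }

 private:
  void Loop() {
    std::unique_lock<std::mutex> lock(mu_);
    while (true) {
      if (timers_.empty()) {
        if (shutdown_) return;
        cv_.wait(lock);
      } else if (timers_.begin()->first <= Clock::now()) {
        auto cb = std::move(timers_.begin()->second);
        timers_.erase(timers_.begin());
        lock.unlock();
        cb();
        lock.lock();
      } else {
        const Clock::time_point next = timers_.begin()->first;
        cv_.wait_until(lock, next);
      }
    }
  }
  std::mutex mu_;
  std::condition_variable cv_;
  std::multimap<Clock::time_point, std::function<void()>> timers_;
  bool shutdown_ = false;
  std::thread worker_;  // declared last so the other members exist first
};

// Returns true if every thread's callbacks ran in deadline order.
bool RunOrderingStress(int thread_count, int timers_per_thread) {
  assert(thread_count > 0 && timers_per_thread > 0);
  std::vector<std::vector<int>> fired(thread_count);
  // One mutex per vector: a real engine may run callbacks on many threads.
  std::vector<std::mutex> mus(thread_count);
  {
    ToyTimerEngine engine;
    std::vector<std::thread> threads;
    for (int t = 0; t < thread_count; ++t) {
      threads.emplace_back([&engine, &fired, &mus, t, timers_per_thread] {
        const auto base = Clock::now();
        for (int i = 0; i < timers_per_thread; ++i) {
          // Deadlines strictly increase with i within one thread.
          engine.RunAt(base + std::chrono::microseconds(100 * i),
                       [&fired, &mus, t, i] {
                         std::lock_guard<std::mutex> lock(mus[t]);
                         fired[t].push_back(i);
                       });
        }
      });
    }
    for (auto& th : threads) th.join();
  }  // the engine destructor waits for all pending timers to fire
  for (const auto& v : fired) {
    if (static_cast<int>(v.size()) != timers_per_thread) return false;
    if (!std::is_sorted(v.begin(), v.end())) return false;
  }
  return true;
}
```

`RunOrderingStress(100, 100)` matches the 100 threads, 100 vectors and 100 mutexes described above.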
ASSERT_TRUE(engine->Cancel(handle));
}

TEST_F(EventEngineTimerTest, CancelledCallbackIsNotExecuted) {
This would also make an excellent stress test
100 threads schedule infinite future callbacks, and then in phase 2 100 different threads cancel them.
(again, future work suggestion)
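A minimal sketch of that two-phase idea, with assumptions called out: phase 1 threads schedule callbacks an hour out (standing in for "infinite future"), phase 2 uses a fresh set of threads to cancel them, and the result counts successful cancellations and any callback that ran anyway. `ToyTimerEngine`, its `Handle` type, `CancelResult` and `RunCancelStress` are hypothetical names for this sketch; only the idea that `Cancel` returns a success flag comes from the test code quoted above.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <functional>
#include <map>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

using Clock = std::chrono::steady_clock;

// Hypothetical stand-in for an EventEngine (not gRPC's implementation).
class ToyTimerEngine {
 public:
  // A handle is the (deadline, unique id) key of the queued timer.
  using Handle = std::pair<Clock::time_point, std::uint64_t>;

  ToyTimerEngine() : worker_([this] { Loop(); }) {}
  // Stops the worker; timers still pending are dropped, not run.
  ~ToyTimerEngine() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      shutdown_ = true;
    }
    cv_.notify_one();
    worker_.join();
  }
  Handle RunAt(Clock::time_point when, std::function<void()> cb) {
    std::lock_guard<std::mutex> lock(mu_);
    Handle handle{when, next_id_++};
    timers_.emplace(handle, std::move(cb));
    cv_.notify_one();
    return handle;
  }
  // Returns true only if the timer was still pending and is now removed.
  bool Cancel(const Handle& handle) {
    std::lock_guard<std::mutex> lock(mu_);
    return timers_.erase(handle) == 1;
  }

 private:
  void Loop() {
    std::unique_lock<std::mutex> lock(mu_);
    while (!shutdown_) {
      if (timers_.empty()) {
        cv_.wait(lock);
      } else if (timers_.begin()->first.first <= Clock::now()) {
        auto cb = std::move(timers_.begin()->second);
        timers_.erase(timers_.begin());
        lock.unlock();
        cb();
        lock.lock();
      } else {
        const Clock::time_point next = timers_.begin()->first.first;
        cv_.wait_until(lock, next);
      }
    }
  }
  std::mutex mu_;
  std::condition_variable cv_;
  std::map<Handle, std::function<void()>> timers_;
  std::uint64_t next_id_ = 0;
  bool shutdown_ = false;
  std::thread worker_;  // declared last so the other members exist first
};

struct CancelResult {
  int cancelled;  // Cancel() calls that returned true
  int executed;   // callbacks that ran despite being cancelled
};

CancelResult RunCancelStress(int thread_count, int timers_per_thread) {
  assert(thread_count > 0 && timers_per_thread > 0);
  std::atomic<int> executed{0};
  std::atomic<int> cancelled{0};
  std::vector<std::vector<ToyTimerEngine::Handle>> handles(thread_count);
  ToyTimerEngine engine;
  const auto far_future = Clock::now() + std::chrono::hours(1);

  // Phase 1: each thread schedules far-future callbacks, keeping handles.
  std::vector<std::thread> schedulers;
  for (int t = 0; t < thread_count; ++t) {
    schedulers.emplace_back([&, t] {
      for (int i = 0; i < timers_per_thread; ++i) {
        handles[t].push_back(
            engine.RunAt(far_future, [&executed] { ++executed; }));
      }
    });
  }
  for (auto& th : schedulers) th.join();

  // Phase 2: a different set of threads cancels everything.
  std::vector<std::thread> cancellers;
  for (int t = 0; t < thread_count; ++t) {
    cancellers.emplace_back([&, t] {
      for (const auto& handle : handles[t]) {
        if (engine.Cancel(handle)) ++cancelled;
      }
    });
  }
  for (auto& th : cancellers) th.join();
  return {cancelled.load(), executed.load()};
}
```

With `RunCancelStress(100, 100)`, a correct engine should report 10000 cancellations and zero executions.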
some future work suggestions... I expect we should do something about TimersRespectScheduleOrdering potential flakiness, but I'm not sure it should block submission either.
…ess test" This reverts commit af13a2a.
(import notes) All tests had previously passed except for the ruby tests, which are failing due to an RVM bug. #27538
A reusable test suite for EventEngine implementations.
To exercise a custom EventEngine, simply link against :event_engine_test_suite and provide a testing main function that sets a custom EventEngine factory. See the :simple_event_engine_test blaze target for an example.
@Vignesh2208 @dennycd
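The original snippet from the description did not survive on this page, so the following is only a self-contained model of the factory pattern it describes, not the suite's actual code. Every type and function here (`EventEngine`, `EngineFactory`, `GlobalFactory`, `SetEventEngineFactory`, `CreateEngineForTest`, `MyEventEngine`, `InstallCustomEngineFactory`) is a hypothetical stand-in defined locally; the real interface and hook live in gRPC and may differ.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for the engine interface under test.
class EventEngine {
 public:
  virtual ~EventEngine() = default;
  virtual std::string Name() const = 0;
};

using EngineFactory = std::function<std::unique_ptr<EventEngine>()>;

// Storage for the factory that the test fixture consults.
EngineFactory& GlobalFactory() {
  static EngineFactory factory;
  return factory;
}

// Assumed hook: a testing main() installs the factory once, up front.
void SetEventEngineFactory(EngineFactory factory) {
  GlobalFactory() = std::move(factory);
}

// What a test fixture would call to obtain a fresh engine per test.
std::unique_ptr<EventEngine> CreateEngineForTest() {
  assert(GlobalFactory() != nullptr);
  return GlobalFactory()();
}

// The custom implementation being exercised by the suite.
class MyEventEngine : public EventEngine {
 public:
  std::string Name() const override { return "MyEventEngine"; }
};

// The part a testing main() would contribute. A real binary would also
// call testing::InitGoogleTest(&argc, argv) before this and return
// RUN_ALL_TESTS() after it.
void InstallCustomEngineFactory() {
  SetEventEngineFactory([] { return std::make_unique<MyEventEngine>(); });
}
```

Keeping the factory behind a single setter is what lets one shared suite target run against any implementation: the suite only ever sees the base interface, and the dependency points one way, from the implementation-specific test binary to the suite.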