
Attempt to deflake TestFnCacheSanity #10250

Merged 1 commit into master on Feb 15, 2022
Conversation

@zmb3 (Collaborator) commented Feb 9, 2022

Convert approxReads to an integer (by truncating) before comparing
to actual reads.

This should prevent failures where, due to our approximation, we estimate
a fractional number of reads that exceeds our tolerance of 1.

Sample error: Max difference between 10.461059975 and 9 allowed is 1, but difference was 1.4610599749999995

Updates #9492

@espadolini (Contributor)

Isn't this effectively just bumping the tolerance to 2?

@zmb3 (Collaborator, Author) commented Feb 9, 2022

> Isn't this effectively just bumping the tolerance to 2?

In a way, yes. I've never seen it fail with anything larger than 1.4-1.5, so I think of it not so much as increasing the tolerance as making sure our "approximation" is a whole number.

We could even round instead of truncate to be a bit more accurate, but this isn't an exact science; it's an approximation that is bound to be flaky.

@espadolini (Contributor)

We could just increase the delta in InDelta; that way we'd still keep the full-precision info when the difference exceeds the threshold, and we could bump the tolerance to 1.5 or 1.75 without being limited to integer steps.
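The alternative suggested here, sketched with the sample numbers from the failure above (the inDelta helper is a hypothetical stand-in for testify's require.InDelta check, which takes the tolerance as its final argument):

```go
package main

import (
	"fmt"
	"math"
)

// inDelta mirrors the comparison behind testify's require.InDelta:
// it reports whether |expected - actual| <= delta.
func inDelta(expected, actual, delta float64) bool {
	return math.Abs(expected-actual) <= delta
}

func main() {
	approxReads := 10.461059975 // full-precision estimate from the failing run
	actualReads := 9.0

	// Keep the estimate as-is and widen the tolerance instead:
	// the 1.461... difference fails at delta=1 but passes at 1.5 and 1.75.
	for _, delta := range []float64{1, 1.5, 1.75} {
		fmt.Printf("delta=%.2f: within=%v\n", delta, inDelta(approxReads, actualReads, delta))
	}
}
```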

@rosstimothy (Contributor)

One thing to note about the sudden flakiness of this test: I changed the FnCache to use a clockwork.Clock in 5279edf instead of calling time.Now directly, as part of #9958.

@jakule (Contributor) commented Feb 10, 2022

I agree with @espadolini on that. I think that increasing the tolerance is more explicit than casting to int.

> We could even round instead of truncate to be a bit more accurate, but this isn't an exact science; it's an approximation that is bound to be flaky.

Maybe I misunderstood your comment, but casting or rounding doesn't really change the value. Casting "removes" the fractional part of the number, but that doesn't change the nature of the problem: a value that is too large will still fail the test, and that value depends on execution time.

If we want to make that test more reliable, I think we should leverage the fact that we can now use a clockwork.Clock to control time and check whether the cache behaves as expected, instead of stress-testing the code.

@zmb3 (Collaborator, Author) commented Feb 11, 2022

My thought here was that the test is not flaky because it is sometimes too slow, but because a rounding error in our calculation of the "expected number of reads" sometimes pushes us slightly over the tolerance of 1, and there's no such thing as a fractional read.

In any case, I've updated the tolerance to 2.

@espadolini (Contributor) left a review comment

I've been running 6 instances of the lib/cache tests for about 15 minutes, together with adding t.Parallel() to every test, and I've had a single failure (because of a 2.2); there were plenty of failures when the threshold was 1.

@zmb3 (Collaborator, Author) commented Feb 15, 2022

@rosstimothy can you have a look?

Increase tolerance on expected reads.

This should prevent failures where, due to our approximation, we estimate
a fractional number of reads that exceeds our tolerance of 1.

Sample error: Max difference between 10.461059975 and 9 allowed is 1, but difference was 1.4610599749999995

Updates #9492
@zmb3 zmb3 merged commit 5f6cc76 into master Feb 15, 2022
@zmb3 zmb3 deleted the zmb3/deflake-fn-cache branch February 15, 2022 19:54