Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure done before checking success to help troubleshoot timeout issues #4833

Merged
merged 3 commits into from Oct 13, 2022

Conversation

breedx-splk
Copy link
Contributor

I think this can help with #4624.

It's a little delicate here, but in the case of a "bad" error code that we don't expect a retry for, we try and verify that it both failed and that the attempt was just 1. In the flaky example run linked in #4624, the attempt count was 0, which strongly suggests a race condition. That is, I think we were checking the count before the operation has completed.

I think what happened is that the .isSuccess() call fails both when the underlying call fails but also when the call has not completed. By putting an explicit .isDone() check after the .join() we can then tell if/when this happens again and then bump up the wait time. I suspect we just hit a cloud glitch and got stalled/lags, as happens sometimes. With this change, we should know.

@breedx-splk breedx-splk requested a review from a team as a code owner October 10, 2022 20:46
@codecov
Copy link

codecov bot commented Oct 10, 2022

Codecov Report

Base: 90.79% // Head: 90.79% // No change to project coverage 👍

Coverage data is based on head (f40f0b6) compared to base (d1a17d7).
Patch has no changes to coverable lines.

Additional details and impacted files
@@            Coverage Diff            @@
##               main    #4833   +/-   ##
=========================================
  Coverage     90.79%   90.79%           
  Complexity     4844     4844           
=========================================
  Files           555      555           
  Lines         14438    14438           
  Branches       1405     1405           
=========================================
  Hits          13109    13109           
  Misses          910      910           
  Partials        419      419           

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@jack-berg jack-berg merged commit 3f4698a into open-telemetry:main Oct 13, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants