issue# 307 fix: Failure detection by maximum retry fail doesnt take the exact value set in exchange.max-retry-count config parameter #308

Closed
wants to merge 1 commit into from


Contributor

@ahanapradhan ahanapradhan commented Mar 7, 2022

What type of PR is this?

/kind bug

What does this PR do / why do we need it:

For the max-retry based failure detection mechanism, the value of exchange.max-retry-count determines how many times a failed task is retried.
Once that many retries have failed, the query is failed.

The issue is that the number of retries that actually happen doesn't match the configured exchange.max-retry-count value.
If exchange.max-retry-count=20 is set, the retry happens 21, 23, 24, ... times: a seemingly random number close to 20, but never exactly 20.
When exchange.max-retry-count is not set, the default value of 10 is used, but the retry happens about 15 times.

Cause:
The failure count is modified by two synchronized methods: Backoff.failure() and Backoff.maxTried().
Two threads can call these two methods in parallel.
Unless reads and writes of the failure count are synchronized, the count cannot match the exact expected number when multiple threads are involved.
This fix uses synchronized getter and setter methods to read and update the failure count.
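
To illustrate the idea, here is a minimal sketch, not the actual Backoff code. RetryCounter, tryRecordFailure() and countGrantedRetries() are hypothetical names made up for this example. It assumes the limit check and the increment are done under the same monitor as the synchronized getter and setter, so several threads together are granted exactly the configured number of retries.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class RetryCountDemo
{
    // Hypothetical stand-in for the failure-count bookkeeping; NOT the real Backoff class.
    static final class RetryCounter
    {
        private final int maxRetryCount;
        private int failureCount;

        RetryCounter(int maxRetryCount)
        {
            this.maxRetryCount = maxRetryCount;
        }

        synchronized int getFailureCount()
        {
            return failureCount;
        }

        synchronized void setFailureCount(int failureCount)
        {
            this.failureCount = failureCount;
        }

        // The check and the increment share one monitor (synchronized is reentrant),
        // so no other thread can change failureCount between the read and the write.
        synchronized boolean tryRecordFailure()
        {
            if (getFailureCount() >= maxRetryCount) {
                return false; // retry budget exhausted
            }
            setFailureCount(getFailureCount() + 1);
            return true;
        }
    }

    // Lets several threads compete for retries and returns how many were granted in total.
    static int countGrantedRetries(int maxRetryCount, int threads)
    {
        RetryCounter counter = new RetryCounter(maxRetryCount);
        AtomicInteger granted = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                while (counter.tryRecordFailure()) {
                    granted.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return granted.get();
    }

    public static void main(String[] args)
    {
        // With exchange.max-retry-count=20 the total should be exactly 20.
        System.out.println("granted retries: " + countGrantedRetries(20, 8));
    }
}
```

Note that a synchronized getter and setter alone do not make a read-then-write sequence atomic. In this sketch the caller (tryRecordFailure) also holds the lock, which is what keeps the count exact.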

Which issue(s) this PR fixes:

Fixes #307
https://gitee.com/openlookeng/hetu-core/issues/I4WGE1

Special notes for your reviewers:

@ahanapradhan ahanapradhan changed the title issue# 307 fix [WIP] issue# 307 fix Mar 9, 2022
@ahanapradhan ahanapradhan changed the title [WIP] issue# 307 fix [WIP] issue# 307 fix: Failure detection by maximum retry fail doesnt take the exact value set in exchange.max-retry-count config parameter Mar 9, 2022
@ahanapradhan ahanapradhan changed the title [WIP] issue# 307 fix: Failure detection by maximum retry fail doesnt take the exact value set in exchange.max-retry-count config parameter issue# 307 fix: Failure detection by maximum retry fail doesnt take the exact value set in exchange.max-retry-count config parameter Mar 10, 2022
@Nitin-Kashyap
Contributor

lgtm

…alue set in exchange.max-retry-count config parameter
@sraghunandan

/sync

@sraghunandan

/sync

@it-is-a-robot
Contributor

@sraghunandan: This pr has been synchronized to the Gitee Repository

In response to this:

/sync

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the opensourceways/test-infra repository.

@Nitin-Kashyap
Contributor

lgtm

@sraghunandan

/lgtm

@it-is-a-robot
Contributor

LGTM label has been added.

Git tree hash: 473599daf87aad3e140f4236fe8d1cad7af5ce98

@sraghunandan

/approve

@it-is-a-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahanapradhan, sraghunandan

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Successfully merging this pull request may close these issues.

maximum retry number not matching with exchange.max-retry-count value
4 participants