Fix hanging cluster safe query from common pool #21208

vbekiaris · 2022-04-12T14:23:37Z

When all FJP#commonPool threads are busy querying isClusterSafe
and partition assignments are not in sync (e.g. during initial
partition arrangement), the callback that has to run after
PartitionBackupReplicaAntiEntropyOperation is done has no chance
of being executed. As a result, neither partition replica sync nor
the cluster-safe query can make any progress.
The fix is to use the Hazelcast internal async executor (instead of
the common pool) for the callback that processes the replica
anti-entropy operation result.

(cherry picked from commit 434d731)
Backport of #21145 to 5.1.z
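
For illustration only, here is a minimal, self-contained sketch of the starvation pattern described above. It is not the Hazelcast code: `CallbackStarvationSketch`, `callbackRuns`, `sharedPool` and `dedicated` are hypothetical names, a two-thread `ForkJoinPool` stands in for the common pool, and a single-thread executor stands in for the internal async executor. The "queries" simply block until the callback has run.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.TimeUnit;

class CallbackStarvationSketch {

    /** Returns true if the callback ran while every shared-pool worker was blocked. */
    static boolean callbackRuns(boolean useDedicatedExecutor) throws Exception {
        // Assumption: a small pool standing in for ForkJoinPool.commonPool().
        ForkJoinPool sharedPool = new ForkJoinPool(2);
        // Assumption: stand-in for the Hazelcast internal async executor.
        ExecutorService dedicated = Executors.newSingleThreadExecutor();
        CountDownLatch callbackDone = new CountDownLatch(1);
        CountDownLatch workersBusy = new CountDownLatch(2);
        try {
            // Occupy every shared worker with a "query" that waits for the callback.
            for (int i = 0; i < 2; i++) {
                sharedPool.execute(() -> {
                    workersBusy.countDown();
                    try {
                        callbackDone.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            workersBusy.await();

            // The "operation" whose completion callback unblocks the queries.
            CompletableFuture<String> operation = new CompletableFuture<>();
            Executor callbackExecutor = useDedicatedExecutor ? dedicated : sharedPool;
            operation.whenCompleteAsync((result, error) -> callbackDone.countDown(),
                    callbackExecutor);
            operation.complete("done");

            // Give the callback a short window to run.
            return callbackDone.await(500, TimeUnit.MILLISECONDS);
        } finally {
            callbackDone.countDown(); // release the blocked workers
            sharedPool.shutdownNow();
            dedicated.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("callback on shared pool ran: " + callbackRuns(false));
        System.out.println("callback on dedicated executor ran: " + callbackRuns(true));
    }
}
```

In this sketch the callback scheduled on the saturated shared pool stays queued behind the blocked workers, while the same callback on the dedicated executor runs and releases them. This is the same reason the change moves the callback off the common pool.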


vbekiaris added the Type: Defect, Team: Core, Backport, Source: Internal, and Module: Partitioning labels Apr 12, 2022

vbekiaris added this to the 5.1.2 milestone Apr 12, 2022

vbekiaris requested a review from ahmetmircik April 12, 2022 14:23

ahmetmircik approved these changes Apr 12, 2022


vbekiaris mentioned this pull request Apr 12, 2022

com.hazelcast.partition.PartitionDistributionTest [HZ-1051] #19665

Closed

vbekiaris merged commit bbcb69a into hazelcast:5.1.z Apr 13, 2022
