
Restore shutdown sequence & offload replica sync #20883

Merged
merged 1 commit into hazelcast:master Mar 4, 2022

Conversation

vbekiaris
Collaborator

Restore shutdown sequence & offload replica sync

PartitionReplicaSyncRequestOffloadable would block the priority
generic op thread while waiting for merkle tree comparison to occur,
leading to deadlocks.
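
For illustration, a minimal sketch of the offload pattern this fix restores: rather than blocking the priority generic op thread until the merkle tree comparison finishes, the blocking wait moves to a separate executor and the op thread is released immediately. All names below (OFFLOAD_EXECUTOR, compareMerkleTrees, sendResponse) are hypothetical stand-ins, not Hazelcast's actual internals.

import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ReplicaSyncOffloadSketch {
    // Hypothetical executor standing in for Hazelcast's internal async pool.
    static final ExecutorService OFFLOAD_EXECUTOR = Executors.newCachedThreadPool();

    void handleSyncRequest(int partitionId) {
        // Problematic: a blocking call here stalls the priority generic op thread.
        // boolean inSync = compareMerkleTrees(partitionId); // blocks

        // Offloaded: the blocking comparison runs on a separate thread, and the
        // response is sent once it completes.
        CompletableFuture
                .supplyAsync(() -> compareMerkleTrees(partitionId), OFFLOAD_EXECUTOR)
                .thenAccept(inSync -> sendResponse(partitionId, inSync));
    }

    boolean compareMerkleTrees(int partitionId) { /* blocking comparison */ return true; }

    void sendResponse(int partitionId, boolean inSync) { /* reply to the caller */ }
}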

NodeExtension#shutdown should be called after graceful-shutdown-aware
services have already shut down. Otherwise persistence is shut down
before data services, resulting in exceptions during migrations.
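
A minimal ordering sketch of the second point, with hypothetical names (gracefulServices, shutdownGracefully); the fix ensures the NodeExtension shutdown runs only after the data services have drained:

import java.util.List;

class ShutdownSequenceSketch {
    interface GracefulShutdownAwareService {
        void onShutdown();
    }

    interface NodeExtension {
        void shutdown(); // shuts down persistence, among other things
    }

    List<GracefulShutdownAwareService> gracefulServices;
    NodeExtension nodeExtension;

    void shutdownGracefully() {
        // 1. Drain graceful-shutdown-aware services first, so in-flight
        //    migrations complete while persistence is still available.
        gracefulServices.forEach(GracefulShutdownAwareService::onShutdown);
        // 2. Only then shut down the NodeExtension (and with it, persistence).
        nodeExtension.shutdown();
    }
}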

Forward-port of #20813 to the main branch.

@vbekiaris vbekiaris added this to the 5.2 milestone Mar 4, 2022
@vbekiaris vbekiaris self-assigned this Mar 4, 2022
@ahmetmircik
Member

private static final boolean ALLOW_OFFLOAD =
            Boolean.parseBoolean(System.getProperty(PARTITION_REPLICA_ALLOW_OFFLOAD, "true"));

@vbekiaris When ALLOW_OFFLOAD is false, there must be no usage of merkle trees, otherwise deadlock is inevitable.
Is my understanding correct here? If so, in what ways can we improve the user experience? Can we remove the PARTITION_REPLICA_ALLOW_OFFLOAD property, or can we fail fast when it is false and used together with merkle trees? WDYT?

@vbekiaris
Collaborator Author

vbekiaris commented Mar 4, 2022

Can we remove the PARTITION_REPLICA_ALLOW_OFFLOAD property, or can we fail fast when it is false and used together with merkle trees?

When ALLOW_OFFLOAD is false, we should not use merkle trees for partition replication purposes (migrations & anti-entropy).
But merkle trees can still be useful for WAN sync, so ALLOW_OFFLOAD == false with merkle trees enabled on some IMaps/ICaches is still a valid configuration.
I think the property is useful as a kill-switch in case we find issues and it becomes desirable to switch back to the pre-5.0 replication behaviour.
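
As a usage sketch: since it is an ordinary system property read at class-initialization time, the kill-switch has to be set before the member starts. The key string below is a stand-in for whatever the PARTITION_REPLICA_ALLOW_OFFLOAD constant quoted above resolves to.

public class KillSwitchSketch {
    public static void main(String[] args) {
        // Stand-in key: substitute the actual value of PARTITION_REPLICA_ALLOW_OFFLOAD.
        // Must be set before the member starts, because the ALLOW_OFFLOAD field is
        // initialized statically (equivalently: java -D<property-key>=false ...).
        System.setProperty("<PARTITION_REPLICA_ALLOW_OFFLOAD key>", "false");
        // Hazelcast.newHazelcastInstance(); // then start the member as usual
    }
}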

@ufukyilmaz
Contributor

ufukyilmaz commented Mar 4, 2022

When ALLOW_OFFLOAD is false, there must be no usage of merkle trees, otherwise deadlock is inevitable.

In the non-offloaded case, the merkle tree comparison runs on partition threads, and blocking those may not be as critical as blocking the priority-generic/generic threads. I don't think we can say the deadlock is inevitable in that case based on this issue alone.

In the offload-enabled case, the task that was expected to offload the partition sync was itself running on the priority/generic threads, which are more prone to deadlock if we don't offload the main work. Note that we set this offloaded task's partitionId to -1 here: https://github.com/hazelcast/hazelcast/pull/20883/files#diff-2ccd9b76b1a1019e2129104193e9b88e7cb7b239062a80587b429231b3c520c2R80-R81; this is what was blocking the priority-generic/generic threads.
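
In other words, the routing rule at play is roughly the following (hypothetical sketch; the real dispatch lives in Hazelcast's internal operation executor):

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class OperationDispatchSketch {
    static final int GENERIC_PARTITION_ID = -1;

    final ExecutorService[] partitionThreads = {
            Executors.newSingleThreadExecutor(), Executors.newSingleThreadExecutor()
    };
    final ExecutorService genericThreads = Executors.newFixedThreadPool(2);

    void execute(int partitionId, Runnable op) {
        if (partitionId == GENERIC_PARTITION_ID) {
            // partitionId == -1 routes to the generic threads; blocking one of
            // these starves all generic operations, hence the deadlock risk.
            genericThreads.execute(op);
        } else {
            // Non-negative ids run on the single thread owning that partition.
            partitionThreads[partitionId % partitionThreads.length].execute(op);
        }
    }
}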

Contributor

@ufukyilmaz ufukyilmaz left a comment


Thanks for the backports/forward-ports.

@ahmetmircik
Member

ahmetmircik commented Mar 4, 2022

When ALLOW_OFFLOAD is false, we should not use merkle trees for partition replication purposes (migrations & anti-entropy).

What about checking ALLOW_OFFLOAD before creating migration operations, to decide which kind of migration mechanism we follow? This would keep us from falling into this known issue unexpectedly later. So if ALLOW_OFFLOAD is false, the service will not use merkle tree comparison for migrations.

@vbekiaris
Collaborator Author

run-lab-run

@vbekiaris
Collaborator Author

What about checking ALLOW_OFFLOAD before creating migration operations, to decide which kind of migration mechanism we follow? This would keep us from falling into this known issue unexpectedly later.

We already use this property to control whether we instantiate the offloadable partition replica sync request for the anti-entropy mechanism:

PartitionReplicaSyncRequest syncRequest = shouldOffload()
        ? new PartitionReplicaSyncRequestOffloadable(partitionId, namespaces, replicaIndex)
        : new PartitionReplicaSyncRequest(partitionId, namespaces, replicaIndex);

Or maybe I misunderstood your suggestion?

@ahmetmircik
Member

ahmetmircik commented Mar 4, 2022

I meant adding a new if here: https://github.com/hazelcast/hazelcast-enterprise/blob/463f8919c379882fcd3e6578041bcd9fce1bf34e/hazelcast-enterprise/src/main/java/com/hazelcast/map/impl/EnterpriseMapMigrationAwareService.java#L85

So when ALLOW_OFFLOAD is false, we directly call super, maybe with a log message that we don't use merkle trees.
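
i.e. a hypothetical guard along these lines (a sketch of the suggestion, not the actual enterprise code):

// inside EnterpriseMapMigrationAwareService (hypothetical sketch)
@Override
public Operation prepareReplicationOperation(PartitionReplicationEvent event,
                                             Collection<ServiceNamespace> namespaces) {
    if (!ALLOW_OFFLOAD) {
        // Skip the merkle tree based path entirely and make that visible.
        logger.info("ALLOW_OFFLOAD is false; using plain (non merkle tree) replication");
        return super.prepareReplicationOperation(event, namespaces);
    }
    // ... merkle tree based replication path ...
}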

@vbekiaris
Collaborator Author

I see; this is already covered here, because we call super if prepareReplicationOperation is running on a partition thread. I think this is a stronger guarantee that you won't run into a potential deadlock from merkle tree comparison on partition threads, because:

  • if ALLOW_OFFLOAD is false, then the anti-entropy replication op is already running on a partition thread -> no deadlock is possible
  • even if ALLOW_OFFLOAD is true (the default) and we somehow end up preparing the replication op on a partition thread (probably due to a bug?), we still avoid the deadlock because of the existing check.
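
Conceptually, the combined check amounts to something like this (hypothetical sketch; isRunningOnPartitionThread is a stand-in for Hazelcast's actual thread test):

private boolean shouldOffload() {
    // Offload only when enabled AND we are not already on a partition thread;
    // on a partition thread the comparison can run inline without deadlock risk.
    return ALLOW_OFFLOAD && !isRunningOnPartitionThread();
}

private boolean isRunningOnPartitionThread() {
    // Stand-in for the real check on the current operation thread's type.
    return Thread.currentThread().getName().contains("partition-operation");
}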

@ahmetmircik
Member

@vbekiaris thanks for the explanation, now I see that case is already covered.

@vbekiaris vbekiaris merged commit f934d9b into hazelcast:master Mar 4, 2022
@vbekiaris
Collaborator Author

thanks @ufukyilmaz & @ahmetmircik !
