
Backport of #5967 - Allow PersistentShardCoordinator to tolerate duplicate ShardHomeAllocated messages #5970

Conversation

Arkatufus (Contributor)

Backport of #5967 - Allow PersistentShardCoordinator to tolerate duplicate ShardHomeAllocated messages

```csharp
// per https://github.com/akkadotnet/akka.net/issues/5604
// we're going to allow new value to overwrite previous
var newRegions = Regions;
```
ismaelhamed (Member)
A couple of questions: 1) Is this safe in all cases? 2) Will there be a way to disable this new behavior?

Aaronontheweb (Member)

  1. Not sure - the alternative is keeping the current behavior as the default, in which the ShardCoordinator blows up, the cluster has to be restarted, and all of the ShardCoordinator data has to be deleted. Given that, I think this is probably safer than the current defaults. The worst-case scenario I can imagine is that the shard was actually allocated to two different nodes, but if that were the case the sharding system would already be in very bad shape (i.e. violating its consistency rules), and this change would let the newest home for the shard supersede the old one, which would stop receiving message traffic. In the logs where I saw this occurring, it was duplicate records for the same shard in the same ShardRegion IActorRef - so this change would just make the recovery idempotent.

  2. We could add it, but I don't think it should be necessary.
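
For illustration, here is a minimal sketch of what "making the recovery idempotent" could look like. The types below (`CoordinatorState`, a `ShardHomeAllocated` record with string ids) are simplified stand-ins for the real `PersistentShardCoordinator.State`, not the actual Akka.NET source; only the shape of the duplicate-tolerant `Updated` handler is shown:

```csharp
using System;
using System.Collections.Immutable;

// Hypothetical, simplified event; the real Akka.NET event carries
// IActorRef region references rather than string ids.
public sealed record ShardHomeAllocated(string Shard, string Region);

// Toy stand-in for PersistentShardCoordinator.State.
public sealed class CoordinatorState
{
    // ShardId -> owning region.
    public ImmutableDictionary<string, string> Shards { get; init; } =
        ImmutableDictionary<string, string>.Empty;

    // RegionId -> shards allocated to that region.
    public ImmutableDictionary<string, ImmutableHashSet<string>> Regions { get; init; } =
        ImmutableDictionary<string, ImmutableHashSet<string>>.Empty;

    public CoordinatorState Updated(ShardHomeAllocated e)
    {
        // Old behavior: a duplicate ShardHomeAllocated replayed during
        // recovery was fatal, poisoning the coordinator's journal:
        // if (Shards.ContainsKey(e.Shard))
        //     throw new ArgumentException($"Shard {e.Shard} already allocated");

        // Tolerant behavior: let the newest allocation overwrite the
        // previous one, so replaying the same event twice is harmless.
        // (Removing the shard from a previous region's set is elided here.)
        var regionShards = Regions.TryGetValue(e.Region, out var set)
            ? set
            : ImmutableHashSet<string>.Empty;

        return new CoordinatorState
        {
            Shards = Shards.SetItem(e.Shard, e.Region),
            Regions = Regions.SetItem(e.Region, regionShards.Add(e.Shard))
        };
    }
}
```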

Aaronontheweb (Member)

@ismaelhamed I think I can make this change safer by narrowing the conditions under which we de-duplicate ShardHomeAllocated messages: we don't throw when the exact same data is already inside the PersistentShardCoordinator.State, since applying it again is a no-op. That should expose the system to less unpredictable behavior, since the shard locations are identical.
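
A sketch of that narrowed check, reusing the toy `CoordinatorState` from the sketch above (the method name `UpdatedNarrowed` is hypothetical): an exact duplicate is treated as a no-op, while a genuinely conflicting allocation still throws, preserving the consistency check:

```csharp
public CoordinatorState UpdatedNarrowed(ShardHomeAllocated e)
{
    if (Shards.TryGetValue(e.Shard, out var existingRegion))
    {
        if (existingRegion == e.Region)
            return this; // exact duplicate: replaying identical data changes nothing

        // A conflicting allocation still indicates corrupt coordinator state.
        throw new ArgumentException(
            $"Shard {e.Shard} is already allocated to region {existingRegion}");
    }

    return Updated(e); // first allocation for this shard: apply normally
}
```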

Aaronontheweb mentioned this pull request on May 31, 2022.
Arkatufus (Contributor, Author)

Could be superseded by #5976.
