
ARTEMIS-2716 Implements pluggable Quorum Vote #3555

Closed · wants to merge 7 commits

Conversation

franz1981
Contributor

@franz1981 franz1981 marked this pull request as draft April 26, 2021 08:25
@@ -183,6 +183,73 @@
<artifactId>tomcat-servlet-api</artifactId>
</dependency>

<!-- for artemis-quorum-ri -->


Where are the dependencies for the ZK alternative option?

Contributor Author


Eagle eyes :D Adding them now, together with the features.

Contributor


I still don't spot them. Unresolving. I could be being blind.


@franz1981 franz1981 force-pushed the pluggable_quorum_voting branch 16 times, most recently from e208c31 to b231f78 Compare May 4, 2021 05:06
@@ -106,6 +110,27 @@
<include>com.sun.xml.bind:jaxb-core</include>
<include>jakarta.activation:jakarta.activation-api</include>
<include>jakarta.security.auth.message:jakarta.security.auth.message-api</include>
<!-- quorum -->
Contributor


Where are the ZK deps here too?

Contributor Author



<include>org.apache.curator:curator-recipes</include>
<include>org.apache.curator:curator-client</include>
<include>org.apache.curator:curator-framework</include>
Contributor Author


@michaelandrepearce these are the ZK deps

Contributor Author


Now out of office, eh; I still need to rebase against master, include other deps for logging, and force the 3.6.1 ZK client version.

<artifactId>classgraph</artifactId>
<version>4.2.3</version>
</dependency>
<!-- Curator Zookeeper RI -->
Contributor Author


@michaelandrepearce here! I still lack the logging deps and config.

assert activeMQServer.getNodeManager().getNodeId() != null;
final String liveID = activeMQServer.getNodeManager().getNodeId().toString();
final int voteRetries = policy.getVoteRetries();
final long maxAttempts = voteRetries >= 0 ? (voteRetries + 1) : -1;
Contributor Author

@franz1981 franz1981 May 4, 2021


@michaelandrepearce
@jbertram
It seems that vote-retries can be set to -1 (unbounded), and vote-retries == 0 was possible, but it prevented any vote from happening (!).
I've decided to change the semantics to better match the name: "retries", it seems to me (I'm not a native English speaker, so please correct me if I'm wrong!), are "additional" to the first mandatory "try".


So, in retry logic:

-1 normally means to me "unlimited", e.g. never stop, keep retrying forever.
0 means don't do any retry, e.g. fail/error on the first error, and do not re-attempt.
1 means retry once (and so on, for any positive value).


In regards to replication, I would typically expect:

-1 normally means wait until all members have acked.
0 means do not wait for any replica acks.
1 would wait until 1 member has acked.

Contributor Author

@franz1981 franz1981 May 4, 2021


I see this doc related to vote-retries:

vote-retries
This property controls how many times the backup broker retries the quorum vote in order to receive a majority result that allows the backup broker to start up.

The code in SharedNothingBackupQuorum::isLiveDown:

      final int size = quorumSize == -1 ? quorumManager.getMaxClusterSize() : quorumSize;

      synchronized (voteGuard) {
         for (int voteAttempts = 0; voteAttempts < voteRetries && !stopped; voteAttempts++) {
            if (voteAttempts > 0) {
               try {
                  voteGuard.wait(voteRetryWait);
               } catch (InterruptedException e) {
                  //nothing to do here
               }
            }
            if (!quorumManager.hasLive(targetServerID, size, quorumVoteWait, TimeUnit.SECONDS)) {
               return true;
            }
         }
      }

It doesn't seem correct relative to the retry semantics we've described above, but I would like to understand what it means.
QuorumManager::hasLive:

  • returns true if the majority of nodes decide that a live with NodeID == targetServerID is still around, or if quorumVoteWait expires -> the vote will be re-attempted vote-retries times
  • returns false if the majority of nodes decide that a live with NodeID == targetServerID isn't around

Hence SharedNothingBackupQuorum::isLiveDown is going to retry vote-retries times until it gets QuorumManager::hasLive == false, retrying if there is still a live around OR there's no majority agreement within quorumVoteWait seconds.

The new implementation does the same, but if vote-retries == 0 it still tries a first time before giving up, which seems more correct to me.
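
To make the new attempt counting concrete, here is a minimal, hypothetical sketch; it is not the PR's actual code (the real loop also waits voteRetryWait between attempts and honors a stopped flag), but the maxAttempts expression mirrors the diff quoted earlier in this PR:

   import java.util.function.BooleanSupplier;

   final class VoteRetrySketch {
      // vote-retries counts attempts *in addition to* the first mandatory try;
      // -1 means unbounded, i.e. maxAttempts = voteRetries >= 0 ? (voteRetries + 1) : -1.
      static boolean isLiveDown(int voteRetries, BooleanSupplier hasLive) {
         final long maxAttempts = voteRetries >= 0 ? (voteRetries + 1L) : -1;
         for (long attempt = 0; maxAttempts == -1 || attempt < maxAttempts; attempt++) {
            if (!hasLive.getAsBoolean()) {
               return true;  // the majority agrees the live is gone: safe to fail over
            }
            // a live is still around (or no majority within quorumVoteWait): try again
         }
         return false;       // attempts exhausted without a "live is down" verdict
      }
   }

With vote-retries == 0 this performs exactly one vote before giving up; with 2 it performs up to three; with -1 it never gives up.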

@franz1981
Contributor Author

Sadly, canAcquireLockOnMajorityRestart can still get some failures because of https://issues.apache.org/jira/browse/CURATOR-594; I'm waiting for the PR I've sent, apache/curator#383, to be merged.

@franz1981 franz1981 force-pushed the pluggable_quorum_voting branch 5 times, most recently from 362ef2b to a9b7c4d Compare May 20, 2021 14:21
@michaelandrepearce
Contributor

michaelandrepearce commented Jun 25, 2021

I'm confused here, because I thought we agreed on the call that a versioned journal was needed, and the assumption I took from that was that it would then need to be implemented before we move forward.

I don't see a need or rush to merge to master until all the bits from that community call are done.

@franz1981
Contributor Author

franz1981 commented Jun 25, 2021

During the call we agreed that I would raise the JIRAs for the missing features, to make this more robust against journal misalignment scenarios when compared to classic replication.

The value already delivered here:

  • support for a single pair too
  • many fixes in some dark corners of the original state diagram
  • an easy migration path (configuration-wise)
  • a pluggable quorum service

Users can already use what we have here, getting the same or better value than classic replication.

In the meantime we can:

  • collect feedback on it as it is
  • design journal versioning as an enhancement
  • make any other improvements we need

I agree about not being in a hurry, but given the magnitude of this feature I believe delivering it in chunks/iteratively is the way to go (assuming each chunk has its own value).

I was sure I had explained this during the call.

@gtully any comment?

@michaelandrepearce
Contributor

michaelandrepearce commented Jun 25, 2021

So I thought we had a data-state issue, and thus a solution was a must before we went ahead; the point of the call in the first place was to agree on that solution, hence journal versioning as the solution agreed on the call to address it. I'm very cautious about us delivering this to main and to users without an automated and safe solution to the data-state issue, as it's asking for production issues and people shooting themselves in the foot.

Further enhancements like multi-slave etc. were the stuff we could implement after.

But the automated data integrity was a must.

@franz1981 franz1981 force-pushed the pluggable_quorum_voting branch 4 times, most recently from 09f5a22 to ed32cce Compare June 28, 2021 09:32
@gtully
Contributor

gtully commented Jun 28, 2021

I think what we have here is a very good line in the sand. The problem that a lock version solves for a computer can more easily be solved by a capable human. What we have here is a clear definition of the problem. This is a nice incremental improvement.

@michaelandrepearce
Contributor

And here lies the biggest issue: the human. This is what, and why, this was something to be solved: because more often than not, the human cannot make the right decision, accidentally starts the wrong node, and so forth and so on.
And that's why we had the call to find the solution. Now, the agreed solution from the call was the journal solution proposed by yourself, but we do need something. If we don't think the journal solution will work now, then we should probably have a follow-up call, but I suspect that's not the case here.

@gtully
Contributor

gtully commented Jun 28, 2021

This PR is just a core part of the solution; https://issues.apache.org/jira/browse/ARTEMIS-3340 can layer the versioning on top.

@michaelandrepearce
Contributor

michaelandrepearce commented Jun 28, 2021

Indeed, but I think we need both parts before we go to main. I'd be concerned about this going into main before that, as we could then accidentally release just one part, without both parts being done and protecting people from themselves, and people could get themselves into an issue in prod. E.g. even myself running in prod, I would not be confident in always making the right call, and indeed would want automatic protection from running the wrong one.

I see no harm in keeping a branch with this stuff until both are done. If we want to have a formal feature branch in Apache git, then let's make that, and then we can pull to main from there once it's all done.

@gtully
Contributor

gtully commented Jun 28, 2021

@michaelandrepearce I hear you. There is more work to do for sure, but it won't be in any way easier to shoot oneself in the foot with these changes.
It is all about collaboration, and it is difficult to get folks to commit to code on a PR or even a feature branch.

The reality is that if it is not on main, it does not exist.

This feature is ripe for collaboration because it pulls in a distributed lock service; getting more users of that would be great to shape the API. The journal exclusive lock is just one use case (albeit an important one).

@michaelandrepearce
Contributor

michaelandrepearce commented Jun 28, 2021

Right now, in testing this (yes, I have been building and running it), I have started the wrong node after taking it all down, and thus ended up running the older journal. Right now, with leader/follower (master, slave) setups, because of the leader (master) concept you always know which node should be taken down last and brought back up first. And on restart, if you do accidentally start the slave, you're safe, because it won't start until the master is there.

If it goes to main, we're saying it's ready for users; without the second part, I strongly disagree, I don't think it is ready for release. We can have a feature branch for collaboration, and we can even produce non-official binaries, but I don't think it's wise to officially release it and have users give feedback from real-life deployments where they don't fully understand what they're buying into (users != active developers). I think at this stage it's suitable for developers, yes, to be running it, but they can do that from the branch / overnight non-release binaries.

@michaelandrepearce
Contributor

michaelandrepearce commented Jun 28, 2021

If it does go to main, then it should have docs written with very explicit statements (a big red/orange banner, say) that this is experimental, not for production, and a work in progress.

@franz1981
Contributor Author

franz1981 commented Jun 28, 2021

If it does go to main, then it should have docs written with very explicit statements (a big red/orange banner, say) that this is experimental, not for production, and a work in progress.

+100, agree.
It would ease the pain of using a separate branch chasing main, and would ease collaboration IMO (it's just personal).
I would like to add a comment about it in the doc + a couple of other minor changes, i.e. I would prefix all ZK live locks with namespace/locks/ to prepare for the journal version change (which will likely use prefixes like namespace/longs/).

@gtully
I believe we should document somewhere the data misalignment issue of classic replication too (not just the JIRA with the fix using a journal version) and what the expected mitigation actions are: the pluggable version is going to see a fix for this, but classic probably won't, hence it's worth helping users be aware of it IMO.

@michaelandrepearce
Contributor

Definitely want to be able to give Artemis ZK a base in the tree (configurable); after all, many places will share a ZK with other stuff, or use one ZK for multiple brokers.

@franz1981
Contributor Author

Yep, that's already implemented using the "namespace" parameter of the manager, but I would like to already separate locks from "others" (journal timestamps or any other primitives we will need in the future).

@franz1981
Contributor Author

franz1981 commented Jun 29, 2021

@michaelandrepearce @gtully
I have both added the ZK default locks path (but I'm playing with ZK to see if it satisfies me) and the warning for users mentioning the upstream JIRA that explains the data misalignment issue.
I'm going to add a doc JIRA to produce some meaningful mitigation actions for replication in general, given that classic replication is affected by this but it's very unlikely that it's going to be fixed.

I used this manager configuration:

               <manager>
                  <properties>
                     <property key="connect-string" value="localhost:2181"/>
                     <property key="namespace" value="artemis_test"/>
                  </properties>
               </manager>

And using the ZK CLI:

[zk: localhost:2181(CONNECTED) 16] ls /artemis_test/locks/f2ed3b55-d8ac-11eb-9420-8cc68169c75b/leases 
[_c_d3b7e156-7599-4125-b891-0ebca91d3e63-lease-0000000000]

Where f2ed3b55-d8ac-11eb-9420-8cc68169c75b is a broker NodeID: it seems OK to me.
If in the follow-up PRs we're going to expose counters/mutable longs holding the journal/data versioning/timestamp with the same NodeID, it would look like

/artemis_test/mutable_longs/f2ed3b55-d8ac-11eb-9420-8cc68169c75b

That looks OK to me (the name mutable_longs is temporary).
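
For illustration, here is a minimal Curator sketch of how such a namespaced live lock could be acquired. This is an assumption, not the PR's actual code: the ZkLiveLockSketch class, the choice of recipe (InterProcessSemaphoreV2), and the retry policy values are all hypothetical, picked because a single-lease semaphore produces exactly the kind of .../leases/_c_...-lease-0000000000 child seen in the CLI output above:

   import org.apache.curator.framework.CuratorFramework;
   import org.apache.curator.framework.CuratorFrameworkFactory;
   import org.apache.curator.framework.recipes.locks.InterProcessSemaphoreV2;
   import org.apache.curator.framework.recipes.locks.Lease;
   import org.apache.curator.retry.ExponentialBackoffRetry;

   public final class ZkLiveLockSketch {
      public static void main(String[] args) throws Exception {
         // Mirrors the <manager> properties above
         CuratorFramework client = CuratorFrameworkFactory.builder()
            .connectString("localhost:2181")                 // "connect-string"
            .namespace("artemis_test")                       // "namespace": base of the tree
            .retryPolicy(new ExponentialBackoffRetry(1000, 3))
            .build();
         client.start();
         String nodeId = "f2ed3b55-d8ac-11eb-9420-8cc68169c75b"; // broker NodeID
         // A max-1-lease semaphore at /locks/<NodeID>; its leases children match
         // the _c_...-lease-0000000000 node listed by the ZK CLI above.
         InterProcessSemaphoreV2 liveLock =
            new InterProcessSemaphoreV2(client, "/locks/" + nodeId, 1);
         Lease lease = liveLock.acquire(); // blocks until the live lock is held
         try {
            // ... act as live while holding the lease ...
         } finally {
            liveLock.returnLease(lease);
            client.close();
         }
      }
   }

Keeping locks under /locks/ leaves room for sibling subtrees (like the temporary mutable_longs above) under the same configurable namespace.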

@gtully
Contributor

gtully commented Jul 15, 2021

I have pulled all of the above into a new PR and added in the activation sequencing. I think we can move forward on that.

#3646

@franz1981
Contributor Author

Closing as superseded by #3646.

@franz1981 franz1981 closed this Jul 27, 2021
@franz1981 franz1981 deleted the pluggable_quorum_voting branch August 6, 2021 07:43