Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race between ResendRequest and SequenceReset when using async SessionProxy #503

Open
pcdv opened this issue Mar 14, 2024 · 3 comments
Open

Comments

@pcdv
Copy link
Contributor

pcdv commented Mar 14, 2024

Consider the following scenario:

  • There are two FIX counterparties: ARTIO and EXCHANGE
  • After a disconnection, ARTIO has lastReceivedSeqNo = 6 and EXCHANGE has lastReceivedSeqNo = 4
  • For some reason, the last message sent by both parties was not received on the other end, so ARTIO has lastSentSeqNo = 5 and EXCHANGE has lastSentSeqNo = 7 (for example, because the connection closed before the Logout was received)
  • During next connection, both parties detect a gap of one message and send a ResendRequest: [7-0] from ARTIO, and [5-0] from EXCHANGE

After the second connection ARTIO sent messages in the wrong order:

  • Logon with MsgSeqNum = 6
  • SequenceReset with MsgSeqNum = 5 NewSeqNo = 8
  • ResendRequest with MsgSeqNum = 7

The sequence would be correct (IMHO) if the ResendRequest was sent before the SequenceReset, or if there was NewSeqNo = 7 in the SequenceReset. This sequence causes a disconnection because EXCHANGE detects a "MsgSeqNum too low" issue.

The problem seems related to the fact that the SequenceReset is sent directly by the Replayer when it is done replaying, while the ResendRequest is directly sent by the library when it detects the high MsgSeqNum in Logon message.

I should add that I use a SessionProxy to route all outbound messages through a cluster, and this is probably increasing the probability of such a bug as it delays the sending of the ResendRequest. NB: it seems that SessionProxy#sendSequenceReset() is never called, so I don't have the opportunity to override the value of NewSeqNo to a correct value. But SessionProxy#sendResendRequest() is called as expected.

I'm able to reproduce the bug easily using my custom setup/code. I will try to reproduce it using pure artio dependencies.

pcdv added a commit to pcdv/artio that referenced this issue Mar 15, 2024
The test also tries to reproduce an inversion of SendSequenceReset and ResendRequest
messages, but it is not reproduced (it happens only when a SessionProxy causes the
ResendRequest to be sent asynchronously).
@pcdv
Copy link
Contributor Author

pcdv commented Mar 15, 2024

I added a system test:

  • it shows that sendSequenceReset() does not go through SessionProxy
  • the message inversion is not reproduced when not using SessionProxy

I guess the fix would be to modify the replaying code so that it sends a message to the library to instruct it to send the SequenceReset message, but I have no idea where to do this :-/

EDIT: after another look, the problem is probably that the saveValidResendRequest() is directly called by the session when receiving the ResendRequest. I will try to make it go through a new method in SessionProxy, so we can ensure that the replaying is triggered only after we have fully sent our own ResendRequest.

@pcdv pcdv changed the title Race between sending ResendRequest and SequenceReset Race between ResendRequest and SequenceReset when using async SessionProxy Apr 3, 2024
@pcdv
Copy link
Contributor Author

pcdv commented Apr 4, 2024

I pushed a possible solution in master...pcdv:artio:master

I reused the existing ResendRequestController interface and added a delay() method in ResendRequestResponse so the Session does not immediately execute the ResendRequest.

It is then possible to call method Session#executeResendRequest(int, uk.co.real_logic.artio.util.AsciiBuffer, int, int, int) asynchronously. This method has been extracted from current code, whose default behavior hasn't changed.

@pcdv
Copy link
Contributor Author

pcdv commented Apr 5, 2024

I submitted PR #505. Finally, I didn't modify the ResendRequestController (except its javadoc) but added a method in SessionProxy, and a public method in Session that can be invoked by the proxy when ready to process the ResendRequest.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant