Skip to content
This repository has been archived by the owner on Apr 1, 2024. It is now read-only.

ISSUE-20150: [Bug] [Txn] The committed transaction may rollback to OPEN status. #5730

Open
1 of 2 tasks
sijie opened this issue Apr 20, 2023 · 0 comments
Open
1 of 2 tasks
Labels

Comments

@sijie
Copy link
Member

sijie commented Apr 20, 2023

Original Issue: apache#20150


Search before asking

  • I searched in the issues and found nothing similar.

Version

master.

Minimal reproduce step

just restart brokers.

What did you expect to see?

Committed transactions can not be rollback to previous status.

What did you see instead?

Some transactions are committed before broker's restart, but become open status after the broker is restarted.
I add some logs in broker as follows:

  • log found from broker.out.1, which is the log logged before restarting broker.
2023-04-19T12:18:51,396+0800 [pulsar-io-36-8] DEBUG org.apache.pulsar.broker.service.ServerCnx - Receive add published partition to txn request 2376042138963580900 from /164.90.89.87:33256 with txnId (11,155293), topic: [persistent://test/tb1/testTxn10-partition-7]
2023-04-19T12:18:51,400+0800 [pulsar-io-36-8] DEBUG org.apache.pulsar.broker.service.ServerCnx - Receive add published partition to txn request 2376042138963580901 from /164.90.89.87:33256 with txnId (11,155293), topic: [persistent://test/tb1/testTxn10-partition-8]
2023-04-19T12:18:51,408+0800 [pulsar-io-36-1] DEBUG org.apache.pulsar.broker.transaction.buffer.impl.TopicTransactionBuffer - Append buffer to txn (11,155293)
2023-04-19T12:18:51,408+0800 [pulsar-io-36-37] DEBUG org.apache.pulsar.broker.transaction.buffer.impl.TopicTransactionBuffer - Append buffer to txn (11,155293)
2023-04-19T12:18:51,415+0800 [pulsar-io-36-8] DEBUG org.apache.pulsar.broker.service.ServerCnx - Receive end txn request 2376042138963580907 to transaction meta store TransactionCoordinatorID(id=11) for txnId:(11,155293).
2023-04-19T12:18:51,449+0800 [BookKeeperClientWorker-OrderedExecutor-6-0] DEBUG org.apache.pulsar.broker.transaction.buffer.impl.TransactionBufferHandlerImpl - [persistent://test/tb1/testTxn10-partition-7] endTxnOnTopic txnId: [(11,155293)], txnAction: [0]
2023-04-19T12:18:51,449+0800 [BookKeeperClientWorker-OrderedExecutor-6-0] DEBUG org.apache.pulsar.broker.transaction.buffer.impl.TransactionBufferHandlerImpl - [persistent://test/tb1/testTxn10-partition-8] endTxnOnTopic txnId: [(11,155293)], txnAction: [0]
2023-04-19T12:18:51,449+0800 [pulsar-io-36-32] DEBUG org.apache.pulsar.broker.service.ServerCnx - [persistent://test/tb1/testTxn10-partition-7] handleEndTxnOnPartition txnId: [(11,155293)], txnAction: [0]
2023-04-19T12:18:51,449+0800 [pulsar-io-36-32] DEBUG org.apache.pulsar.broker.transaction.buffer.impl.TopicTransactionBuffer - Transaction (11,155293) commit on topic persistent://test/tb1/testTxn10-partition-7.
2023-04-19T12:18:51,449+0800 [pulsar-io-36-32] DEBUG org.apache.pulsar.broker.service.ServerCnx - [persistent://test/tb1/testTxn10-partition-8] handleEndTxnOnPartition txnId: [(11,155293)], txnAction: [0]
2023-04-19T12:18:51,449+0800 [pulsar-io-36-32] DEBUG org.apache.pulsar.broker.transaction.buffer.impl.TopicTransactionBuffer - Transaction (11,155293) commit on topic persistent://test/tb1/testTxn10-partition-8.
2023-04-19T12:18:51,508+0800 [BookKeeperClientWorker-OrderedExecutor-6-0] DEBUG org.apache.pulsar.transaction.coordinator.impl.MLTransactionMetadataStore - TxnID : (11,155293) update txn status to COMMITTED
  • log found from broker.out, which is the log logged after restarting broker.
2023-04-19T12:19:26,414+0800 [BookKeeperClientWorker-OrderedExecutor-6-0] DEBUG org.apache.pulsar.broker.transaction.buffer.impl.TransactionBufferHandlerImpl - [persistent://test/tb1/testTxn10-partition-7] endTxnOnTopic txnId: [(11,155293)], txnAction: [1]
2023-04-19T12:19:26,415+0800 [BookKeeperClientWorker-OrderedExecutor-6-0] DEBUG org.apache.pulsar.broker.transaction.buffer.impl.TransactionBufferHandlerImpl - [persistent://test/tb1/testTxn10-partition-8] endTxnOnTopic txnId: [(11,155293)], txnAction: [1]
2023-04-19T12:19:26,415+0800 [pulsar-io-36-54] DEBUG org.apache.pulsar.broker.service.ServerCnx - [persistent://test/tb1/testTxn10-partition-7] handleEndTxnOnPartition txnId: [(11,155293)], txnAction: [1]
2023-04-19T12:19:26,415+0800 [pulsar-io-36-54] DEBUG org.apache.pulsar.broker.transaction.buffer.impl.TopicTransactionBuffer - Transaction (11,155293) abort on topic persistent://test/tb1/testTxn10-partition-7.
2023-04-19T12:19:26,415+0800 [pulsar-io-36-54] DEBUG org.apache.pulsar.broker.service.ServerCnx - [persistent://test/tb1/testTxn10-partition-8] handleEndTxnOnPartition txnId: [(11,155293)], txnAction: [1]
2023-04-19T12:19:26,415+0800 [pulsar-io-36-54] DEBUG org.apache.pulsar.broker.transaction.buffer.impl.TopicTransactionBuffer - Transaction (11,155293) abort on topic persistent://test/tb1/testTxn10-partition-8.
2023-04-19T12:19:26,425+0800 [BookKeeperClientWorker-OrderedExecutor-6-0] DEBUG org.apache.pulsar.transaction.coordinator.impl.MLTransactionMetadataStore - TxnID : (11,155293) update txn status to ABORTED

and i found some logs in TransactionRecoverTrackerImpl.

2023-04-19T12:19:24,745+0800 [transaction_coordinator_TransactionCoordinatorID(id=11)_thread_factory-89-1] DEBUG org.apache.pulsar.broker.transaction.recover.TransactionRecoverTrackerImpl - Append open transaction to timeout tracker, tcId: 11, openTransactions: {155294=1681877941414, 155293=1681877941393}

Based on logs above, we can see that the transaction (11,155293) is rollback to open status, and is aborted due to transaction timeout machanism.
Important clue: Only the TC log batch write feature is enabled will trigger this problem.

Anything else?

No response

Are you willing to submit a PR?

  • I'm willing to submit a PR!
@sijie sijie added the type/bug label Apr 20, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

1 participant