Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] [ml] Incorrect behavior of Topic Retention Policy #22473

Open
2 of 3 tasks
dao-jun opened this issue Apr 10, 2024 · 8 comments
Open
2 of 3 tasks

[Bug] [ml] Incorrect behavior of Topic Retention Policy #22473

dao-jun opened this issue Apr 10, 2024 · 8 comments
Assignees
Labels
type/bug The PR fixed a bug or issue reported a bug

Comments

@dao-jun
Copy link
Member

dao-jun commented Apr 10, 2024

Search before asking

  • I searched in the issues and found nothing similar.

Read release policy

  • I understand that unsupported versions don't get bug fixes. I will attempt to reproduce the issue on a supported version of Pulsar client and Pulsar broker.

Version

master branch

Minimal reproduce step

    public void testRetentionInMB() throws Exception {
        @Cleanup("shutdown")
        ManagedLedgerFactory factory = new ManagedLedgerFactoryImpl(metadataStore, bkc);
        ManagedLedgerConfig config = new ManagedLedgerConfig();
        config.setRetentionSizeInMB(2);
        config.setRetentionTime(-1, TimeUnit.SECONDS);
        config.setMaxEntriesPerLedger(2);
        config.setMinimumRolloverTime(0, TimeUnit.SECONDS);

        @Cleanup
        ManagedLedgerImpl ml = (ManagedLedgerImpl) factory.open("retention_test_ledger", config);
        @Cleanup
        ManagedCursor c1 = ml.openCursor("c1");

        // 512K per entry, 2 entries per ledger, 1 ledger has 1MB data.
        byte[] content = new byte[1024 * 512];
        for (int i = 0; i < 21; i++) {
            ml.addEntry(content);
        }

        // should be 11 ledgers.
        assertEquals(ml.ledgers.size(), (21 / 2) + 1);
        long firstKey = ml.ledgers.firstKey();

        // ack 5 ledgers.
        c1.markDelete(new PositionImpl(firstKey + 5, 0));

        // trigger trim ledgers manually
        CompletableFuture<Void> f = new CompletableFuture<>();
        ml.trimConsumedLedgersInBackground(false, f);
        f.get();
    }
  1. For the test, we set the retention policy by size(2MB); 2 entries per ledger and each entry 512KB, each ledger is 1MB.
  2. Add 21 entries to the ManagedLedger, it will open 11 ledgers(I assume these ledger ID are [1, 11]).
  3. Set markDeletePosition to 6:0, which means [1, 5] ledgers are consumed
  4. Trigger trim ledger task manually
  5. Wait until trim task finish

What did you expect to see?

Ledgers [1, 3] deleted.

What did you see instead?

Ledgers [1, 5] deleted

Anything else?

No response

Are you willing to submit a PR?

  • I'm willing to submit a PR!
@dao-jun dao-jun added the type/bug The PR fixed a bug or issue reported a bug label Apr 10, 2024
@dao-jun dao-jun self-assigned this Apr 10, 2024
@dao-jun
Copy link
Member Author

dao-jun commented Apr 10, 2024

@lhotari PTAL

@lhotari
Copy link
Member

lhotari commented Apr 10, 2024

Good description @dao-jun . This makes sense.

We need to ask the original developer of the solution for clarification whether this is a bug or not. I guess
it depends a lot on how the "retention" settings are defined.

@dao-jun
Copy link
Member Author

dao-jun commented Apr 10, 2024

@lhotari According to https://pulsar.apache.org/docs/3.2.x/cookbooks-retention-expiry/#retention-policies and ManagedLedgerImpl#internalTrimLedgers, delete a ledger or not should be based on Total size of consumed ledgers, base on Total size of all the ledgers will lead to ManagedLedgerImpl#isLedgerRetentionOverSizeQuota always true(in some cases). This is not in line with its intended meaning

@dao-jun
Copy link
Member Author

dao-jun commented Apr 10, 2024

@lhotari
Copy link
Member

lhotari commented Apr 10, 2024

@lhotari According to https://pulsar.apache.org/docs/3.2.x/cookbooks-retention-expiry/#retention-policies and ManagedLedgerImpl#internalTrimLedgers, delete a ledger or not should be based on Total size of consumed ledgers, base on Total size of all the ledgers will lead to ManagedLedgerImpl#isLedgerRetentionOverSizeQuota always true(in some cases). This is not in line with its intended meaning

+1 @dao-jun the reported issue seems to be a bug based on this information.

The definition of the retention settings was explicitly mentioned.

The retention settings apply to all messages on topics that do not have any subscriptions, or to messages that have been acknowledged by all subscriptions. The retention policy settings do not affect unacknowledged messages on topics with subscriptions. The unacknowledged messages are controlled by the backlog quota.

@dao-jun
Copy link
Member Author

dao-jun commented Apr 10, 2024

@dao-jun
Copy link
Member Author

dao-jun commented Apr 10, 2024

In the first version doc, it says: messages in all topics in the namespace, even acknowledged messages, will be retained.

@dao-jun
Copy link
Member Author

dao-jun commented Apr 11, 2024

This is a doc issue, since https://github.com/apache/pulsar/pull/5482/files

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/bug The PR fixed a bug or issue reported a bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants