Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Broker] Fix set-publish-rate when using preciseTopicPublishRateLimiterEnable=true #10384

Conversation

lhotari
Copy link
Member

@lhotari lhotari commented Apr 26, 2021

Motivation

When using preciseTopicPublishRateLimiterEnable=true (introduced by #7078) setting for rate limiting, there are various issues:

  • updating the limits doesn't set either boundary when changing the limits from a bounded limit to unbounded.
  • each topic will create a scheduler thread for each limiter instance
  • each topic will never release the scheduler thread when the topic gets unloaded / closed
  • updating the limits didn't close the scheduler thread related to the replaced limiter instance

Modifications

  • Fix updating of the limits by cleaning up the previous limiter instances before creating new limiter instances
  • Use brokerService.pulsar().getExecutor() as the scheduler for the rate limiter instances
  • Add resource cleanup hooks for topic closing (unload)

Open issue

The existing code has a difference in passing the rateLimitFunction:

if (this.publishMaxMessageRate > 0) {
topicPublishRateLimiterOnMessage =
new RateLimiter(publishMaxMessageRate, 1, TimeUnit.SECONDS, rateLimitFunction);
}
if (this.publishMaxByteRate > 0) {
topicPublishRateLimiterOnByte = new RateLimiter(publishMaxByteRate, 1, TimeUnit.SECONDS);
}

It's passed to the topicPublishRateLimiterOnMessage, but not to topicPublishRateLimiterOnByte . It is unclear whether this is intentional.
The rateLimitFunction is () -> this.enableCnxAutoRead()

(This also raises a question whether rate limiting works consistently when multiple topics share the same connection.)

@lhotari lhotari marked this pull request as draft April 26, 2021 13:28
@lhotari
Copy link
Member Author

lhotari commented Apr 26, 2021

@aloyszhang @codelipenghui @merlimat @eolivelli Please review. I'd also like to get some help in resolving the open issue described in the PR description.

@lhotari
Copy link
Member Author

lhotari commented Apr 30, 2021

@aloyszhang @codelipenghui @merlimat @eolivelli I'd appreciate your feedback on this PR Please review.

@aloyszhang
I'd also like to get some help in resolving the open issue described in the PR description. Is it intentional that rateLimitFunction isn't called for topicPublishRateLimiterOnByte?

@eolivelli
Copy link
Contributor

@codelipenghui @merlimat @rdhabalia do you have any comment ?

@aloyszhang
Copy link
Contributor

aloyszhang commented May 17, 2021

I'd also like to get some help in resolving the open issue described in the PR description. Is it intentional that rateLimitFunction isn't called for topicPublishRateLimiterOnByte?

Sorry for the late.
If we passed rateLimitFunction to both topicPublishRateLimiterOnMessage and topicPublishRateLimiterOnByte, the () -> this.enableCnxAutoRead() will be called twice each rateTime

@lhotari
Copy link
Member Author

lhotari commented May 17, 2021

If we passed rateLimitFunction to both topicPublishRateLimiterOnMessage and topicPublishRateLimiterOnByte, the () -> this.enableCnxAutoRead() will be called twice each rateTime

@aloyszhang Is it so that it should be called in both cases, but it's some kind of optimization to skip calling it for the other case?
If a possible duplicate call is the issue, is that something that should be fixed?

@lhotari lhotari force-pushed the lh-fix-issue-when-using-preciseTopicPublishRateLimiterEnable branch from a9826bd to ac41481 Compare May 17, 2021 08:05
@aloyszhang
Copy link
Contributor

aloyszhang commented May 17, 2021

Is it so that it should be called in both cases, but it's some kind of optimization to skip calling it for the other case? If a possible duplicate call is the issue, is that something that should be fixed?

@lhotari Yes, since the rateLimitFunction is to enable channel autoread, called once has the same effect with called twice.

@lhotari
Copy link
Member Author

lhotari commented May 17, 2021

Is it so that it should be called in both cases, but it's some kind of optimization to skip calling it for the other case? If a possible duplicate call is the issue, is that something that should be fixed?

@lhotari Yes, since the rateLimitFunction is to enable channel autoread, called once has the same effect with called twice.

It seems that the rate limiting should be somehow combined when both limits publishThrottlingRateInMsg and publishThrottlingRateInByte are set. I would assume that it leads to inconsistent behavior when backpressure is released (enabling auto read) only in the topicPublishRateLimiterOnMessage rate limiter.
It feels that the 1 second rate limiting interval would have to be shared between the 2 different rate limiters to fix the inconsistencies when both limits are set. @aloyszhang WDYT?

@lhotari lhotari added this to the 2.8.0 milestone May 17, 2021
@lhotari lhotari added type/bug The PR fixed a bug or issue reported a bug area/broker labels May 17, 2021
@aloyszhang
Copy link
Contributor

aloyszhang commented May 17, 2021

It feels that the 1 second rate limiting interval would have to be shared between the 2 different rate limiters to fix the inconsistencies when both limits are set.

@lhotari Yes, inconsistent between publishThrottlingRateInMsg and publishThrottlingRateInByte may happend.
Totally agree with that publishThrottlingRateInMsg and publishThrottlingRateInByte should share one same rateTime and refresh ticket at the same time.

Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@eolivelli
Copy link
Contributor

@aloyszhang @codelipenghui @merlimat PTAL

@lhotari lhotari requested a review from eolivelli July 23, 2021 13:24
@lhotari
Copy link
Member Author

lhotari commented Jul 23, 2021

@sijie @aloyszhang @codelipenghui I have rebased the changes. Please review this PR and #11442 . There are other PRs #11352 and #11372 that depend on these changes so it would be good to get this PR processed asap. Thanks!

…RateLimiterEnable=true

- disabling the limit didn't work after setting a limit because of a bug in PrecisPublishLimiter
@lhotari lhotari force-pushed the lh-fix-issue-when-using-preciseTopicPublishRateLimiterEnable branch from 9ffaa2d to c8aa9d2 Compare July 29, 2021 18:26
@lhotari
Copy link
Member Author

lhotari commented Jul 30, 2021

@sijie @aloyszhang @codelipenghui Please review. The changes have been rebased.
#11352 and #11372 were merged before this PR. This PR is needed for fixing issues with set-publish-rate when using preciseTopicPublishRateLimiterEnable=true. /cc @ronfarkash @danielsinai

@bharanic-dev
Copy link
Contributor

  • at the time of configuration change, if the connection was in a state where the 'read is disabled' (because of a previous rate limit), should the RateLimiter call autoReadResetFunction() in it's 'close' method? Otherwise, would the connection deadlock and forever remain in 'read disable' state?

@bharanic-dev Thanks for the feedback. I have added the call to the close method.

  • for PIP-82, one rate limiter will be shared by multiple topics. So, when a topic is unloaded, the ratelimiter should not be closed unless it was the last topic that was using it. So, instead of making 'PublishRateLimiter' 'Autoclosable', can you please consider adding a 'detach' method that can be called when the topic is getting unloaded?

@bharanic-dev I think it's better to handle this with composition instead of making a single class handle different use cases. With composition it becomes easy. You can have a wrapper that takes a PublishRateLimiter and handles the life cycle with reference counting. One possible solution is to use the org.apache.bookkeeper.mledger.util.AbstractCASReferenceCounted class as a base class for this kind of wrapper.

Thank you for taking care of this. LGTM.

@sijie sijie merged commit ded806f into apache:master Aug 3, 2021
LeBW pushed a commit to LeBW/pulsar that referenced this pull request Aug 9, 2021
…erEnable=true (apache#10384)

### Motivation

When using `preciseTopicPublishRateLimiterEnable=true` (introduced by apache#7078) setting for rate limiting, there are various issues:
- updating the limits doesn't set either boundary when changing the limits from a bounded limit to unbounded.
- each topic will create a scheduler thread for each limiter instance 
- each topic will never release the scheduler thread when the topic gets unloaded / closed
- updating the limits didn't close the scheduler thread related to the replaced limiter instance

### Modifications

- Fix updating of the limits by cleaning up the previous limiter instances before creating new limiter instances
- Use `brokerService.pulsar().getExecutor()` as the scheduler for the rate limiter instances
- Add resource cleanup hooks for topic closing (unload)

### Open issue 

The existing code has a difference in passing the `rateLimitFunction`:
https://github.com/apache/pulsar/blob/69a173a82c89893f54dbe5b6f422249f66ea5418/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/PrecisPublishLimiter.java#L80-L86
It's passed to the `topicPublishRateLimiterOnMessage`, but not to `topicPublishRateLimiterOnByte` . It is unclear whether this is intentional.
The `rateLimitFunction` is `() -> this.enableCnxAutoRead()`
https://github.com/apache/pulsar/blob/69a173a82c89893f54dbe5b6f422249f66ea5418/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/AbstractTopic.java#L913
(This also raises a question whether rate limiting works consistently when multiple topics share the same connection.)
hangc0276 pushed a commit that referenced this pull request Aug 12, 2021
…erEnable=true (#10384)

### Motivation

When using `preciseTopicPublishRateLimiterEnable=true` (introduced by #7078) setting for rate limiting, there are various issues:
- updating the limits doesn't set either boundary when changing the limits from a bounded limit to unbounded.
- each topic will create a scheduler thread for each limiter instance
- each topic will never release the scheduler thread when the topic gets unloaded / closed
- updating the limits didn't close the scheduler thread related to the replaced limiter instance

### Modifications

- Fix updating of the limits by cleaning up the previous limiter instances before creating new limiter instances
- Use `brokerService.pulsar().getExecutor()` as the scheduler for the rate limiter instances
- Add resource cleanup hooks for topic closing (unload)

### Open issue

The existing code has a difference in passing the `rateLimitFunction`:
https://github.com/apache/pulsar/blob/69a173a82c89893f54dbe5b6f422249f66ea5418/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/PrecisPublishLimiter.java#L80-L86
It's passed to the `topicPublishRateLimiterOnMessage`, but not to `topicPublishRateLimiterOnByte` . It is unclear whether this is intentional.
The `rateLimitFunction` is `() -> this.enableCnxAutoRead()`
https://github.com/apache/pulsar/blob/69a173a82c89893f54dbe5b6f422249f66ea5418/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/AbstractTopic.java#L913
(This also raises a question whether rate limiting works consistently when multiple topics share the same connection.)

(cherry picked from commit ded806f)
@hangc0276 hangc0276 added the cherry-picked/branch-2.8 Archived: 2.8 is end of life label Aug 12, 2021
michaeljmarshall pushed a commit that referenced this pull request Dec 10, 2021
…erEnable=true (#10384)

When using `preciseTopicPublishRateLimiterEnable=true` (introduced by #7078) setting for rate limiting, there are various issues:
- updating the limits doesn't set either boundary when changing the limits from a bounded limit to unbounded.
- each topic will create a scheduler thread for each limiter instance
- each topic will never release the scheduler thread when the topic gets unloaded / closed
- updating the limits didn't close the scheduler thread related to the replaced limiter instance

- Fix updating of the limits by cleaning up the previous limiter instances before creating new limiter instances
- Use `brokerService.pulsar().getExecutor()` as the scheduler for the rate limiter instances
- Add resource cleanup hooks for topic closing (unload)

The existing code has a difference in passing the `rateLimitFunction`:
https://github.com/apache/pulsar/blob/69a173a82c89893f54dbe5b6f422249f66ea5418/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/PrecisPublishLimiter.java#L80-L86
It's passed to the `topicPublishRateLimiterOnMessage`, but not to `topicPublishRateLimiterOnByte` . It is unclear whether this is intentional.
The `rateLimitFunction` is `() -> this.enableCnxAutoRead()`
https://github.com/apache/pulsar/blob/69a173a82c89893f54dbe5b6f422249f66ea5418/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/AbstractTopic.java#L913
(This also raises a question whether rate limiting works consistently when multiple topics share the same connection.)

(cherry picked from commit ded806f)
@michaeljmarshall michaeljmarshall added the cherry-picked/branch-2.7 Archived: 2.7 is end of life label Dec 10, 2021
nicoloboschi pushed a commit to datastax/pulsar that referenced this pull request Jan 25, 2022
…erEnable=true (apache#10384)

When using `preciseTopicPublishRateLimiterEnable=true` (introduced by apache#7078) setting for rate limiting, there are various issues:
- updating the limits doesn't set either boundary when changing the limits from a bounded limit to unbounded.
- each topic will create a scheduler thread for each limiter instance
- each topic will never release the scheduler thread when the topic gets unloaded / closed
- updating the limits didn't close the scheduler thread related to the replaced limiter instance

- Fix updating of the limits by cleaning up the previous limiter instances before creating new limiter instances
- Use `brokerService.pulsar().getExecutor()` as the scheduler for the rate limiter instances
- Add resource cleanup hooks for topic closing (unload)

The existing code has a difference in passing the `rateLimitFunction`:
https://github.com/apache/pulsar/blob/69a173a82c89893f54dbe5b6f422249f66ea5418/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/PrecisPublishLimiter.java#L80-L86
It's passed to the `topicPublishRateLimiterOnMessage`, but not to `topicPublishRateLimiterOnByte` . It is unclear whether this is intentional.
The `rateLimitFunction` is `() -> this.enableCnxAutoRead()`
https://github.com/apache/pulsar/blob/69a173a82c89893f54dbe5b6f422249f66ea5418/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/AbstractTopic.java#L913
(This also raises a question whether rate limiting works consistently when multiple topics share the same connection.)

(cherry picked from commit ded806f)
(cherry picked from commit 41ad624)
bharanic-dev pushed a commit to bharanic-dev/pulsar that referenced this pull request Mar 18, 2022
…erEnable=true (apache#10384)

### Motivation

When using `preciseTopicPublishRateLimiterEnable=true` (introduced by apache#7078) setting for rate limiting, there are various issues:
- updating the limits doesn't set either boundary when changing the limits from a bounded limit to unbounded.
- each topic will create a scheduler thread for each limiter instance 
- each topic will never release the scheduler thread when the topic gets unloaded / closed
- updating the limits didn't close the scheduler thread related to the replaced limiter instance

### Modifications

- Fix updating of the limits by cleaning up the previous limiter instances before creating new limiter instances
- Use `brokerService.pulsar().getExecutor()` as the scheduler for the rate limiter instances
- Add resource cleanup hooks for topic closing (unload)

### Open issue 

The existing code has a difference in passing the `rateLimitFunction`:
https://github.com/apache/pulsar/blob/69a173a82c89893f54dbe5b6f422249f66ea5418/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/PrecisPublishLimiter.java#L80-L86
It's passed to the `topicPublishRateLimiterOnMessage`, but not to `topicPublishRateLimiterOnByte` . It is unclear whether this is intentional.
The `rateLimitFunction` is `() -> this.enableCnxAutoRead()`
https://github.com/apache/pulsar/blob/69a173a82c89893f54dbe5b6f422249f66ea5418/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/AbstractTopic.java#L913
(This also raises a question whether rate limiting works consistently when multiple topics share the same connection.)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/broker cherry-picked/branch-2.7 Archived: 2.7 is end of life cherry-picked/branch-2.8 Archived: 2.8 is end of life release/2.7.4 release/2.8.1 type/bug The PR fixed a bug or issue reported a bug
Projects
None yet
10 participants