[issue #13351] Solving precise rate limiting does not takes effect #11446

Merged
merged 7 commits into from Jul 28, 2021

danielsinai
Contributor

Master Issue: #11351

Motivation

This PR fixes problems 1 and 2 described in #11351.

I previously published a bigger pull request, #11352, but closed it because it was too big and lacked an explanation of what it actually solves.

Reproduce

  • Allow precise rate limiting

  • Create a topic

  • Limit publish rate of messages per second to 10

  • Using the producer perf tool, write 100 messages per second

    Results after this PR:

image

Before this PR, precise publish rate limiting wasn't taking effect at all.

Modifications

To solve the current problems, there are two modifications:

  1. Use IsDispatchRateLimiting in the precise publish rate limiter as well (in order to starve the producer)
  2. Check whether there are available permits before resuming reads from the connection
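
The two modifications can be illustrated with a minimal sketch. This is not Pulsar's RateLimiter: the class PermitWindowSketch, its fields, and the resumeReading callback (standing in for rateLimitFunction) are hypothetical names, and carrying overdraft into the next window is an assumption about what "starving the producer" means here.

```java
/**
 * Hypothetical, simplified limiter used only to illustrate the two modifications.
 * It is not Pulsar's RateLimiter; all names here are assumptions.
 */
final class PermitWindowSketch {
    private final long permitsPerWindow;
    private final Runnable resumeReading; // stands in for rateLimitFunction
    private long acquiredPermits;

    PermitWindowSketch(long permitsPerWindow, Runnable resumeReading) {
        this.permitsPerWindow = permitsPerWindow;
        this.resumeReading = resumeReading;
    }

    /** Always records the usage (overdraft allowed); false means "throttle the producer". */
    synchronized boolean tryAcquire(long permits) {
        acquiredPermits += permits;
        return acquiredPermits < permitsPerWindow;
    }

    synchronized long getAvailablePermits() {
        return Math.max(0, permitsPerWindow - acquiredPermits);
    }

    /** Runs once per window: pays off one window of usage, then resumes only if permits remain. */
    synchronized void renew() {
        acquiredPermits = Math.max(0, acquiredPermits - permitsPerWindow);
        if (resumeReading != null && getAvailablePermits() > 0) {
            resumeReading.run();
        }
    }

    public static void main(String[] args) {
        int[] resumed = {0};
        PermitWindowSketch limiter = new PermitWindowSketch(10, () -> resumed[0]++);
        System.out.println(limiter.tryAcquire(25)); // false: 25 permits used against a limit of 10
        limiter.renew();                            // 15 still owed, reads stay paused
        System.out.println(resumed[0]);             // 0
        limiter.renew();                            // 5 owed, 5 permits free, reads resume
        System.out.println(resumed[0]);             // 1
    }
}
```

In this sketch, a producer that overshoots the limit stays throttled until enough windows have passed to pay off the overdraft, rather than being released on every renew.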

Verifying this change

Already covered by current tests.

Does this pull request potentially affect one of the following parts:

  • Dependencies: no
  • The public API: no
  • The schema: no
  • The default values of configurations: no
  • The wire protocol: no
  • The rest endpoints: no
  • The admin cli options: no
  • Anything that affects deployment: no

Documentation

For contributor

For this PR, do we need to update docs? Probably not, it is just fixing the current implementation

 if (permitUpdater != null) {
     long newPermitRate = permitUpdater.get();
     if (newPermitRate > 0) {
         setRate(newPermitRate);
     }
 }
-if (rateLimitFunction != null) {
+if (rateLimitFunction != null && this.getAvailablePermits() > 0) {
Member

This seems to make sense. Please add some comments here explaining the logic.

Member

@lhotari lhotari Jul 23, 2021

I'd assume that the condition would need to handle the case where isDispatchOrPrecisePublishRateLimiter is false.
It seems that rateLimitFunction.apply would never be called in that case.
Did you think about that?

Contributor Author

Yeah, I'll add a comment.

Actually, I hadn't given it much thought, but thinking about it now: rateLimitFunction is a callback that gives the outer scope access to the renew function, so I don't think we should use this property without knowing exactly what is expected to happen.

I believe checking whether there are available permits is the reasonable condition here, because we want to give the outer scope a way to run something whenever permits are available, and that doesn't really depend on the class property.

If it is set to false, we can assume the user of the rateLimiter wants the state to be reset every time window; otherwise they probably want to back-pressure something.

Contributor Author

Done

Member

@lhotari lhotari left a comment

Good work @danielsinai ! Thanks for the contribution. LGTM.

@danielsinai
Contributor Author

The failed test actually found a bug: the first tryAcquire always passes because the condition is evaluated before acquiredPermits changes, which lets the producer write 2x the limit on the first try.

In addition, some tests should have failed, like the test that fills all the permits on the first try. It currently asserts true, but it should assert false so that the producer is throttled the first time the maximum permits are reached.

I'll publish a fix soon.
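
To make the ordering problem concrete, here is a small sketch that contrasts checking the condition before and after the permits are counted. The class AcquireOrderSketch and both method names are hypothetical and exist only for illustration; they are not the code in this PR.

```java
/**
 * Hypothetical illustration of the ordering bug described above.
 * Not Pulsar code; names are assumptions.
 */
final class AcquireOrderSketch {
    private final long limit;
    private long acquired;

    AcquireOrderSketch(long limit) {
        this.limit = limit;
    }

    /** Buggy order: the decision is made from the state before this call is counted. */
    boolean tryAcquireCheckFirst(long permits) {
        boolean canAcquire = acquired < limit;
        acquired += permits;
        return canAcquire;
    }

    /** Fixed order: the decision is made after this call is counted. */
    boolean tryAcquireCheckAfter(long permits) {
        acquired += permits;
        return acquired < limit;
    }

    long acquired() {
        return acquired;
    }

    public static void main(String[] args) {
        AcquireOrderSketch buggy = new AcquireOrderSketch(10);
        // The first call fills the whole window yet reports "keep going",
        // so the producer sends a second batch before being throttled.
        System.out.println(buggy.tryAcquireCheckFirst(10)); // true
        System.out.println(buggy.tryAcquireCheckFirst(10)); // false
        System.out.println(buggy.acquired());               // 20

        AcquireOrderSketch fixed = new AcquireOrderSketch(10);
        // The call that fills the window already reports "throttle".
        System.out.println(fixed.tryAcquireCheckAfter(10)); // false
        System.out.println(fixed.acquired());               // 10
    }
}
```

With the check-first order, 20 permits are consumed against a limit of 10 before the first false is returned, which matches the 2x overshoot described above.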

@@ -104,9 +104,13 @@ public void testPrecisePublishRateLimiterAcquire() throws Exception {

     // tryAcquire msgSizeInBytes exceeded
     assertFalse(precisPublishLimiter.tryAcquire(10, 101));
-    Thread.sleep(1100);
+    Thread.sleep(2100);
Contributor

This test is potentially flaky: we cannot guarantee that the renew task will have been scheduled within 2100 ms. Controlling renew manually would be better.

Contributor Author

I can change it; I didn't want to touch the current implementation too much.

But will do 👍🏾

Contributor Author

Would you like me to make the rateLimiters inside precisPublishLimiter public so that the test can call the renew function?

I'm afraid that would encourage others to access the rateLimiters' state, which goes against the OOP approach of keeping mutable state non-shared.

Contributor

We can use reflection.
Stop the scheduled task first
private ScheduledFuture<?> renewTask;
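
A rough sketch of that suggestion, under stated assumptions: TimedLimiter below is a hypothetical stand-in with a private renewTask field and a private renew() method, and the helpers stopRenewTask and renewNow are illustrative names, not existing test utilities. Only standard java.lang.reflect and java.util.concurrent calls are used.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

final class ReflectiveRenewSketch {

    /** Hypothetical limiter whose renew task and renew() method are private. */
    static final class TimedLimiter implements AutoCloseable {
        private final ScheduledExecutorService executor =
                Executors.newSingleThreadScheduledExecutor(r -> {
                    Thread t = new Thread(r, "renew-sketch");
                    t.setDaemon(true); // do not keep the JVM alive in tests
                    return t;
                });
        private final long limit;
        private long acquired;
        private ScheduledFuture<?> renewTask;

        TimedLimiter(long limit) {
            this.limit = limit;
            this.renewTask = executor.scheduleAtFixedRate(this::renew, 1, 1, TimeUnit.SECONDS);
        }

        synchronized boolean tryAcquire(long permits) {
            acquired += permits;
            return acquired < limit;
        }

        private synchronized void renew() {
            acquired = Math.max(0, acquired - limit);
        }

        @Override
        public void close() {
            executor.shutdownNow();
        }
    }

    /** Cancels the private scheduled task so the test decides when a window ends. */
    static void stopRenewTask(Object limiter) {
        try {
            Field field = limiter.getClass().getDeclaredField("renewTask");
            field.setAccessible(true);
            ScheduledFuture<?> task = (ScheduledFuture<?>) field.get(limiter);
            if (task != null) {
                task.cancel(false);
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Invokes the private renew() method directly, replacing Thread.sleep in the test. */
    static void renewNow(Object limiter) {
        try {
            Method renew = limiter.getClass().getDeclaredMethod("renew");
            renew.setAccessible(true);
            renew.invoke(limiter);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A test would then call stopRenewTask once, assert on tryAcquire, call renewNow to end the window, and assert again, with no dependency on timer scheduling.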

Contributor Author

Oh, sounds good. Will do, thanks for clarifying.

Contributor Author

Done 👍🏾

 if (permitUpdater != null) {
     long newPermitRate = permitUpdater.get();
     if (newPermitRate > 0) {
         setRate(newPermitRate);
     }
 }
-if (rateLimitFunction != null) {
+// release the back-pressure by applying the rateLimitFunction only when there are available permits
+if (rateLimitFunction != null && this.getAvailablePermits() > 0) {
Contributor

I'm not sure this is necessary.
It changes existing behavior.

Contributor Author

It is necessary. Without it, every renew call would invoke the function and release the throttle.

Not having it defeats the whole purpose of throttling a connection.

Contributor Author

Is that still relevant?
Also, could you review the test refactor, please?

@danielsinai
Contributor Author

/pulsarbot run-failure-checks

Contributor

@315157973 315157973 left a comment

Could you please move the code from setup into the unit test so that it does not affect other unit tests?

@danielsinai
Contributor Author

@315157973 yes ofc done 👍🏾

@315157973
Contributor

Thanks. Let's wait for CI to pass.

@danielsinai
Contributor Author

@315157973 Seems like everything passed 👌😄

@sijie sijie added this to the 2.9.0 milestone Jul 28, 2021
@sijie sijie added release/2.8.1 type/bug The PR fixed a bug or issue reported a bug labels Jul 28, 2021
@sijie sijie merged commit 00ad07d into apache:master Jul 28, 2021
codelipenghui pushed a commit that referenced this pull request Jul 30, 2021
…11446)

![image](https://user-images.githubusercontent.com/51213812/126812923-91bb827c-246d-451d-8f25-343bb2c1dca0.png)

(cherry picked from commit 00ad07d)
@codelipenghui codelipenghui added the cherry-picked/branch-2.8 Archived: 2.8 is end of life label Jul 30, 2021
michaeljmarshall pushed a commit that referenced this pull request Dec 10, 2021
…11446)

(cherry picked from commit 00ad07d)
nicoloboschi pushed a commit to datastax/pulsar that referenced this pull request Jan 26, 2022
…ect (apache#11446)

(cherry picked from commit 00ad07d)
(cherry picked from commit 06c6adf)
bharanic-dev pushed a commit to bharanic-dev/pulsar that referenced this pull request Mar 18, 2022
…ect (apache#11446)

Labels
area/broker cherry-picked/branch-2.7 Archived: 2.7 is end of life cherry-picked/branch-2.8 Archived: 2.8 is end of life release/2.7.4 release/2.8.1 type/bug The PR fixed a bug or issue reported a bug

6 participants