[improve]Improve batching message doc #878

Open
wants to merge 2 commits into
base: main

Conversation

Member

@dao-jun dao-jun commented Apr 5, 2024

Improve the batching message doc

Fixes: apache/pulsar#22439

Member Author

dao-jun commented Apr 5, 2024

image

@dao-jun dao-jun self-assigned this Apr 5, 2024
@dao-jun dao-jun marked this pull request as draft April 5, 2024 13:18
@dao-jun dao-jun marked this pull request as ready for review April 5, 2024 13:29
@dao-jun dao-jun requested a review from lhotari April 5, 2024 13:29
@@ -427,6 +427,13 @@ Consumer<byte[]> consumer = pulsarClient.newConsumer()
.subscribe();
```

:::note

Send messages by synchronous API `send` will disable batching, and the message will be sent individually.
Member

This comment isn't accurate. The batching isn't disabled. The current batch is triggered immediately after sending the message. https://github.com/apache/pulsar/blob/ffff639a1b73a34bbb5115503d4c7783bb2a2770/pulsar-client/src/main/java/org/apache/pulsar/client/impl/TypedMessageBuilderImpl.java#L82-L86

Member Author

Yes, I know. My statement just wasn't accurate.

How about changing it to:

Sending a message with the synchronous `send` API triggers the current batch to be sent immediately, even if the batch is not full.
This is done to reduce the latency of sending messages and to avoid blocking the caller's thread.

:::note

Send messages by synchronous API `send` will disable batching, and the message will be sent individually.
It is for the purpose of reducing the latency of sending messages and preventing blocking of the producer thread.
Member

This is not accurate. "Producer thread" is vague in this case: is it an internal thread, or which thread does it refer to? I guess the well-known concept is "caller's thread" or "calling thread". However, the explanation would have to be better.

One explanation is simply that before the synchronous send returns, the batch has to be sent and the broker has to return a message ID for the sent message. In most use cases, no more messages could be added to the same batch since the caller's thread is blocked, so the decision has been made to simply trigger immediate sending of the message when the synchronous API is used.
I'm not exactly sure how to put this in the docs.
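
One way to picture the behavior described above is a toy in-memory model. This is only a sketch under stated assumptions: `ToyBatchingProducer` and its methods are hypothetical names chosen for illustration, the "broker" is simulated by a counter, and none of this is the Pulsar client's actual code. It only shows the idea that a synchronous send enqueues the message, flushes the current batch even if it is not full, and then waits for the message ID.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

/** Toy in-memory model of a batching producer. Hypothetical; NOT the Pulsar client API. */
class ToyBatchingProducer {
    private final int maxMessagesPerBatch;
    private final List<CompletableFuture<Long>> pending = new ArrayList<>();
    private final List<Integer> flushedBatchSizes = new ArrayList<>();
    private long nextId = 0; // stands in for message IDs assigned by a broker

    ToyBatchingProducer(int maxMessagesPerBatch) {
        this.maxMessagesPerBatch = maxMessagesPerBatch;
    }

    /** Adds the message to the current batch; the batch is flushed only once it is full. */
    synchronized CompletableFuture<Long> sendAsync(String payload) {
        CompletableFuture<Long> future = new CompletableFuture<>();
        pending.add(future); // the payload itself is ignored in this toy model
        if (pending.size() >= maxMessagesPerBatch) {
            triggerFlush();
        }
        return future;
    }

    /** Sends the current batch, full or not, and completes every waiting future. */
    synchronized void triggerFlush() {
        if (pending.isEmpty()) {
            return;
        }
        flushedBatchSizes.add(pending.size());
        for (CompletableFuture<Long> future : pending) {
            future.complete(nextId++);
        }
        pending.clear();
    }

    /** Synchronous send: enqueue, flush the current batch right away, then wait for the ID. */
    long send(String payload) {
        CompletableFuture<Long> future = sendAsync(payload);
        if (!future.isDone()) {
            // The caller is about to block, so nothing else from this caller can join the batch.
            triggerFlush();
        }
        return future.join();
    }

    synchronized List<Integer> flushedBatchSizes() {
        return new ArrayList<>(flushedBatchSizes);
    }

    public static void main(String[] args) {
        ToyBatchingProducer producer = new ToyBatchingProducer(10);
        producer.sendAsync("a"); // waits in the batch
        producer.sendAsync("b"); // waits in the batch
        producer.send("c");      // flushes a batch of 3
        producer.send("d");      // flushes a batch of 1
        System.out.println(producer.flushedBatchSizes()); // prints [3, 1]
    }
}
```

In this model batching is never switched off: messages queued earlier with `sendAsync` still go out together with the synchronously sent one. A caller that only ever uses `send` from a single thread ends up with batches of one message, which may be what the original wording meant by "sent individually".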

Member

lhotari commented Apr 5, 2024

Fixes: apache/pulsar#22439

I don't think that the documentation alone is a sufficient resolution. The Javadoc of the Pulsar Java client needs to be updated too.
