Redistribute RDY among NSQD connections #380

Open
heipei opened this issue Mar 1, 2022 · 3 comments

Comments

@heipei

heipei commented Mar 1, 2022

I think it would be great to have an option for nsqjs that could redistribute the RDY count between multiple connected nsqds based on their channel depths. I frequently run into the problem that I need to set a hard "global" max-in-flight value, but I'm running multiple nsqds and sometimes just one of them will be bursting while the others don't get any messages.

The same thing was discussed in go-nsq, unfortunately without an implementation yet. Just food for thought!

nsqio/go-nsq#179
nsqio/go-nsq#277

@dudleycarr
Owner

RDY management as pointed out is tricky but doable. I agree that it's not entirely satisfactory and it would be nice to provide some options on how that's handled between connections.

Considering channel depths would be one heuristic for allocating RDY counts. The biggest issue is that the nsq protocol gives no information about channel depths -- that would have to be requested over HTTP from each nsqd. Not exactly ideal.
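
To make the HTTP route concrete, here is a minimal sketch of pulling one channel's depth out of the JSON that nsqd's `/stats?format=json` endpoint returns. The interface below lists only the fields used and reflects an assumption about the response shape (`topics[].channels[].depth`); it should be checked against the nsqd version in use. `channelDepth` is a hypothetical helper, not part of nsqjs.

```typescript
// Assumed (partial) shape of nsqd's `/stats?format=json` response.
// Only the fields this sketch reads are declared.
interface ChannelStats { channel_name: string; depth: number }
interface TopicStats { topic_name: string; channels: ChannelStats[] }
interface NsqdStats { topics: TopicStats[] }

// Hypothetical helper: depth of one channel on one nsqd, or 0 when the
// topic or channel does not exist on that nsqd.
function channelDepth(stats: NsqdStats, topic: string, channel: string): number {
  for (const t of stats.topics) {
    if (t.topic_name !== topic) continue
    for (const c of t.channels) {
      if (c.channel_name === channel) return c.depth
    }
  }
  return 0
}
```

The client would have to poll this endpoint for every connected nsqd on a timer, which is the extra moving part that makes the depth heuristic less than ideal.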

An alternative consideration would be to allocate RDY count based on the rate of messages coming from nsqds. This would allow reallocating RDY count to a busy channel provided the other nsqds are essentially idle. I assume this would cover your situation?
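
As a rough illustration of the rate-based idea, the sketch below splits a global max-in-flight across connections in proportion to how many messages each nsqd delivered in the most recent sampling window. The function name, the windowed counts, and the largest-remainder rounding are all assumptions made for this example; nsqjs does not expose such a function.

```typescript
// Hypothetical helper: distribute maxInFlight across connections in
// proportion to recent message counts (non-negative integers, one per
// connection). Uses largest-remainder rounding so the result always
// sums to maxInFlight.
function allocateByRate(recentCounts: number[], maxInFlight: number): number[] {
  const n = recentCounts.length
  if (n === 0) return []

  // With no traffic anywhere, fall back to an even split.
  const seen = recentCounts.reduce((a, b) => a + b, 0)
  const counts = seen > 0 ? recentCounts : recentCounts.map(() => 1)
  const total = seen > 0 ? seen : n

  // Integer quota and remainder per connection (exact, no float rounding).
  const shares = counts.map((c, i) => {
    const rem = (c * maxInFlight) % total
    return { i, quota: (c * maxInFlight - rem) / total, rem }
  })

  const rdy = shares.map((s) => s.quota)
  const leftover = maxInFlight - rdy.reduce((a, b) => a + b, 0)

  // Hand the leftover units to the largest remainders, lowest index first on ties.
  const order = shares.slice().sort((a, b) => b.rem - a.rem || a.i - b.i)
  for (let k = 0; k < leftover; k++) rdy[order[k % n].i] += 1
  return rdy
}
```

Note that a completely idle nsqd ends up with a RDY count of 0 under this scheme, so the client would never notice when that nsqd starts receiving messages again unless it is probed some other way.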

I'm open to making changes to allow different RDY strategies.

@heipei
Author

heipei commented Mar 1, 2022

Thanks for the prompt reply, and thanks for explaining the intricacies involved. I wasn't aware that the nsq protocol doesn't expose channel depth. I wanted to bring this up, but I won't be able to implement it unfortunately.

The approach of measuring the rate of messages seems like a reasonable compromise, but I could also see it being problematic. If all RDY counts are assigned to a single nsqd, you still have to go around the other "idle" nsqds frequently and set their RDY count to at least 1. I'm sure there are some scenarios where this will have unexpected / oscillating behaviour, but I can't really articulate those right now.

@dudleycarr
Owner

If max_in_flight is less than the number of nsqd instances, then you have the situation where the client has to switch between nsqd connections. Usually, this should be considered a bad configuration/architecture.

While thinking through alternate strategies for redistributing RDY across nsqd instances, it occurred to me that "idle" connections could be treated specially: no matter what, an "idle" connection would always have a RDY count of 1. If that connection receives a message, the client could requeue the message and then rebalance the RDY counts across the non-idle nsqd connections. Effectively, this allows the client to peek for messages.

The benefit is that in both the "bad configuration" described above as well as other alternate RDY count allocation strategies, the client wouldn't have to burn RDY counts on idle connections or have to deal with added latency for connections temporarily set with RDY to 0.

There are a couple of downsides:

  • requeue would increment the number of attempts on the message. This is problematic for clients who care about the number of attempts.
  • requeue also places the message at the back of the channel queue, as I understand the nsqd implementation. Ordering is not guaranteed anyway, but this still seems undesirable.
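
For illustration, the "peek" strategy could be sketched roughly as follows. Everything here is hypothetical (the `Conn` shape, `rebalance`, `onMessage`), and it assumes the probe RDY of 1 on idle connections is not counted against max-in-flight, since probe messages are requeued instead of being processed.

```typescript
// Hypothetical connection record for this sketch; not an nsqjs type.
interface Conn { id: string; idle: boolean; rdy: number }

// Idle connections keep a probe RDY of 1; active connections share
// maxInFlight as evenly as possible.
function rebalance(conns: Conn[], maxInFlight: number): void {
  const active = conns.filter((c) => !c.idle).length
  const base = active > 0 ? Math.floor(maxInFlight / active) : 0
  let extra = active > 0 ? maxInFlight % active : 0
  for (const c of conns) {
    if (c.idle) {
      c.rdy = 1
      continue
    }
    c.rdy = base + (extra > 0 ? 1 : 0)
    if (extra > 0) extra--
  }
}

// Decide what to do with a message that arrived on `conn`. A message on
// an idle connection is a probe hit: mark the connection active,
// rebalance, and tell the caller to requeue the message.
function onMessage(conns: Conn[], conn: Conn, maxInFlight: number): 'requeue' | 'process' {
  if (!conn.idle) return 'process'
  conn.idle = false
  rebalance(conns, maxInFlight)
  return 'requeue'
}
```

The sketch leaves out how a connection becomes idle again (for example, after receiving no messages for some interval), and the requeued probe message still carries both downsides listed above.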

@mreiferson Thoughts?
