Setting unlimited maxCapacity, int vs uint #55

Open
Kaszanas opened this issue Mar 24, 2024 · 2 comments

Comments

@Kaszanas

Background

I have a system where I will be using pond for network-related download tasks.
To avoid hitting rate limiters when issuing many network requests, it would be great to be able to set an unlimited task capacity.

Imagine a scenario where some downloads have already finished and the data is readily accessible, so there is no need to issue another download.

In that case, a general-purpose worker pool can have as many goroutines as there are CPU threads. But you wouldn't want that for a worker pool that performs downloads, because you'd potentially have 24 to 56 threads attempting downloads at once.

Question

Is there a way to have an unlimited task queue with just 3-5 workers downloading at the same time? I was not able to find such an example in the documentation or the code.

Would this mean that it should be set to the maximum integer capacity?

const MaxUint = ^uint(0)         // all bits set: the largest uint value
const MaxInt = int(MaxUint >> 1) // the largest int value

Int vs Uint?

Additionally, given that these values cannot be lower than zero:

	// Make sure options are consistent
	if pool.maxWorkers <= 0 {
		pool.maxWorkers = 1
	}
	if pool.minWorkers > pool.maxWorkers {
		pool.minWorkers = pool.maxWorkers
	}
	if pool.maxCapacity < 0 {
		pool.maxCapacity = 0
	}
	if pool.idleTimeout < 0 {
		pool.idleTimeout = defaultIdleTimeout
	}

Wouldn't it be better to have these types be uint?

@alitto
Owner

alitto commented Mar 24, 2024

Hey @Kaszanas!

Unlimited task queues

This library doesn't currently support unlimited task queues. This was an intentional design decision for the sake of simplicity and efficiency. The task queue is currently implemented as a fixed-size buffered channel, which takes a constant amount of memory and minimizes the number of memory allocations when inserting or removing tasks.
I guess there could be situations where it would make sense to have an unbounded task queue, but I see a few potential issues with that approach:

  • Available memory is always finite, so any unlimited data structure will eventually eat up all available memory if it grows indefinitely. This can quickly cause an out-of-memory error in a Go program if there is a big spike of queued tasks and workers are too slow to process them.
  • When using an unlimited data structure, you lack a mechanism to throttle writers (e.g. goroutines trying to queue a task) when backpressure increases. Go's buffered channel, on the other hand, makes writer goroutines wait while the buffer is full. Although this may slow down a particular task, it is really helpful because it avoids out-of-memory errors and ensures running worker goroutines are not affected. Keep in mind that a high number of goroutines queueing tasks may steal CPU cycles from worker goroutines, which slows the processing rate further (a vicious cycle).
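As a rough illustration of the two points above, here is a minimal standard-library sketch of a small, fixed set of workers draining a bounded buffered channel. It is not pond's implementation; `runBounded` and its parameters are hypothetical names chosen for this example. The producer's send blocks whenever the buffer is full, which is the throttling behaviour described above.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// runBounded pushes n no-op tasks through a buffered channel of the given
// capacity, drained by a fixed number of workers. It returns how many tasks
// completed and the highest number of tasks observed running at once.
// Hypothetical helper for illustration only; not part of pond.
func runBounded(workers, capacity, n int) (int64, int64) {
	tasks := make(chan func(), capacity)
	var wg sync.WaitGroup
	var running, maxRunning, completed int64

	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for task := range tasks {
				cur := atomic.AddInt64(&running, 1)
				// Record the peak concurrency seen so far.
				for {
					old := atomic.LoadInt64(&maxRunning)
					if cur <= old || atomic.CompareAndSwapInt64(&maxRunning, old, cur) {
						break
					}
				}
				task()
				atomic.AddInt64(&running, -1)
				atomic.AddInt64(&completed, 1)
			}
		}()
	}

	for i := 0; i < n; i++ {
		// This send blocks while the buffer is full, throttling the producer.
		tasks <- func() {}
	}
	close(tasks)
	wg.Wait()
	return atomic.LoadInt64(&completed), atomic.LoadInt64(&maxRunning)
}

func main() {
	done, peak := runBounded(4, 8, 1000)
	fmt.Println("completed:", done, "peak within worker limit:", peak <= 4)
}
```

With this shape, the number of concurrent downloads is capped by the worker count, while the channel capacity only controls how much work may wait in line before producers are slowed down.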

That said, I would encourage you to run some stress/load tests to determine a reasonable value for the maxCapacity parameter (the size of the buffered channel). I wouldn't recommend using math.MaxInt, since that will cause the process to try to claim all available RAM and crash immediately 😅.
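To see why a very large capacity is a problem, the sketch below tries to allocate a buffered channel with a given capacity and turns any runtime panic into an error. `tryMakeQueue` is a hypothetical helper, not part of pond. As far as I know, the Go runtime rejects a capacity whose buffer size overflows with a panic instead of reserving the memory, but the exact failure mode may depend on the Go version and platform.

```go
package main

import (
	"fmt"
	"math"
)

// tryMakeQueue attempts to allocate a buffered task channel with the given
// capacity and converts a runtime panic into an error.
// Hypothetical helper for illustration only.
func tryMakeQueue(capacity int) (queue chan func(), err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("cannot allocate queue of capacity %d: %v", capacity, r)
		}
	}()
	return make(chan func(), capacity), nil
}

func main() {
	if q, err := tryMakeQueue(1024); err == nil {
		fmt.Println("allocated queue with capacity", cap(q))
	}
	if _, err := tryMakeQueue(math.MaxInt); err != nil {
		fmt.Println(err)
	}
}
```

The buffer is allocated up front when the channel is created, so the capacity is paid for in memory whether or not the queue ever fills.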

Int vs Uint?

You are right, these options could have been uint instead of int. To be honest, I don't recall why I picked int; maybe it had to do with minimizing the number of integer conversions. It's probably too late to change this though, since that could break existing clients of this library 🙃
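One practical difference between the two types, shown as a small sketch (not necessarily the reason int was chosen here): with int, a negative value passed by a caller can be detected and clamped, as in the consistency checks quoted earlier in this thread. Once a negative int is converted to uint, it wraps around to a very large value that no `< 0` check can catch. `clampInt` and `toUint` are hypothetical names.

```go
package main

import "fmt"

// clampInt mirrors the style of check quoted earlier in the thread:
// a negative capacity is detectable and reset to 0.
func clampInt(capacity int) int {
	if capacity < 0 {
		return 0
	}
	return capacity
}

// toUint shows what happens to a caller's int after conversion to uint:
// negative values wrap around instead of staying negative.
func toUint(capacity int) uint {
	return uint(capacity)
}

func main() {
	requested := -1
	fmt.Println("int, clamped:", clampInt(requested))
	fmt.Println("uint, wrapped to a huge value:", toUint(requested) > 1<<30)
}
```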

@Kaszanas
Author

Thanks for the swift reply, @alitto. Yeah, I only wanted an unlimited task queue because it can fill up faster than tasks are processed. In that case, it can be hard to foresee what its size should be.
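When the right size is hard to predict, one option is to let the producer find out that the queue is full instead of blocking, and then decide what to do (retry later, skip, or log). Below is a minimal standard-library sketch using a non-blocking send; `trySubmit` is a hypothetical helper written for this example, not a description of pond's API.

```go
package main

import "fmt"

// trySubmit attempts to enqueue a task without blocking. It reports false
// when the buffer is full, so the caller can back off or drop the task.
// Hypothetical helper for illustration only.
func trySubmit(queue chan<- func(), task func()) bool {
	select {
	case queue <- task:
		return true
	default:
		return false
	}
}

func main() {
	// No workers drain this queue, so it fills after two submissions.
	queue := make(chan func(), 2)
	accepted := 0
	for i := 0; i < 5; i++ {
		if trySubmit(queue, func() {}) {
			accepted++
		}
	}
	fmt.Println("accepted:", accepted, "queued:", len(queue))
}
```

Counting rejected submissions during a load test is also a cheap way to see whether a candidate capacity is too small for the observed burst size.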
