Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add optional name parameter to .limitedParallelism #4106

Merged
merged 2 commits into from Apr 29, 2024

Conversation

qwwdfsad
Copy link
Member

Fixes #4023

@qwwdfsad
Copy link
Member Author

Conceptually, the PR is ready.

Practically, I would like to polish and figure some binary compatibility details before merging it and ensure it is a reasonable thing to do in a non-major release

assertEquals("filesDispatcher", Dispatchers.IO.limitedParallelism(1, "filesDispatcher").toString())
assertEquals("json", Dispatchers.Default.limitedParallelism(2, "json").toString())
// Not overridden at all, limited parallelism returns `this`
assertEquals("DefaultExecutor", (DefaultDelay as CoroutineDispatcher).limitedParallelism(42, "ignored").toString())
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the reasons to take the non-orthogonal path here? I think it's much more intuitive to have names that are unconditionally applied.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My line of thinking is the following:

  • The implementation allows returning this which is a nice property to have -- you don't have to care about the underlying/given pool capacity if not necessary
    • Also, mistakes are much more forgiving. E.g. now for cases like that nothing really happens, with the name forcing to return something else it's suddenly more subtle
  • I don't really like the idea of being able to give dispatchers names when no new dispatcher is effectively created -- the probability of accidental aliasing just increases with not that many benefits
  • Returning LimitedDispatcher for the sole purpose of naming is a suboptimal idea -- it slightly changes the observable behaviour and introduces some overhead. It kind of forces us to introduce one more implementation (NamedDispatcher?) and maintain it

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The implementation allows returning this which is a nice property to have

This can be preserved in the case of parallelism >= dispatcher.parallelism && (newName == null || newName == dispatcher.name).

I don't really like the idea of being able to give dispatchers names when no new dispatcher is effectively created -- the probability of accidental aliasing just increases with not that many benefits

Two bureaucratic benefits come to mind immediately:

  • Principle of least surprise: specifically, if the user did provide a name for some reason, it's surprising to see it ignored.
  • Principle of orthogonality: conceptually, now, the two arguments to limitedParallelism are linked together, while they could not be that.

The second concern is practical, I think. Let's look at the issue statement:

We have specific dispatchers, and in coroutine dumps they all look like LimitedDispatcher@xxxxxxxx, and it's not immediately clear where the dispatcher comes from.

I see two possible cases:

  1. "Dispatchers.Default.limitedParallelism(4)" is already good enough. Then why would we add a new name parameter at all?
  2. "Dispatchers.Default.limitedParallelism(4)" is not good enough, and something like IndexBuilding is needed instead. In this case, why does the name revert to the not-good-enough "Dispatchers.Default.limitedParallelism(4)" option when there are few CPU cores?

As to the accidental aliasing issue: what dangers do you see in it?

Returning LimitedDispatcher for the sole purpose of naming is a suboptimal idea -- it slightly changes the observable behaviour and introduces some overhead. It kind of forces us to introduce one more implementation (NamedDispatcher?) and maintain it

That's true. Why is this an issue, though? Is an extra call stack frame per dispatch a significant problem?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: discussed it internally and the rationale will be reflected in the public design doc

Base automatically changed from limited-parallelism-improvements to develop April 26, 2024 10:24
Copy link
Collaborator

@dkhalanskyjb dkhalanskyjb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LimitedDispatcher having its own name instead of always delegating this behavior to a NamedDispatcher is a nice idea.

There's still potential for tiny optimizations: for example, we could also override limitedParallelism on NamedDispatcher to allow in-depth inspection of dispatcher (throwing away the NamedDispatcher wrapper if it's no longer needed), or we could join nested NamedDispatchers into one, etc.

I'm not sure, though, that this would be useful to anyone, and certainly not to a point where this would block adding this useful change.

if (parallelism >= this.parallelism) return this
return super.limitedParallelism(parallelism)
if (parallelism >= this.parallelism) return namedOrThis(name)
return super.limitedParallelism(parallelism, name)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Minor) Since checkParallelism was already called, we could avoid calling super here, instead directly constructing a LimitedDispatcher.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In things like that, I prefer leaving the double-check, but ensuring that the code is reused and has less entry point

@qwwdfsad qwwdfsad merged commit dfbd4a8 into develop Apr 29, 2024
1 check passed
@qwwdfsad qwwdfsad deleted the limited-parallelism-name branch April 29, 2024 15:27
Corje pushed a commit that referenced this pull request May 10, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants