Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add scheduling capabilities to EnhancedQueueExecutor #111

Merged
merged 1 commit into from Jun 29, 2022

Conversation

dmlloyd
Copy link
Contributor

@dmlloyd dmlloyd commented Jun 28, 2022

This is an initial prototype with the following features:

  • Enhance EnhancedQueueExecutor to implement ScheduledExecutorService directly
  • Use a dedicated scheduling thread
  • The scheduled work queue is a simple sorted ArrayList (I expect this to perform substantially better than e.g. TreeSet up to a certain size... but I don't know what size that might be) [UPDATE: Just using a TreeSet for now, until a more optimal array-based structure can be thoroughly vetted]
  • Uses the internal Task structure to capture the calling context and also to avoid creating more wrapper objects than strictly necessary
  • Uses thread monitor wait/notify which may only have millisecond resolution on some JDKs [UPDATE: Now using ReentrantLock for the main queue; for the Future itself it doesn't matter as much]
  • Forbids comparison via Delayed interface with any other implementations
  • Tasks with same execution time compare to 0 [UPDATE: Each task now has a uniquifying sequence number]
  • Manually tested some basic cases [UPDATE: There are unit tests now]
  • Malfunctions after about 292 years of running

I want to pursue (at a minimum) the following before finalizing the PR:

  • Lazily start the scheduling thread
  • Make the scheduled work queue automatically switch from ArrayList to e.g. TreeSet (or something better that is still no worse than O(log n) for insertions and O(log n) for head removals) once it reaches a certain size
  • Pre-size the ArrayList to a logical fraction of the "certain size" so that it resizes exactly zero, one, or two times before switching to an alternative data structure
  • Maybe move to ReentrantLock/Condition for the work queue so that tasks can wait with higher resolution than millisecond
  • Investigate aggressive removal from the scheduled work queue on cancellation (maybe do this as a later enhancement)
  • Consider introducing a serial number so that every task sorts uniquely
  • Introduce comprehensive unit tests
  • Get feedback from @jponge and @cescoffier

@jponge
Copy link

jponge commented Jun 28, 2022

Thanks @dmlloyd I'll be watching this 👍

@franz1981
Copy link
Contributor

franz1981 commented Jun 28, 2022

You can take look at hashed timing wheel to track scheduled tasks efficiently eg https://github.com/netty/netty/blob/4.1/common/src/main/java/io/netty/util/HashedWheelTimer.java

@dmlloyd
Copy link
Contributor Author

dmlloyd commented Jun 28, 2022

You can take look at hashed timing wheel to track scheduled tasks efficiently eg https://github.com/netty/netty/blob/4.1/common/src/main/java/io/netty/util/HashedWheelTimer.java

I did think of this implementation, and pondered some potential modifications of it. At the moment I am considering precise timing to be more important than throughput, but that's just my initial assumption; we'll have to see what kind of usage pattern there is. It could potentially be made configurable.

@carterkozak
Copy link
Contributor

At the moment I am considering precise timing to be more important than throughput, but that's just my initial assumption

Out of curiosity, what are your goals for this implementation? Which trade-offs we're making against other implementations, and use-cases that are/aren't in scope.
I'm enamored with the idea of an unbounded (cached) scheduled executor implementation where I don't have to worry about saturation preventing critical tasks from being scheduled.

@dmlloyd
Copy link
Contributor Author

dmlloyd commented Jun 28, 2022

At the moment I am considering precise timing to be more important than throughput, but that's just my initial assumption

Out of curiosity, what are your goals for this implementation? Which trade-offs we're making against other implementations, and use-cases that are/aren't in scope. I'm enamored with the idea of an unbounded (cached) scheduled executor implementation where I don't have to worry about saturation preventing critical tasks from being scheduled.

The primary motivation is to avoid having to maintain a separate scheduled thread pool executor for Quarkus. My initial design points are:

  • Use the main pool for task execution, so that tasks do not clog up the single scheduler thread. This BTW is a requirement from outside - Mutiny on Quarkus to be exact. I think this supports your idea of avoiding saturation (though I'd like to see real-world numbers support this before I act too certain).
  • Keep a consistently low cost for task insertion. I know ArrayList doesn't seem to support this goal, but it is well known that at small sizes, the O(n log n) insertion time of a sorted ArrayList is faster in practice than the O(log n) insertion time of TreeSet (the other low-hanging-fruit-ish option for the task queue). It should also always be faster in the event of insertion at or near the end of the list. Once the schedule queue reaches a certain size though, the situation will likely change and TreeSet will become consistently better for arbitrary sizes. (Note that I also considered ArrayDeque so that it would be faster at insertions at the head of the queue as well - but, inserting into ArrayDeque is considerably more complex because it does not support random-access insertion (even though it supports semi-random-access deletion via its iterator 🙄), so it must be partially drained before the insertion, and the drained elements replaced afterwards, so it is unlikely to perform better than either other option under any circumstances but the most trivial.)
  • Run tasks in as close to real time as feasible. Using a separate thread for scheduling means that the scheduler thread has little to do other than to dispatch tasks. As long as the main pool is configured appropriately, and the limits of the running system itself are not being strained, then task dispatch should always be pretty timely.
  • Meet the contract of ScheduledExecutorService. Even if it is a tad ridiculous at points (see the comparison behavior of Delayed for example).
  • Use EQE directly for this purpose; adding ScheduledExecutorService to the implements list only introduces four more methods, whereas using a separate decorator pattern requires some fairly complex additional support.

@dmlloyd dmlloyd force-pushed the eqe-scheduled branch 4 times, most recently from f4dfcf1 to 67d330a Compare June 28, 2022 14:30
@dmlloyd
Copy link
Contributor Author

dmlloyd commented Jun 28, 2022

I think I'm also going to explore a simple array-backed double-ended queue with random-access insert and remove (the latter to support cancellation, potentially).

Copy link

@jponge jponge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great from the Mutiny perspective 👍

@dmlloyd dmlloyd marked this pull request as ready for review June 29, 2022 14:59
@dmlloyd
Copy link
Contributor Author

dmlloyd commented Jun 29, 2022

Of course, GitHub Actions is having problems now. Well, I'll come back to this later once CI completes.

@dmlloyd dmlloyd merged commit 88186b9 into jbossas:main Jun 29, 2022
@dmlloyd dmlloyd deleted the eqe-scheduled branch June 29, 2022 16:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants