APScheduler 4.0 progress tracking #465
Comments
The master branch is now in a state where both the async and sync schedulers work, albeit with a largely incomplete feature set. Next I will focus on getting the first implementation of shareable data stores, based on asyncpg. I've made some progress on that a while back but got sidetracked by other projects, particularly AnyIO. |
Regarding the Twisted scheduler being on the chopping block for APScheduler v4: my main OSS project is a multi-process app that spins up many Twisted reactors in those processes, and several of the sub-processes use APScheduler inside the reactor (https://github.com/opencontentplatform/ocp). What would be a safe replacement scheduler if the Twisted version is removed? |
So you run multiple schedulers? Are you sharing job stores among them? The main reason I'm thinking of dropping (explicit) Twisted support is because it carries a heavy burden of legacy with it. I will play around with it and see if I can make it work at least with the asyncio reactor. If it can be made to work with a small amount of glue, I will take it off the chopping block. |
Yes, it runs multiple instances of the schedulers, each with its own independent job store. I understand the need for software redesigns, and I'm certainly not pushing back or trying to make more work for you. I'm just trying to understand what the recommendation would be. Maybe I could fall back to using APS' BackgroundScheduler since I don't spin it up until after the reactors are running? I saw the note and want to ensure I follow whatever happens on that one. Either way, thank you for the solid project. |
Are the jobs you run typically asynchronous (returning Deferreds) or synchronous (run in threads)? |
The initial setup with creating job definitions is synchronous. Any updates to previous job definitions or newly created jobs (stored/managed in a DB) occur regularly in an asynchronous manner (LoopingCall that returns a Deferred). And all the work with job runtime (execution/management/reporting/cleanup) occurs in non-reactor threads. |
Ok, so it sounds like the actual job target functions are synchronous, correct? Then you would be able to make do with the synchronous scheduler, yes? |
If you're saying so, then yes. I defer to your knowledge there. I selected TwistedScheduler since the user guide's choosing-the-right-scheduler section said to do so when building a Twisted application. I apologize for compounding the response with a question, but it's related. How are the thread pool and thread count handled if I use something other than the TwistedScheduler? Will the job run inside Twisted's thread pool, or inside BackgroundScheduler's thread pool? Do I need to extend both? Does constructing the BackgroundScheduler with an explicit max_workers count (example below) do anything when it's running inside Twisted's reactor? self.scheduler = BackgroundScheduler({ |
The sync scheduler (including 3.x's I want to provide first class async support in APScheduler 4.x. If I can do that with Twisted without having to create an entire ecosystem of Twisted specific components, then I'm open to doing that. |
I just added a few items to description:
|
What do you think about adding optional OpenTelemetry support? |
I am open to it, but only as soon as their API stabilizes. As it stands, every beta release breaks backward compatibility. I have more important issues to work on. I don't think v4.0 will have OpenTelemetry support but I will consider adding it to a minor update release once they are in GA. |
A lot of progress has been made on the core improvements of v4.0. Vast code refactorings have taken place. The data store system is really taking shape now. I've added "Failure resilience for persistent data stores" to the task list. It's one of the most frequent deployment issues with APScheduler, so I'm making sure that it's adequately addressed in v4.0. I'm not sure what to do with the event system. I may rip it out entirely until I can figure out exactly how it should work. I know users will want to know when a job completes or a misfire occurs etc., so it will be implemented in some form at least before the first release. I will post another comment when I've pushed these changes to the repository. |
I hit a snag with the synchronous version of the scheduler. I tried to use the AnyIO blocking portal system to run background tasks but I had to conclude that it won't work that way. I have an idea for that though. |
@agronholm do you have any estimate when 4.0 would be released? |
I had hoped at least for an alpha at this point, but the design problems in the sync version killed the momentum I had. I have not done any significant F/OSS development since. I am still committed to getting 4.0 done, but due to pressure at work I don't think I can work on it before Christmas holidays. |
@agronholm How will you make it possible to share a job store among multiple schedulers? |
By coordination and notifications shared between schedulers. Notifications are optional but recommended, and without notifications the schedulers will periodically check for due schedules. How all this works is specific to each store implementation. |
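A minimal sketch of the "notifications are optional, polling is the fallback" idea described above. It assumes a single process where an `asyncio.Event` stands in for a cross-scheduler notification; the function name `wait_for_due_schedules` and the interval values are hypothetical and not part of APScheduler.

```python
import asyncio


async def wait_for_due_schedules(wakeup: asyncio.Event, poll_interval: float) -> str:
    """Return once a notification arrives or the polling interval elapses."""
    try:
        await asyncio.wait_for(wakeup.wait(), timeout=poll_interval)
    except asyncio.TimeoutError:
        # No notification: fall back to a periodic check of the store
        return "polled"
    wakeup.clear()
    return "notified"


async def demo() -> list:
    wakeup = asyncio.Event()  # stand-in for a store-specific notification channel
    reasons = []
    # Nobody notifies us, so the scheduler wakes up after the polling interval
    reasons.append(await wait_for_due_schedules(wakeup, poll_interval=0.05))
    # Another scheduler announces a change, so we wake up well before the interval
    asyncio.get_running_loop().call_later(0.01, wakeup.set)
    reasons.append(await wait_for_due_schedules(wakeup, poll_interval=5))
    return reasons


reasons = asyncio.run(demo())
print(reasons)
```

With notifications the scheduler reacts almost immediately; without them the worst-case delay is one polling interval.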
Hello @agronholm, impressive task list, and thanks for apscheduler. My big Christmas wish is "locking" (probably the idea of persistent storage). I use apscheduler on several web nodes, and each node has some workers. Today, I inherit from the scheduler, store etc. to add locking. Instead of using For me it's mandatory that a task never belongs to a worker; the job must be in a queue so that another worker, or the same one, can process that task. To achieve this I added "ready", "locked", "dead", "failed" and "done" to Redis (like the jobs and running keys)
I'm a big fan of Sidekiq (and also Faktory) And I will be very happy with something like In the "main"
Then in code
Why not Celery? I don't want to set up the full Celery/Flower stack. My tasks are simple, and I'm a bit too lazy to repackage an entire app or split a few lines of code into small libs just to allow Celery to run my code (and also split config, creds etc.). I don't know if I'm being clear (I'm not a native English speaker). |
@ahmet2mir APScheduler 4.0 already has the proper synchronization mechanisms in place. What's still missing is the synchronous API. I've come to a realization that I cannot simply copy the async API and remove the |
While 4.0 is being worked on, I've gone back to the 3.x branch for a bit and fixed a number of bugs and other annoyances. |
Tests on async/sync workers (formerly: executors) are passing now, but the sync worker tests are strangely slow and I want to get to the bottom of that before moving forward. |
Slowness in worker tests resolved: it was a race condition in which the notification about the newly added job was sent before the listener was in place, causing the data store to wait for the 1 second timeout to expire before checking for new jobs again. I'll move on to completing the synchronous scheduler code now. I'm also very close to releasing AnyIO v2.1.0 which is a critical dependency for APScheduler 4. |
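The race described above can be reproduced with a toy, in-process broadcaster. This is only an illustration under assumptions: `Notifier` and `wait_for_job` are hypothetical names, and a 0.05 second timeout stands in for the 1 second timeout mentioned in the comment.

```python
import threading


class Notifier:
    """Toy broadcaster: an event only reaches listeners subscribed at publish time."""

    def __init__(self):
        self._listeners = []

    def subscribe(self, callback):
        self._listeners.append(callback)

    def publish(self, event):
        for callback in list(self._listeners):
            callback(event)


def wait_for_job(subscribe_first: bool, timeout: float = 0.05) -> bool:
    """Return True if the 'job added' notification woke the waiter before the timeout."""
    notifier = Notifier()
    wakeup = threading.Event()
    if subscribe_first:
        notifier.subscribe(lambda event: wakeup.set())
        notifier.publish("job_added")
    else:
        # The race: the notification is sent before the listener is in place,
        # so it is lost and the waiter has to sit out the full timeout.
        notifier.publish("job_added")
        notifier.subscribe(lambda event: wakeup.set())
    return wakeup.wait(timeout)


missed = wait_for_job(subscribe_first=False)
seen = wait_for_job(subscribe_first=True)
print(missed, seen)
```

Putting the listener in place before anything can publish removes the wait.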
I can't wait... |
v4 is looking really good. I like the data model for the data store - huge improvement. What is the tags feature exactly? I'm wondering if it's related to a feature that I'm looking for: allowing workers to only run certain jobs, similar to Celery queues. |
Yeah, exactly that. For example, it will let you queue Windows-only jobs that only schedulers on Windows nodes will pick up. I haven't worked out the details yet, like how exactly job tags should match with schedulers, but this is the general idea. |
Great @agronholm - you don't have an issue working on it, right? I might be able to contribute |
Help would be appreciated in the planning phase. Writing down just how tag matching should work would be great. For example, do we allow operators like < or > for the fields? Let's say we want to tag a job to only run on Python >= 3.11. Would that be a sensible use case, and how exactly would that work? |
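One possible shape for the operator question above, purely as a discussion aid. Nothing here is APScheduler API: the requirement syntax (`python >= 3.11`), the `matches` function and the idea of node tags as a plain dict are all assumptions made for this sketch.

```python
import operator
import re

# Hypothetical requirement syntax: "<tag> <operator> <dotted version>"
_OPERATORS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}
_REQUIREMENT = re.compile(r"^\s*(\w+)\s*(>=|<=|==|>|<)\s*(\d+(?:\.\d+)*)\s*$")


def parse_version(text: str) -> tuple:
    """Turn '3.11' into (3, 11) so that 3.9 sorts below 3.11."""
    return tuple(int(part) for part in text.split("."))


def matches(requirement: str, node_tags: dict) -> bool:
    """Check a job's tag requirement against the tags a scheduler node advertises."""
    match = _REQUIREMENT.match(requirement)
    if match is None:
        raise ValueError(f"cannot parse requirement: {requirement!r}")
    key, op, wanted = match.groups()
    if key not in node_tags:
        return False  # a node without the tag never matches
    return _OPERATORS[op](parse_version(node_tags[key]), parse_version(wanted))


print(matches("python >= 3.11", {"python": "3.12.1"}))
print(matches("python >= 3.11", {"python": "3.9"}))
```

Comparing parsed tuples rather than strings matters here, because the string "3.9" sorts after "3.11".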
Speaking of issues, I just created one where we can discuss this further without pinging everybody: #798 |
Hey @agronholm! I'm very pleased with the new release and I'm waiting for the new documentation.
scheduler.add_job(
    patch_mysql_connection,
    args=[instance.submit],
    trigger='date',
    run_date=publish_date,
    id=f'newsletter_job_{instance.id}',
    replace_existing=True,
)
My question also applies to job rescheduling, e.g.:
scheduler.reschedule_job(
    f'newsletter_job_{instance.id}',
    trigger='date',
    run_date=publish_date,
)
How do we now pass the date argument in new |
I'll be happy to help but please create a new Q/A discussion for this. The short answer is that APScheduler 4 no longer uses entry points (this became very problematic for PyInstaller and other standalone packagers). Therefore you need to pass a trigger instance as |
So, I've been hard at work on APScheduler again this weekend. In the process, I've implemented task configuration, enabling users to create schedules targeting lambdas and other un-referenceable callables. I also substantially increased test coverage of the scheduler code, which then led me to discover (and fix) some issues. One annoying issue was that MongoDB's datetimes are only accurate to the millisecond, not microsecond, so I had to come up with a workaround to deal with that. I have no illusions about APScheduler's ability to achieve this precision in reality, but what were my options? I could accept the inaccuracy, but then users would wonder about the discrepancy when their datetimes don't match anymore after coming back from the DB. I probably haven't even thought about all the subtle issues that would create. Anyway, I'm nearing the point where the latest batch of changes is ready to be pushed, once I've verified that class methods, static methods and instance methods can also be properly used as task callables. This will likely happen next weekend. Given the number of breaking changes I've had to make in the process, I'm considering releasing another alpha if there's demand for that. |
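To make the millisecond discrepancy concrete, here is a small illustration. It only simulates the loss of precision by truncating to whole milliseconds (an assumption about how the extra digits are dropped); it is not the workaround used in APScheduler, which the comment does not describe.

```python
from datetime import datetime, timezone


def truncate_to_milliseconds(dt: datetime) -> datetime:
    """Simulate a store that keeps datetimes with millisecond precision only."""
    return dt.replace(microsecond=dt.microsecond // 1000 * 1000)


original = datetime(2023, 11, 5, 12, 0, 0, 123456, tzinfo=timezone.utc)
stored = truncate_to_milliseconds(original)

print(original.isoformat())
print(stored.isoformat())
print(original == stored)
```

The value that comes back differs from the one that went in, which is the mismatch users would notice.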
Just to break the radio silence: I managed to get explicit task configuration done, with support for just about any callable, so that's 2 out of 4 tasks done from this comment. The reason I haven't pushed those changes yet is that I stumbled onto an annoying Heisenbug which causes a database connection to sometimes remain in the pool even when the scheduler has been stopped. My debugging efforts so far haven't yielded any answers, but I will continue to try and fix this. |
Wow, I finally figured it out! Turns out that my transactional context manager for the SQLAlchemy data store wasn't quite as airtight as I thought: the transaction is started in a worker thread, but the corresponding exit operation is cancelled before it can take place, so the connection is never returned to the pool. |
👍 |
You can't do async I/O when you're using a synchronous database driver like pymysql or cx-oracle. I also have reason to believe that async drivers could be affected by this, although the tests don't show it. |
I've pushed the changes now. Depending on the workload caused by the other two issues, I may release an interim alpha with all the fixes up to this point. |
Question about the state of stateful. I see you've checked stateful triggers but not stateful jobs. What is a stateful trigger? Currently it isn't possible to share state between jobs, right? E.g. a database connection. |
Stateful triggers contain state which is saved after the trigger is used to calculate new fire times for a schedule. All triggers are stateful in APScheduler 4. This was necessary in order to correctly implement combination triggers ( |
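A rough sketch of what "state which is saved after the trigger is used" can look like. `CountedIntervalTrigger` is a hypothetical class written for this illustration, not one of APScheduler's triggers, and `pickle` merely stands in for whatever serializer a data store would use.

```python
import pickle
from datetime import datetime, timedelta, timezone


class CountedIntervalTrigger:
    """Hypothetical trigger: fires every `interval`, at most `max_runs` times."""

    def __init__(self, start: datetime, interval: timedelta, max_runs: int):
        self.start = start
        self.interval = interval
        self.max_runs = max_runs
        self.runs = 0  # internal state that must survive a save/load cycle

    def next(self):
        """Return the next fire time, or None when the trigger is exhausted."""
        if self.runs >= self.max_runs:
            return None
        fire_time = self.start + self.interval * self.runs
        self.runs += 1
        return fire_time

    def __getstate__(self):
        return {
            "start": self.start,
            "interval": self.interval,
            "max_runs": self.max_runs,
            "runs": self.runs,
        }

    def __setstate__(self, state):
        self.start = state["start"]
        self.interval = state["interval"]
        self.max_runs = state["max_runs"]
        self.runs = state["runs"]


trigger = CountedIntervalTrigger(
    datetime(2024, 1, 1, tzinfo=timezone.utc), timedelta(hours=1), max_runs=3
)
first = trigger.next()
# Save the trigger after use and load it again, as a data store would
restored = pickle.loads(pickle.dumps(trigger))
second = restored.next()
print(first, second)
```

Because the run counter travels with the serialized trigger, the restored copy continues from the second fire time instead of starting over.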
Got it. And it isn't possible to share state between jobs on the same worker currently, right? E.g. I want to reuse a database connection for a schedule (and then close it once the schedule is "done"). Maybe it can be done via events now that I think of it. |
For schedules, "stateful" means that its jobs retain some internal state which is then saved after the execution of the job. Sharing database connections is out of scope anyway since you can't serialize them. |
I've released another alpha, with tons of fixes/workarounds for less capable RDBMS (sqlite, mysql). Explicit task configuration is also in there. As usual, this update requires wiping your data store and starting over. |
Thank you so much Alex, I just started using 3.x, do you see any specific date around a production ready 4.x release? @agronholm |
I'm sure I can get a beta out before the end of the year (I am furloughed most of December so I have plenty of time to work on APScheduler), but production? That depends on what issues come up in testing. Q2/2024? Not impossible at least. |
Thank you so much for being transparent, really appreciate your community effort <3 |
Some good news again. I'm making significant progress on the cleanup feature which periodically purges expired job results, and now also finished schedules which are no longer purged right after the last job is submitted to the store. With luck, I can push these changes to GitHub this weekend. I've also opened two discussions I would like your input on: |
The automated cleanup is now in. That's 3 out of 4 blockers completed for the beta release. My idea of a "beta" release is that it's feature complete but may still contain bugs. I would like to get the data store schema settled so that there won't be any need for nuking the data stores after an upgrade to a newer beta. To that end, the first bullet point of my previous comment needs to be addressed ASAP. In the absence of any feedback on that issue, my plan is to introduce dynamic fields and to correspondingly reduce the number of columns to only those that need to be indexed and queried against. The last blocker is now the implementation of maximum running jobs limits. Ideally there would be two levels of such limits: task and schedule level. The total number of jobs with the same task ID would never be allowed to rise above the task-level limit, and the number of jobs with the same schedule ID would never exceed the schedule-level limit. The promised import/export feature will likely not land in the first beta, but probably in the second one. |
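The two-level limit described above could be checked along these lines. This is a sketch under assumptions, not the planned implementation: `Job`, `can_start` and the limit dictionaries are hypothetical names, and a real data store would do the counting in a query rather than in memory.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    task_id: str
    schedule_id: str


def can_start(job: Job, running: list, task_limits: dict, schedule_limits: dict) -> bool:
    """Allow the job only if neither its task nor its schedule is at its limit."""
    running_per_task = Counter(j.task_id for j in running)
    running_per_schedule = Counter(j.schedule_id for j in running)

    task_limit = task_limits.get(job.task_id)  # None means "no limit"
    if task_limit is not None and running_per_task[job.task_id] >= task_limit:
        return False

    schedule_limit = schedule_limits.get(job.schedule_id)
    if schedule_limit is not None and running_per_schedule[job.schedule_id] >= schedule_limit:
        return False
    return True


running = [Job("send_mail", "daily"), Job("send_mail", "hourly")]
task_limits = {"send_mail": 3}
schedule_limits = {"daily": 1}

print(can_start(Job("send_mail", "hourly"), running, task_limits, schedule_limits))
print(can_start(Job("send_mail", "daily"), running, task_limits, schedule_limits))
```

In this example the task-level limit of 3 still has room, but the "daily" schedule is already at its limit of 1, so only the "hourly" job may start.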
Alright, so we're in 2024 now. I know what I said about the beta, but I got sidetracked by two other projects of mine that needed urgent work on them. There will be another alpha as soon as:
|
Hey @agronholm , wanted to ask whether it would be helpful to submit issues for bugs in the 4.0.0 releases at this point, or if it would just be best to wait? I'd like to move my system to 4.0.0 because it has some settings that 3.10.9 does not, but I'm running into some problems here and there. Thanks again. |
It might be helpful, but remember that it's still in alpha state for a reason. |
Hey, I am really loving the new version. It is a lot easier to use than the other options available. I am using AsyncScheduler. Diving into the code, I can see why it is a lot of work to get this version released to the world! I will take a look at the issue below and see if I can make some changes. I also noted that the latest commits may address a few of these issues. I read through much of the discussion, but thought it prudent to add my own thoughts on v4.0.0a4:
Workaround, only for scheduling; if you are manually running or adding jobs, this will fail to help you. Of course, I think there is another fix, referenced here.
|
Hi, it's been a while! I just released a new alpha with a metric ton of fixes, and a handful of new features too! Importantly, data stores now finally have a clean-up procedure which will remove expired job results and finished schedules. Schedules can now also be paused and unpaused (contributed by @WillDaSilva). This restores a 3.x series feature in an even more powerful form. Kudos to the 3 people who contributed fixes too! As usual, the data store schemas have changed in a backwards incompatible manner, so you need to start from scratch when updating. This should stop happening once the beta is out. |
There are tests making sure |
I'm opening this issue as an easy way for interested parties to track the development progress of the next major APScheduler release (v4.0).
Terminology changes in v4.0
The old term of "Job", as it was, is gone, replaced by the following concepts which are closer to the terminology used by Celery:
Also, the term "executor" is now being changed to "worker".
Notice that the terminology may still change before the final release!
Planned major changes
v4.0 is a ground-up redesign that aims to fix all the long-standing flaws found in APScheduler over the years.
Checked boxes are changes that have already been implemented.
threshold value for AndTrigger (resolves issues with contained IntervalTrigger instances)
Potential extra features I would like to have:
OrTrigger (Having the threshold also on OrTrigger? #453)
You will notice that I have dropped a number of features from master. Some I may never add back to v4.0, even if requested, but do voice your wishes in this issue (and this issue only – I will summarily close such requests in new tickets). Others have been removed only temporarily to give me space for the redesign.
Features on the chopping block
Qt scheduler (difficult to test/maintain)
Being on the chopping block does not mean the feature will be gone forever! It may return in a subsequent minor release or even before the 4.0 final release if I deem it feasible to implement on top of the new architecture.