Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

control pool's tasks startup #151

Closed
wants to merge 8 commits into from
Closed

Conversation

dlax
Copy link
Contributor

@dlax dlax commented Nov 14, 2021

(edit)

  • Add ConnectionPool.open() and AsyncConnectionPool.open()
  • Make it possible to re-open a closed pool.
  • Add an open parameter to pool initilization to control whether workers should be started at initialization or not.
  • Possibly open pool (start worker tasks) when entering context manager.

This is a refactoring (mostly moving code around), and also a behavior change; that would help moving to anyio as discussed in #29 (comment)

Copy link
Member

@dvarrazzo dvarrazzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, it seems a good refactoring to have.

Could you please also add an entry in the change log? docs/news_pool.rst. I'd say we would ship this in psycopg_pool 3.1.

# _close should be the last property to be set in the state
# to avoid warning on __del__ in case __init__ fails.
self._closed = False
self._closed = True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it mean that now the pool can be restarted? (accidentally or not so?)

Copy link
Contributor Author

@dlax dlax Nov 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say so, as expected from a context manager.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, not necessarily. Once a connection or cursor are closed they can't be reopened.

If this behaviour is expected we should add tests for it and verify that the main functionalities (the background workers) are up again as expected. If it's not a feature we want to support (and it might make our life easier) then we should explicitly forbid it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gone with the restart option as it's easily achievable and it makes state handling a little more consistent in my opinion.

psycopg_pool/psycopg_pool/pool.py Outdated Show resolved Hide resolved
psycopg_pool/psycopg_pool/pool.py Outdated Show resolved Hide resolved
psycopg_pool/psycopg_pool/pool.py Outdated Show resolved Hide resolved
tests/pool/test_pool.py Outdated Show resolved Hide resolved
tests/pool/test_pool.py Outdated Show resolved Hide resolved
tests/pool/test_pool.py Show resolved Hide resolved
tests/pool/test_pool.py Outdated Show resolved Hide resolved
@dvarrazzo
Copy link
Member

Uhm... thinking better about it though, now the pool cannot be used as a global object anymore, which is a pretty advertised use case.

Even thinking about FastAPI, I wouldn't know exactly where to make the pool live if we tie its operation to the with block, because we cannot tie it to the application start/stop events.

@dlax
Copy link
Contributor Author

dlax commented Nov 14, 2021

Uhm... thinking better about it though, now the pool cannot be used as a global object anymore, which is a pretty advertised use case.

How was this suppose to work previously? If this was a global object, then it seems to me that it could not be used multiple times through the context manager because, at __exit__(), close() is called and tasks are thus stopped and there is then no way to restart them.
Perhaps we can add a start() method that would start tasks as done here by __enter__() to accommodate this use case?

Even thinking about FastAPI, I wouldn't know exactly where to make the pool live if we tie its operation to the with block, because we cannot tie it to the application start/stop events.

I don't know much about FastAPI. This docs refers to starlette's doc at https://www.starlette.io/events/ which mentions a "lifespan" handler; perhaps the pool could to be opened there?

@dvarrazzo
Copy link
Member

dvarrazzo commented Nov 14, 2021

Uhm... thinking better about it though, now the pool cannot be used as a global object anymore, which is a pretty advertised use case.

How was this suppose to work previously? If this was a global object, then it seems to me that it could not be used multiple times through the context manager because, at __exit__(), close() is called and tasks are thus stopped and there is then no way to restart them. Perhaps we can add a start() method that would start tasks as done here by __enter__() to accommodate this use case?

It depends how your program is organised. You might have your main() function under control, where you create your resources, call the rest of the program from there (passing the pool as an argument or leaving it to access from a global name), and terminate the pool after the called function is finished. But if you write your code in a framework you don't necessary have this luxury.

It's a situation similar to the connections, and why connect() doesn't return an entered connection. Sometimes it's handy that it is tied to a lexical block, sometimes not (for instance... when it's in a pool).

I am, probably, the person who has used psycopg 3 the most in the world so far 🤓 and I have had pretty much two use cases for the pool so far:

  • FastAPI programs I'm writing for a customer
  • The psycopg pool test suite

I have introduced the context manager because of all the try...except writing the tests, which was getting tedious... but the only professional use of the pool has been without the context manager.

Even thinking about FastAPI, I wouldn't know exactly where to make the pool live if we tie its operation to the with block, because we cannot tie it to the application start/stop events.

I don't know much about FastAPI. This docs refers to starlette's doc at https://www.starlette.io/events/ which mentions a "lifespan" handler; perhaps the pool could to be opened there?

FastAPI might be probably used that way, as it's based on Starlette, but I think we are still removing an important use case. Other contexts might not be as well designed as Starlette is.

If even we go in that direction, it's a breaking change, so it should be released in a psycopg_pool version 4.0.

Why do you mention that for anyio it would be better to manage this work at aenter time? I'm not familiar with it.

Maybe it would be worth a refactoring moving those operation in an open() method (complementary of close()) and make calling it "optionalish"? (it would be the default in pool 3.x blocking/asyncio, but I assume anyio would have a separate pool subclass, which might differ in behaviour).

The optionality might be exposed in the init method, adding an open_now parameter which, for 3.0, would default to True, and it might become False in the future, if it makes more sense so).

@dlax
Copy link
Contributor Author

dlax commented Nov 14, 2021

Why do you mention that for anyio it would be better to manage this work at aenter time?

Because we'd need to open a task group before being able to create tasks. Opening happens with the async context manager, and it does not seem possible (nor desirable) to call await task_group.__aenter__() in __init__().

@dvarrazzo
Copy link
Member

dvarrazzo commented Nov 14, 2021

Why do you mention that for anyio it would be better to manage this work at aenter time?

Because we'd need to open a task group before being able to create tasks. Opening happens with the async context manager, and it does not seem possible (nor desirable) to call await task_group.__aenter__() in __init__().

Here we are faced with two different "bads":

  • bad classes which do to much at init time,or
  • bad classes which are not completely working at init time...

I understand the case you mention, and I don't really like that the pool, as it is, starts tasks on init. However tying them solely to the context makes hard to use the pool when the context is out of control.

How about:

  • we make _start_workers() public, as [async] open()
  • by default we call open() from __init__, to avoid to change behaviour. In the AsyncPool we spawn a task for it
  • in __aenter__/__aexit__ we call open()/close()
  • we document open() and add a parameter to __init__ to call open on start.
  • we reserve, for the future, the possibility to change the default of the parameter (and of course AnyioPool/TrioPool may have a different default).

This way on anyio you have the choice of either entering an async with or to call await open() when desired and manage "manually" the pool life cycle, not tied to the context.

@dlax
Copy link
Contributor Author

dlax commented Nov 15, 2021

The plan you suggest sounds good. I'll rework the PR accordingly.

By the way, I think we can actually defer the addition of anyio support for psycopg_pool while progressing on the "core" until we find a good design for the pool.

@dlax dlax changed the title handle pool's tasks start/stop at context manager level control pool's tasks startup Nov 15, 2021
@dlax
Copy link
Contributor Author

dlax commented Nov 15, 2021

I'll rework the PR accordingly.

It's actually a complete rewrite :)
Done now.

@@ -117,6 +118,9 @@ def closed(self) -> bool:
"""`!True` if the pool is closed."""
return self._closed

def open(self) -> None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would expect open to be async on the async pool, and no implementation in the base class.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed making open() async is indeed desired, but it would make it hard (if ever possible) to call it in __init__().
To recap, the original idea to run it as task create_task(self.open()) is not enough because this will make the pool non-functional after object creation (thus a breaking change). Trying to wait on the task to finish (in __ini__()) seems also difficult: I've tried using a threading.Event but this does not play well with asyncio, possibly because the Event locks the thread in which the asyncio loop is running, for instance:

import asyncio
import threading

async def unlock(ev):
    ev.set()

async def main():
    ev = threading.Event()
    asyncio.create_task(unlock(ev))
    ev.wait()  # block here
    assert ev.is_set()

if __name__ == "__main__":
    asyncio.run(main())

So it seems we're left with introducing a breaking change: make open() async and not opening the pool in __init__().
I'll open another PR implementing this; then this one can be closed.

Copy link
Member

@dvarrazzo dvarrazzo Nov 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked this problem: the problem is just that the thread goes to smash into the wait without a chance to run the unlock task. Adding await asyncio.sleep(0) before ev.wait() solves the problem. It doesn't even need a run_in_executor() - I thought it would.

So it seems we can call the moral equivalent of await self.open() from a sync task: as such we could implement an async open method, and opening in the aenter method, add an open=True parameter to the init, and pretty much release all this in 3.1, deferring to 4.0 only defaulting the open to False, if it makes sense somewhere. And of course a different class, TrioPool, may have whatever default it wants, no backwards compatibility to maintain.

At the moment I'm a bit lost between #151 and #155. How do you suggest to proceed?

Cheers!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked this problem: the problem is just that the thread goes to smash into the wait without a chance to run the unlock task. Adding await asyncio.sleep(0) before ev.wait() solves the problem. It doesn't even need a run_in_executor() - I thought it would.

Well, yes. But the case I posted is a bit too simplistic because we cannot actually emit await asyncio.sleep(0) in AsyncConnectionPool.__init__() because it's not async.

So it seems we can call the moral equivalent of await self.open() from a sync task: as such we could implement an async open method, and opening in the aenter method, add an open=True parameter to the init, and pretty much release all this in 3.1, deferring to 4.0 only defaulting the open to False, if it makes sense somewhere. And of course a different class, TrioPool, may have whatever default it wants, no backwards compatibility to maintain.

At the moment I'm a bit lost between #151 and #155. How do you suggest to proceed?

This attempt (#151) seemed like a dead-end, so I opened #155 with a breaking change. Unless I missed something the await asyncio.sleep(0) would not work; so it's still a dead-end.

Alternatively, if we want to leave the dependency on anyio optional (the approach I think we agreed on for the "core" part of psycopg), then we'll have to implement another class AnyIOConnnectionPool and we can avoid breaking AsyncConnectionPool (at the cost of having different behaviors w.r.t. tasks startup at object initialization). That's a third way.

So:

psycopg_pool/psycopg_pool/pool.py Outdated Show resolved Hide resolved
dlax and others added 2 commits November 16, 2021 11:58
There is no threading involved here, only async tasks. Adjust comments
and docstrings to avoid confusion.
This method is responsible for setting the '_closed' attribute, which
hence now defaults to True in the base class, along with the
_sched_runner attribute, which is reset to None in close().
@dlax
Copy link
Contributor Author

dlax commented Nov 16, 2021

Rebased on master and disallow pool re-opening as discussed.

This way, the pools may be re-opened after a close, even if we'll
disallow this later on.
The method is symmetrical with open().
In particular, we take advantage of this method in
ConnectionPool.__del__() to remove some duplication.
There seems to be no use case for re-opening a closed pool, so disable
this now (until we find a real use-case).
@dlax
Copy link
Contributor Author

dlax commented Nov 18, 2021

Replaced by #155.

@dlax dlax closed this Nov 18, 2021
dvarrazzo added a commit that referenced this pull request Jan 3, 2022
This makes it symmetrical with close(). However it doesn't really do any
async work as it's awkward to call it from init. Something we might do,
if that will be needed, could be to start the scheduler only and use it
to schedule immediately a call to an async _open().

In the future, an anyio-based pool might instead disallow open=True on
init.

See #151 for some discussion
about the topic.
@dvarrazzo dvarrazzo mentioned this pull request Jan 3, 2022
dvarrazzo added a commit that referenced this pull request Jan 3, 2022
This makes it symmetrical with close(). However it doesn't really do any
async work as it's awkward to call it from init. Something we might do,
if that will be needed, could be to start the scheduler only and use it
to schedule immediately a call to an async _open().

In the future, an anyio-based pool might instead disallow open=True on
init.

See #151 for some discussion
about the topic.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants