Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

expose the unfinished tasks variable #34

Merged
merged 3 commits into from Feb 22, 2017
Merged

Conversation

richardbaronpenman
Copy link
Contributor

I use queue.empty() and queue._parent._unfinished_tasks == 0 to check whether a queue is complete and this pull request exposes the _unfinished_tasks variable.

My use case is 2 queues that can send work to each other until both are complete - is there a better way to achieve this? Note that I'm not trying to join() since a thread should wake up if there is more work to do.

@asvetlov
Copy link
Member

Sorry, I dont understand your need clean but my guts feel that exposing a new property is not necessary.

Both queue.Queue and asyncio.Queue doesn't have such property.

Let's imagine you have two chained queue.Queue instances. How do you solve your need in this case? Maybe two queues is not enough for getting rid of dead locks and you need additional synchronization primitives like semaphores or conditional variables?

@richardbaronpenman
Copy link
Contributor Author

For example if had a web crawler with a queue for URL's to download and a queue for responses to scrape, then I could use a third variable to check whether all work is complete:

async def crawler(download_q, response_q, work_q):
    while True:
        if download_q.empty():
            if response_q.empty() and work_q.empty():
                break
            else:
                asyncio.sleep(delay)
        else:
            await work_q.put(1)
            url = await download_q.get()
            response = await download(url)
            await response_q.put(response)
            await work_q.get()


async def scraper(download_q, response_q, work_q):
    while True:
        if response_q.empty():
            if download_q.empty() and work_q.empty():
                break
            else:
                asyncio.sleep(delay)
        else:
            await work_q.put(1)
            response = await response_q.get()
            for url in crawl(response):
                await download_q.put(url)
            await work_q.get()

With janus I have been avoiding this additional variable by using task_done() and checking ._parent._unfinished_tasks, which works fine, however would prefer not to rely on this private variable.

@asvetlov
Copy link
Member

asyncio.sleep() in your code is a sign of bad design.
The code shouldn't sleep in cycle but wait for explicit event.
Learn how asyncio.Condition works, use https://github.com/python/cpython/blob/master/Lib/queue.py for inspiration.
In your case easier to create to condition variables (have URL for download and have a page content to analyze) in pair with two conventional lists instead of using queues.

Or take a look on crawler example (it's built using slightly different approach than your code).

@asvetlov asvetlov closed this Feb 19, 2017
@richardbaronpenman
Copy link
Contributor Author

Thanks for sleep tip - have changed to using wait_for.

The 500lines crawler example only uses a single download queue, so avoids the multiple queue problem.
Mine is reading data from a database in a separate thread and then using janus to communicate between the async and threaded parts.

I noticed the queue implementation linked makes unfinished_tasks a public variable:
https://github.com/python/cpython/blob/master/Lib/queue.py#L48

@asvetlov asvetlov reopened this Feb 20, 2017
@asvetlov
Copy link
Member

Good point.
I'm ok with your proposal but please make tests for covering added properties

@richardbaronpenman
Copy link
Contributor Author

Thanks - see latest commit

@asvetlov
Copy link
Member

asvetlov commented Feb 21, 2017

tests/test_mixed.py:59:1: W293 blank line contains whitespace

Please fix it.
The test looks correct

@asvetlov asvetlov merged commit 316eb58 into aio-libs:master Feb 22, 2017
@asvetlov
Copy link
Member

Thanks!

@asvetlov
Copy link
Member

janus 0.3.0 released

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants