Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

multiprocessing + logging + tqdm progress bar flashing and incorrect progress #1331

Open
5 of 6 tasks
gpakosz opened this issue Jun 15, 2022 · 1 comment
Open
5 of 6 tasks

Comments

@gpakosz
Copy link

gpakosz commented Jun 15, 2022

  • I have marked all applicable categories:
    • exception-raising bug
    • visual output bug
  • I have visited the source website, and in particular
    read the known issues
  • I have searched through the issue tracker for duplicates
  • I have mentioned version numbers, operating system and
    environment, where applicable:
    import tqdm, sys
    print(tqdm.__version__, sys.version, sys.platform)
    4.64.0 3.7.3 (default, Jan 22 2021, 20:04:44)
    

Hello, I'm not sure I'm facing a visual output bug or if I'm doing something wrong when combining tqdm, multiprocessing and logging.

In the reproducer below, when I set VERBOSITY = 0, the log() calls from the worker processes output nothing and the progress bar is displayed correctly.

However, as soon as I log from the worker processes by setting VERBOSITY = 1, then here's what happening

  • While the progress bar is properly positioned and sticks to the bottom of the screen
  • Most of the time, tqdm displays a progress of 0% and only occasionally it flashes with the proper progress and percentage
  • At the very end, the last update outputs the proper progress: Processing...: 100%|██████████| 1000/1000 [00:31<00:00, 31.67foo/s] however it doesn't take the full width of the screen
import itertools
import logging
import multiprocessing
import random
import sys
import time

import tqdm
from tqdm.contrib import DummyTqdmFile

log: logging.Logger = logging.getLogger(__name__)


DEFAULT_FORMAT = "[%(asctime)s.%(msecs)06d][%(processName)s][%(threadName)s][%(levelname)s][%(module)s] %(message)s"
VERBOSITY = 1


def configure_logger(log: logging.Logger, verbosity: int, *, format: str = DEFAULT_FORMAT, dry_run: bool = False):
    """Configures the logger instance based on verbosity level"""

    # can't use force=true as it requires Python >= 3.8
    root = logging.getLogger()
    for handler in root.handlers:
        root.removeHandler(handler)

    if dry_run:
        format = format.replace(" %(message)s", "[DRY RUN] %(message)s")
    logging.basicConfig(format=format, datefmt="%Y-%m-%d %H:%M:%S", stream=DummyTqdmFile(sys.stdout))

    verbosity_to_loglevel = {0: logging.WARNING, 1: logging.INFO, 2: logging.DEBUG}
    log.setLevel(verbosity_to_loglevel[min(verbosity, 2)])

    if verbosity >= 4:
        logging.getLogger().setLevel(logging.INFO)
    if verbosity >= 5:
        logging.getLogger().setLevel(logging.DEBUG)

def by_n(iterable, n):
    """Iterate by chunks of n items"""
    return (tuple(filter(lambda x: x is not None, x)) for x in itertools.zip_longest(*[iter(iterable)] * n))


def _worker(batch):
    log.info("Let's go!")
    for i in batch:
        log.info("Processing item %d", i)
        time.sleep(random.uniform(0.1, 0.2))
    log.info("Done!")

    return len(batch)

if __name__ == "__main__":
    configure_logger(log, VERBOSITY)

    log.info("Let's go!")

    foos = list(range(1000))
    kwargs = {
        "desc": "Processing...",
        "total": len(foos),
        "leave": True,
        "mininterval": 1,
        "maxinterval": 5,
        "unit": "foo",
        "dynamic_ncols": True,
    }
    pbar = tqdm.tqdm(**kwargs)

    with multiprocessing.Pool(processes=5) as workers_pool:
        for batch_length in workers_pool.imap_unordered(
            _worker,
            by_n(foos, min(10 + int((len(foos) - 10) / 100), 1000)),
        ):
            pbar.update(batch_length)

    log.info("Done!")

Here is a screenshot while the script is running
image

And here is a screenshot at the very end after it has terminated
image

@mjpieters
Copy link

I'm pretty sure this is the same issue as #1496. Could you give my proposed fix (PR #1502) a spin? You can install the changes locally with:

pip install git+https://github.com/tqdm/tqdm.git@refs/pull/1502/merge

Running your reproducer locally, I see regular progress updates:

[2023-08-25 19:38:21.000915][MainProcess][MainThread][INFO][test] Let's go!
Processing...:  11%|████████▏                                                               | 114/1000 [00:05<00:39, 22.36foo/s]

and when it completes the final output looks like this:

[2023-08-25 19:36:59.000274][MainProcess][MainThread][INFO][test] Let's go!
[2023-08-25 19:37:31.000325][MainProcess][MainThread][INFO][test] Done!                                                         
Processing...: 100%|██████████| 1000/1000 [00:32<00:00, 31.22foo/s]                                                             

Is that what you were looking for?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants