Skip to content

Latest commit

 

History

History
979 lines (728 loc) · 35 KB

.readme.rst

File metadata and controls

979 lines (728 loc) · 35 KB

Logo

tqdm

PyPI-Versions PyPI-Status Conda-Forge-Status Docker Snapcraft

Build-Status Coverage-Status Branch-Coverage-Status Codacy-Grade Libraries-Rank PyPI-Downloads

DOI LICENCE OpenHub-Status binder-demo notebook-demo awesome-python

tqdm means "progress" in Arabic (taqadum, تقدّم) and is an abbreviation for "I love you so much" in Spanish (te quiero demasiado).

Instantly make your loops show a smart progress meter - just wrap any iterable with tqdm(iterable), and you're done!

from tqdm import tqdm
for i in tqdm(range(10000)):
    ...

76%|████████████████████████████         | 7568/10000 [00:33<00:10, 229.00it/s]

trange(N) can be also used as a convenient shortcut for tqdm(xrange(N)).

Screenshot
Video Slides

It can also be executed as a module with pipes:

$ seq 9999999 | tqdm --bytes | wc -l
75.2MB [00:00, 217MB/s]
9999999
$ 7z a -bd -r backup.7z docs/ | grep Compressing | \
    tqdm --total $(find docs/ -type f | wc -l) --unit files >> backup.log
100%|███████████████████████████████▉| 8014/8014 [01:37<00:00, 82.29files/s]

Overhead is low -- about 60ns per iteration (80ns with tqdm.gui), and is unit tested against performance regression. By comparison, the well-established ProgressBar has an 800ns/iter overhead.

In addition to its low overhead, tqdm uses smart algorithms to predict the remaining time and to skip unnecessary iteration displays, which allows for a negligible overhead in most cases.

tqdm works on any platform (Linux, Windows, Mac, FreeBSD, NetBSD, Solaris/SunOS), in any console or in a GUI, and is also friendly with IPython/Jupyter notebooks.

tqdm does not require any dependencies (not even curses!), just Python and an environment supporting carriage return \r and line feed \n control characters.


PyPI-Status PyPI-Downloads Libraries-Dependents

pip install tqdm

GitHub-Status GitHub-Stars GitHub-Commits GitHub-Forks GitHub-Updated

Pull and install in the current directory:

pip install -e git+https://github.com/tqdm/tqdm.git@master#egg=tqdm

Conda-Forge-Status

conda install -c conda-forge tqdm

Snapcraft

snap install tqdm

Docker

docker pull tqdm/tqdm
docker run -i --rm tqdm/tqdm --help

There are other (unofficial) places where tqdm may be downloaded, particularly for CLI use:

Repology

The list of all changes is available either on GitHub's Releases: GitHub-Status, on the wiki, on the website, or on crawlers such as allmychanges.com.

tqdm is very versatile and can be used in a number of ways. The three main ones are given below.

Wrap tqdm() around any iterable:

from tqdm import tqdm
import time

text = ""
for char in tqdm(["a", "b", "c", "d"]):
    time.sleep(0.25)
    text = text + char

trange(i) is a special optimised instance of tqdm(range(i)):

for i in trange(100):
    time.sleep(0.01)

Instantiation outside of the loop allows for manual control over tqdm():

pbar = tqdm(["a", "b", "c", "d"])
for char in pbar:
    time.sleep(0.25)
    pbar.set_description("Processing %s" % char)

Manual control on tqdm() updates by using a with statement:

with tqdm(total=100) as pbar:
    for i in range(10):
        time.sleep(0.1)
        pbar.update(10)

If the optional variable total (or an iterable with len()) is provided, predictive stats are displayed.

with is also optional (you can just assign tqdm() to a variable, but in this case don't forget to del or close() at the end:

pbar = tqdm(total=100)
for i in range(10):
    time.sleep(0.1)
    pbar.update(10)
pbar.close()

Perhaps the most wonderful use of tqdm is in a script or on the command line. Simply inserting tqdm (or python -m tqdm) between pipes will pass through all stdin to stdout while printing progress to stderr.

The example below demonstrated counting the number of lines in all Python files in the current directory, with timing information included.

$ time find . -name '*.py' -type f -exec cat \{} \; | wc -l
857365

real    0m3.458s
user    0m0.274s
sys     0m3.325s

$ time find . -name '*.py' -type f -exec cat \{} \; | tqdm | wc -l
857366it [00:03, 246471.31it/s]
857365

real    0m3.585s
user    0m0.862s
sys     0m3.358s

Note that the usual arguments for tqdm can also be specified.

$ find . -name '*.py' -type f -exec cat \{} \; |
    tqdm --unit loc --unit_scale --total 857366 >> /dev/null
100%|███████████████████████████████████| 857K/857K [00:04<00:00, 246Kloc/s]

Backing up a large directory?

$ 7z a -bd -r backup.7z docs/ | grep Compressing |
    tqdm --total $(find docs/ -type f | wc -l) --unit files >> backup.log
100%|███████████████████████████████▉| 8014/8014 [01:37<00:00, 82.29files/s]

GitHub-Issues

The most common issues relate to excessive output on multiple lines, instead of a neat one-line progress bar.

  • Consoles in general: require support for carriage return (CR, \r).
  • Nested progress bars:
    • Consoles in general: require support for moving cursors up to the previous line. For example, IDLE, ConEmu and PyCharm (also here, here, and here) lack full support.
    • Windows: additionally may require the Python module colorama to ensure nested bars stay within their respective lines.
  • Unicode:
    • Environments which report that they support unicode will have solid smooth progressbars. The fallback is an `ascii-only bar.
    • Windows consoles often only partially support unicode and thus often require explicit ascii=True (also here). This is due to either normal-width unicode characters being incorrectly displayed as "wide", or some unicode characters not rendering.
  • Wrapping enumerated iterables: use enumerate(tqdm(...)) instead of tqdm(enumerate(...)). The same applies to numpy.ndenumerate. This is because enumerate functions tend to hide the length of iterables. tqdm does not.
  • Wrapping zipped iterables has similar issues due to internal optimisations. tqdm(zip(a, b)) should be replaced with zip(tqdm(a), b) or even zip(tqdm(a), tqdm(b)).
  • Hanging pipes in python2: when using tqdm on the CLI, you may need to use Python 3.5+ for correct buffering.

If you come across any other difficulties, browse and file GitHub-Issues.

PyPI-Versions README-Hits (Since 19 May 2016)

class tqdm():
  """{DOC_tqdm}"""

  def __init__(self, iterable=None, desc=None, total=None, leave=True,
               file=None, ncols=None, mininterval=0.1,
               maxinterval=10.0, miniters=None, ascii=None, disable=False,
               unit='it', unit_scale=False, dynamic_ncols=False,
               smoothing=0.3, bar_format=None, initial=0, position=None,
               postfix=None, unit_divisor=1000):

{DOC_tqdm.tqdm.__init__.Parameters} Extra CLI Options ~~~~~~~~~~~~~~~~~

{DOC_tqdm.cli.CLI_EXTRA_DOC} Returns ~~~~~~~

{DOC_tqdm.tqdm.__init__.Returns} .. code:: python

class tqdm():
def update(self, n=1):
"""{DOC_tqdm.tqdm.update}"""
def close(self):
"""{DOC_tqdm.tqdm.close}"""
def clear(self, nomove=False):
"""{DOC_tqdm.tqdm.clear}"""
def refresh(self):
"""{DOC_tqdm.tqdm.refresh}"""
def unpause(self):
"""{DOC_tqdm.tqdm.unpause}"""
def reset(self, total=None):
"""{DOC_tqdm.tqdm.reset}"""
def set_description(self, desc=None, refresh=True):
"""{DOC_tqdm.tqdm.set_description}"""
def set_postfix(self, ordered_dict=None, refresh=True, **kwargs):
"""{DOC_tqdm.tqdm.set_postfix}"""

@classmethod def write(cls, s, file=sys.stdout, end="n"):

"""{DOC_tqdm.tqdm.write}"""

@property def format_dict(self):

"""{DOC_tqdm.tqdm.format_dict}"""
def display(self, msg=None, pos=None):
"""{DOC_tqdm.tqdm.display}"""
def trange(*args, **kwargs):
""" A shortcut for tqdm(xrange(*args), **kwargs). On Python3+ range is used instead of xrange. """
class tqdm.gui.tqdm(tqdm.tqdm):
"""Experimental GUI version"""
def tqdm.gui.trange(*args, **kwargs):
"""Experimental GUI version of trange"""
class tqdm.notebook.tqdm(tqdm.tqdm):
"""Experimental IPython/Jupyter Notebook widget"""
def tqdm.notebook.trange(*args, **kwargs):
"""Experimental IPython/Jupyter Notebook widget version of trange"""

Custom information can be displayed and updated dynamically on tqdm bars with the desc and postfix arguments:

from tqdm import trange
from random import random, randint
from time import sleep

with trange(10) as t:
    for i in t:
        # Description will be displayed on the left
        t.set_description('GEN %i' % i)
        # Postfix will be displayed on the right,
        # formatted automatically based on argument's datatype
        t.set_postfix(loss=random(), gen=randint(1,999), str='h',
                      lst=[1, 2])
        sleep(0.1)

with tqdm(total=10, bar_format="{postfix[0]} {postfix[1][value]:>8.2g}",
          postfix=["Batch", dict(value=0)]) as t:
    for i in range(10):
        sleep(0.1)
        t.postfix[1]["value"] = i / 2
        t.update()

Points to remember when using {postfix[...]} in the bar_format string:

  • postfix also needs to be passed as an initial argument in a compatible format, and
  • postfix will be auto-converted to a string if it is a dict-like object. To prevent this behaviour, insert an extra item into the dictionary where the key is not a string.

Additional bar_format parameters may also be defined by overriding format_dict, and the bar itself may be modified using ascii:

from tqdm import tqdm
class TqdmExtraFormat(tqdm):
    """Provides a `total_time` format parameter"""
    @property
    def format_dict(self):
        d = super(TqdmExtraFormat, self).format_dict
        total_time = d["elapsed"] * (d["total"] or 0) / max(d["n"], 1)
        d.update(total_time=self.format_interval(total_time) + " in total")
        return d

for i in TqdmExtraFormat(
      range(10), ascii=" .oO0",
      bar_format="{total_time}: {percentage:.0f}%|{bar}{r_bar}"):
    pass
00:01 in total: 40%|000o     | 4/10 [00:00<00:00,  9.96it/s]

Note that {bar} also supports a format specifier [width][type].

  • width
    • unspecified (default): automatic to fill ncols
    • int >= 0: fixed width overriding ncols logic
    • int < 0: subtract from the automatic default
  • type
    • a: ascii (ascii=True override)
    • u: unicode (ascii=False override)
    • b: blank (ascii=" " override)

This means a fixed bar with right-justified text may be created by using: bar_format="{l_bar}{bar:10}|{bar:-10b}right-justified"

tqdm supports nested progress bars. Here's an example:

from tqdm import trange
from time import sleep

for i in trange(4, desc='1st loop'):
    for j in trange(5, desc='2nd loop'):
        for k in trange(50, desc='3nd loop', leave=False):
            sleep(0.01)

On Windows colorama will be used if available to keep nested bars on their respective lines.

For manual control over positioning (e.g. for multi-processing use), you may specify position=n where n=0 for the outermost bar, n=1 for the next, and so on. However, it's best to check if tqdm can work without manual position first.

from time import sleep
from tqdm import trange, tqdm
from multiprocessing import Pool, freeze_support

L = list(range(9))

def progresser(n):
    interval = 0.001 / (n + 2)
    total = 5000
    text = "#{}, est. {:<04.2}s".format(n, interval * total)
    for _ in trange(total, desc=text, position=n):
        sleep(interval)

if __name__ == '__main__':
    freeze_support()  # for Windows support
    p = Pool(initializer=tqdm.set_lock, initargs=(tqdm.get_lock(),))
    p.map(progresser, L)

Note that in Python 3, tqdm.write is thread-safe:

from time import sleep
from tqdm import tqdm, trange
from concurrent.futures import ThreadPoolExecutor

L = list(range(9))

def progresser(n):
    interval = 0.001 / (n + 2)
    total = 5000
    text = "#{}, est. {:<04.2}s".format(n, interval * total)
    for _ in trange(total, desc=text):
        sleep(interval)
    if n == 6:
        tqdm.write("n == 6 completed.")
        tqdm.write("`tqdm.write()` is thread-safe in py3!")

if __name__ == '__main__':
    with ThreadPoolExecutor() as p:
        p.map(progresser, L)

tqdm can easily support callbacks/hooks and manual updates. Here's an example with urllib:

urllib.urlretrieve documentation

[...]
If present, the hook function will be called once
on establishment of the network connection and once after each block read
thereafter. The hook will be passed three arguments; a count of blocks
transferred so far, a block size in bytes, and the total size of the file.
[...]
import urllib, os
from tqdm import tqdm

class TqdmUpTo(tqdm):
    """Provides `update_to(n)` which uses `tqdm.update(delta_n)`."""
    def update_to(self, b=1, bsize=1, tsize=None):
        """
        b  : int, optional
            Number of blocks transferred so far [default: 1].
        bsize  : int, optional
            Size of each block (in tqdm units) [default: 1].
        tsize  : int, optional
            Total size (in tqdm units). If [default: None] remains unchanged.
        """
        if tsize is not None:
            self.total = tsize
        self.update(b * bsize - self.n)  # will also set self.n = b * bsize

eg_link = "https://caspersci.uk.to/matryoshka.zip"
with TqdmUpTo(unit='B', unit_scale=True, miniters=1,
              desc=eg_link.split('/')[-1]) as t:  # all optional kwargs
    urllib.urlretrieve(eg_link, filename=os.devnull,
                       reporthook=t.update_to, data=None)

Inspired by twine#242. Functional alternative in examples/tqdm_wget.py.

It is recommend to use miniters=1 whenever there is potentially large differences in iteration speed (e.g. downloading a file over a patchy connection).

Due to popular demand we've added support for pandas -- here's an example for DataFrame.progress_apply and DataFrameGroupBy.progress_apply:

import pandas as pd
import numpy as np
from tqdm import tqdm

df = pd.DataFrame(np.random.randint(0, 100, (100000, 6)))

# Register `pandas.progress_apply` and `pandas.Series.map_apply` with `tqdm`
# (can use `tqdm.gui.tqdm`, `tqdm.notebook.tqdm`, optional kwargs, etc.)
tqdm.pandas(desc="my bar!")

# Now you can use `progress_apply` instead of `apply`
# and `progress_map` instead of `map`
df.progress_apply(lambda x: x**2)
# can also groupby:
# df.groupby(0).progress_apply(lambda x: x**2)

In case you're interested in how this works (and how to modify it for your own callbacks), see the examples folder or import the module and run help().

IPython/Jupyter is supported via the tqdm.notebook submodule:

from tqdm.notebook import trange, tqdm
from time import sleep

for i in trange(3, desc='1st loop'):
    for j in tqdm(range(100), desc='2nd loop'):
        sleep(0.01)

In addition to tqdm features, the submodule provides a native Jupyter widget (compatible with IPython v1-v4 and Jupyter), fully working nested bars and colour hints (blue: normal, green: completed, red: error/interrupt, light blue: no ETA); as demonstrated below.

Screenshot-Jupyter1 Screenshot-Jupyter2 Screenshot-Jupyter3

It is also possible to let tqdm automatically choose between console or notebook versions by using the autonotebook submodule:

from tqdm.autonotebook import tqdm
tqdm.pandas()

Note that this will issue a TqdmExperimentalWarning if run in a notebook since it is not meant to be possible to distinguish between jupyter notebook and jupyter console. Use auto instead of autonotebook to suppress this warning.

tqdm may be inherited from to create custom callbacks (as with the TqdmUpTo example above) or for custom frontends (e.g. GUIs such as notebook or plotting packages). In the latter case:

  1. def __init__() to call super().__init__(..., gui=True) to disable terminal status_printer creation.
  2. Redefine: close(), clear(), display().

Consider overloading display() to use e.g. self.frontend(**self.format_dict) instead of self.sp(repr(self)).

tqdm/notebook.py and tqdm/gui.py submodules are examples of inheritance which don't (yet) strictly conform to the above recommendation.

You can use a tqdm as a meter which is not monotonically increasing. This could be because n decreases (e.g. a CPU usage monitor) or total changes.

One example would be recursively searching for files. The total is the number of objects found so far, while n is the number of those objects which are files (rather than folders):

from tqdm import tqdm
import os.path

def find_files_recursively(path, show_progress=True):
    files = []
    # total=1 assumes `path` is a file
    t = tqdm(total=1, unit="file", disable=not show_progress)
    if not os.path.exists(path):
        raise IOError("Cannot find:" + path)

    def append_found_file(f):
        files.append(f)
        t.update()

    def list_found_dir(path):
        """returns os.listdir(path) assuming os.path.isdir(path)"""
        listing = os.listdir(path)
        # subtract 1 since a "file" we found was actually this directory
        t.total += len(listing) - 1
        # fancy way to give info without forcing a refresh
        t.set_postfix(dir=path[-10:], refresh=False)
        t.update(0)  # may trigger a refresh
        return listing

    def recursively_search(path):
        if os.path.isdir(path):
            for f in list_found_dir(path):
                recursively_search(os.path.join(path, f))
        else:
            append_found_file(path)

    recursively_search(path)
    t.set_postfix(dir=path)
    t.close()
    return files

Using update(0) is a handy way to let tqdm decide when to trigger a display refresh to avoid console spamming.

This is a work in progress (see #737).

Since tqdm uses a simple printing mechanism to display progress bars, you should not write any message in the terminal using print() while a progressbar is open.

To write messages in the terminal without any collision with tqdm bar display, a .write() method is provided:

from tqdm import tqdm, trange
from time import sleep

bar = trange(10)
for i in bar:
    # Print using tqdm class method .write()
    sleep(0.1)
    if not (i % 3):
        tqdm.write("Done task %i" % i)
    # Can also use bar.write()

By default, this will print to standard output sys.stdout. but you can specify any file-like object using the file argument. For example, this can be used to redirect the messages writing to a log file or class.

If using a library that can print messages to the console, editing the library by replacing print() with tqdm.write() may not be desirable. In that case, redirecting sys.stdout to tqdm.write() is an option.

To redirect sys.stdout, create a file-like class that will write any input string to tqdm.write(), and supply the arguments file=sys.stdout, dynamic_ncols=True.

A reusable canonical example is given below:

from time import sleep
import contextlib
import sys
from tqdm import tqdm

class DummyTqdmFile(object):
    """Dummy file-like that will write to tqdm"""
    file = None
    def __init__(self, file):
        self.file = file

    def write(self, x):
        # Avoid print() second call (useless \n)
        if len(x.rstrip()) > 0:
            tqdm.write(x, file=self.file)

    def flush(self):
        return getattr(self.file, "flush", lambda: None)()

@contextlib.contextmanager
def std_out_err_redirect_tqdm():
    orig_out_err = sys.stdout, sys.stderr
    try:
        sys.stdout, sys.stderr = map(DummyTqdmFile, orig_out_err)
        yield orig_out_err[0]
    # Relay exceptions
    except Exception as exc:
        raise exc
    # Always restore sys.stdout/err if necessary
    finally:
        sys.stdout, sys.stderr = orig_out_err

def some_fun(i):
    print("Fee, fi, fo,".split()[i])

# Redirect stdout to tqdm.write() (don't forget the `as save_stdout`)
with std_out_err_redirect_tqdm() as orig_stdout:
    # tqdm needs the original stdout
    # and dynamic_ncols=True to autodetect console width
    for i in tqdm(range(3), file=orig_stdout, dynamic_ncols=True):
        sleep(.5)
        some_fun(i)

# After the `with`, printing is restored
print("Done!")

tqdm implements a few tricks to to increase efficiency and reduce overhead.

  • Avoid unnecessary frequent bar refreshing: mininterval defines how long to wait between each refresh. tqdm always gets updated in the background, but it will display only every mininterval.
  • Reduce number of calls to check system clock/time.
  • mininterval is more intuitive to configure than miniters. A clever adjustment system dynamic_miniters will automatically adjust miniters to the amount of iterations that fit into time mininterval. Essentially, tqdm will check if it's time to print without actually checking time. This behaviour can be still be bypassed by manually setting miniters.

However, consider a case with a combination of fast and slow iterations. After a few fast iterations, dynamic_miniters will set miniters to a large number. When iteration rate subsequently slows, miniters will remain large and thus reduce display update frequency. To address this:

  • maxinterval defines the maximum time between display refreshes. A concurrent monitoring thread checks for overdue updates and forces one where necessary.

The monitoring thread should not have a noticeable overhead, and guarantees updates at least every 10 seconds by default. This value can be directly changed by setting the monitor_interval of any tqdm instance (i.e. t = tqdm.tqdm(...); t.monitor_interval = 2). The monitor thread may be disabled application-wide by setting tqdm.tqdm.monitor_interval = 0 before instantiation of any tqdm bar.

GitHub-Commits GitHub-Issues GitHub-PRs OpenHub-Status GitHub-Contributions CII Best Practices

All source code is hosted on GitHub. Contributions are welcome.

See the CONTRIBUTING file for more information.

Developers who have made significant contributions, ranked by LoC (surviving lines of code, git fame -wMC), are:

Name ID LoC Notes
Casper da Costa-Luis casperdcl ~3/4 primary maintainer Gift-Casper
Stephen Larroque lrq3000 ~10% team member
Kyle Altendorf altendky ~2%  
Guangshuo Chen chengs ~1%  
Matthew Stevens mjstevens777 ~1%  
Noam Yorav-Raphael noamraph ~1% original author
Hadrien Mary hadim ~1% team member
Mikhail Korobov kmike ~1% team member

sourcerer-0 sourcerer-1 sourcerer-2 sourcerer-3 sourcerer-4 sourcerer-5 sourcerer-7

A list is available on this wiki page.

Open Source (OSI approved): LICENCE

Citation information: DOI (publication), DOI-code (code)

README-Hits (Since 19 May 2016)