Fixes issue #159 and greatly speeds up iteration of PMap in Python3 #243

Merged
merged 3 commits into from Feb 13, 2022

Commits on Feb 10, 2022

  1. Makes the PMap keys, values, and items methods return in O(1) time

    Adds a new class `PMapView` to the `pyrsistent._pmap` namespace. This class is
    intended to be private and essentially mimics the built-in `dict_values` and
    `dict_items` pseudo-types. Under the hood, a view just keeps a reference to the
    `PMap` object and uses its `iteritems` and `itervalues` methods; this means that
    creating a view takes constant time instead of the current `O(n)` time.
    
    The `keys` method of `PMap` does not use the `PMapView` class because `PSet` is
    already a `PMapView`-type class for `PMap` keys. I couldn't think of a reason not
    to just use `PSet`: it's the logical type for a set of keys of a persistent map,
    and it can just be passed a `PMap` to create such a set in constant time.
    
    The tests were also updated to reflect the reality that `keys` now returns a
    `PSet` instead of a `PVector` and that `values` and `items` return `PMapView`s.
    
    As a benchmark, the following code:
    
    ```python
    from timeit import timeit
    
    def make_setup(mapsize):
        return ('from pyrsistent import pmap, pvector; '
                'm = pmap({k:k for k in range(%d)})' % (mapsize,))
    
    # Each pair is (new view-based call, explicit PVector construction for comparison).
    tests = [('len(m.keys())',   'len(pvector(m.iterkeys()))'),
             ('len(m.values())', 'len(pvector(m.itervalues()))'),
             ('len(m.items())',  'len(pvector(m.iteritems()))')]
    
    # timeit runs each statement 1,000,000 times by default and returns the total
    # in seconds, which is numerically the mean time per call in microseconds.
    for mapsize in [10, 100, 1000]:
        print("Map size:", mapsize)
        setupstr = make_setup(mapsize)
        for new_stmt, old_stmt in tests:
            print('=' * 60)
            print('%-40s' % new_stmt, '%6f us' % timeit(new_stmt, setup=setupstr))
            print('%-40s' % old_stmt, '%6f us' % timeit(old_stmt, setup=setupstr))
            print('-' * 60)
        print('')
    ```
    
    produces the following output on my desktop:
    
    ```
    Map size: 10
    ============================================================
    len(m.keys())                            1.330579 us
    len(pvector(m.iterkeys()))               1.562125 us
    ------------------------------------------------------------
    ============================================================
    len(m.values())                          0.653008 us
    len(pvector(m.itervalues()))             1.584405 us
    ------------------------------------------------------------
    ============================================================
    len(m.items())                           0.653144 us
    len(pvector(m.iteritems()))              1.159099 us
    ------------------------------------------------------------
    
    Map size: 100
    ============================================================
    len(m.keys())                            1.350190 us
    len(pvector(m.iterkeys()))               11.702080 us
    ------------------------------------------------------------
    ============================================================
    len(m.values())                          0.650723 us
    len(pvector(m.itervalues()))             11.739465 us
    ------------------------------------------------------------
    ============================================================
    len(m.items())                           0.653189 us
    len(pvector(m.iteritems()))              8.951215 us
    ------------------------------------------------------------
    
    Map size: 1000
    ============================================================
    len(m.keys())                            1.379429 us
    len(pvector(m.iterkeys()))               114.202673 us
    ------------------------------------------------------------
    ============================================================
    len(m.values())                          0.679722 us
    len(pvector(m.itervalues()))             113.874931 us
    ------------------------------------------------------------
    ============================================================
    len(m.items())                           0.680877 us
    len(pvector(m.iteritems()))              87.645472 us
    ------------------------------------------------------------
    ```
    noahbenson committed Feb 10, 2022
    e2ab7c3
  2. 4ec6e5a

Commits on Feb 11, 2022

  1. 5cebec7
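
To illustrate the view idea described in the first commit message, here is a minimal sketch of a values view that stores only a reference to its map. It is not the `PMapView` code from this PR: `TinyMap` and `ValuesViewSketch` are hypothetical names, and `TinyMap` is a plain-dict stand-in for `PMap` that borrows only the `itervalues` method name from the description above.

```python
class TinyMap:
    """Hypothetical stand-in for a persistent map (not pyrsistent's PMap)."""

    def __init__(self, data):
        self._data = dict(data)

    def __len__(self):
        return len(self._data)

    def itervalues(self):
        return iter(self._data.values())


class ValuesViewSketch:
    """Lazy view over a map's values (hypothetical, for illustration only)."""

    def __init__(self, m):
        # Constant time: keep a reference to the map, copy nothing.
        self._map = m

    def __len__(self):
        # Delegates to the map, so no values are materialized.
        return len(self._map)

    def __iter__(self):
        # Values are produced only when the view is iterated.
        return self._map.itervalues()

    def __contains__(self, value):
        # Linear scan; values are not hashed or indexed.
        return any(v == value for v in self)


m = TinyMap({'a': 1, 'b': 2})
view = ValuesViewSketch(m)
print(len(view), sorted(view), 2 in view)
```

Under these assumptions, constructing the view costs the same regardless of map size, and the `O(n)` work is deferred to iteration or membership tests, which is the trade-off the benchmark above measures.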