
perf: reallocate instead of clearing and repopulating set of selected points #6895

Open · wants to merge 7 commits into main
Conversation

@DanGonite57 (Contributor) commented May 6, 2024

References and relevant issues

Contributes towards #6746

Description

Selection.clear() and Selection.update() are both O(n) time operations, which means that clearing (and repopulating) an initially large selection set can take a significant amount of time. Reallocating as a new set skips needing to clear in O(n) time, achieving rough parity with .update() by itself and relatively immediate deselection of large numbers of points in usage.
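The two patterns being compared can be sketched with plain Python sets (a stand-in for psygnal's evented Selection; with plain sets both are fast, so this illustrates the pattern rather than the timings):

```python
def clear_then_update(selection: set, new_items: set) -> set:
    # Original approach: empty the existing set in place, then refill it.
    # For an evented Selection, both steps are O(n) and each emits events.
    selection.clear()
    selection.update(new_items)
    return selection


def reallocate(selection: set, new_items: set) -> set:
    # This PR's approach: build a fresh set and rebind the attribute,
    # skipping the O(n) clear. The old set object is simply dropped.
    return set(new_items)


old = {0, 1, 2, 3}
assert clear_then_update(set(old), {2, 3, 4}) == {2, 3, 4}
assert reallocate(set(old), {2, 3, 4}) == {2, 3, 4}
```

Both end in the same state; the difference is whether the same set object survives, which matters for anything holding a reference to it.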

bench.py

import numpy as np

from napari.layers.points import Points, _points_key_bindings as key_bindings

rng = np.random.default_rng(seed=0)
data = rng.random(size=(100_000, 2)) * 1000
layer = Points(data, size=1)
layer.mode = 'select'
layer._set_view_slice()

key_bindings.select_all_in_slice(layer)
key_bindings.select_all_in_slice(layer)

Before:

> hyperfine --warmup 1 .\bench.py
Benchmark 1: .\bench.py
  Time (mean ± σ):      6.546 s ±  0.110 s    [User: 4.267 s, System: 0.381 s]
  Range (min … max):    6.453 s …  6.836 s    10 runs

After:

> hyperfine --warmup 1 .\bench.py
Benchmark 1: .\bench.py
  Time (mean ± σ):      3.760 s ±  0.073 s    [User: 1.563 s, System: 0.402 s]
  Range (min … max):    3.708 s …  3.952 s    10 runs

I did note #5691 (comment), and I am not sure whether maintaining a reference to the same set happens to still be important anywhere. As far as I have seen, at least internally, .selected_data is always re-requested from the class object, so it shouldn't matter whether it is the same set object or a new one.

Repeatedly reallocating also does not appear to have an impact on memory usage.

test_mem.py

import numpy as np

from napari.layers.points import Points, _points_key_bindings as key_bindings

rng = np.random.default_rng(seed=0)
data = rng.random(size=(100_000, 2)) * 1000
layer = Points(data, size=1)
layer.mode = 'select'
layer._set_view_slice()

for _ in range(10):
    key_bindings.select_all_in_slice(layer)
    key_bindings.select_all_in_slice(layer)

cmd

> pip install memory_profiler
> mprof run test_mem.py
> mprof plot

@brisvag (Contributor) left a comment

Hi @DanGonite57, thanks for the PR and the thorough benchmarking! Unfortunately, I think this cannot be merged.

The reason we do update here is because Selection is an evented object, which users are expected to connect to. For example:

points_layer.selected_data.events.connect(print)

This PR will break that use, because as soon as the Selection object changes, previous connection will stop receiving updates.
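The breakage can be demonstrated with a toy evented set (illustrative only, not psygnal's actual Selection API):

```python
class EventedSet(set):
    """Minimal stand-in for an evented set: callbacks see updates."""

    def __init__(self, *args):
        super().__init__(*args)
        self._callbacks = []

    def connect(self, cb):
        self._callbacks.append(cb)

    def update(self, items):
        super().update(items)
        for cb in self._callbacks:
            cb(("added", tuple(sorted(items))))


received = []
selection = EventedSet()
selection.connect(received.append)   # user connects to *this* object

selection.update({1, 2})             # observer sees this change
selection = EventedSet({3})          # reallocation rebinds the name...
selection.update({4})                # ...so the old callback never fires

assert received == [("added", (1, 2))]
```

The second `update` is silently lost to the observer, because its connection lives on the old, discarded object.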

@DanGonite57 (Contributor, Author) replied:

Thanks for the feedback and the example, I knew I'd missed something!

I noticed that replacing .clear() with .intersection_update(selected_data) results in a similar speed increase, which I guess is because psygnal's intersection_update implementation relies on discard, rather than on pop from the underlying generic set ABC, which is apparently extremely slow. I will initially try to improve this upstream, but this may be an option if that is not possible.

It may also be desirable to use intersection_update regardless, if sets of selections often have large overlaps?

> hyperfine --warmup 1 .\bench.py
Benchmark 1: .\bench.py
  Time (mean ± σ):      4.045 s ±  0.042 s    [User: 1.686 s, System: 0.477 s]
  Range (min … max):    4.006 s …  4.136 s    10 runs
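For reference, the discard-based behaviour of intersection_update can be sketched with plain sets (an illustrative model, not psygnal's actual implementation):

```python
def intersection_update_sketch(selection: set, keep: set) -> None:
    # Discard, rather than pop, each element not in `keep`; elements that
    # remain selected are never touched, so no remove/add event pair is
    # needed for them in the evented case.
    for item in set(selection):   # iterate over a copy while mutating
        if item not in keep:
            selection.discard(item)


s = {0, 1, 2, 3}
intersection_update_sketch(s, {1, 2, 5})
assert s == {1, 2}   # 0 and 3 discarded; 5 was never in the selection
```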

@brisvag (Contributor) commented May 7, 2024

Awesome! And thanks for upstreaming to psygnal :) intersection_update sounds fitting, actually; can you change to that, and then we'll check whether the tests pass?

codecov bot commented May 9, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.42%. Comparing base (87e14c7) to head (b1f653a).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #6895      +/-   ##
==========================================
- Coverage   92.47%   92.42%   -0.06%     
==========================================
  Files         612      612              
  Lines       55159    55159              
==========================================
- Hits        51009    50980      -29     
- Misses       4150     4179      +29     


Comment on lines +1482 to +1483
self._selected_data.intersection_update(selected_data)
self._selected_data.update(selected_data)
@brisvag (Contributor) commented May 13, 2024

Would be nice to add a comment here to quickly explain why these 2 lines are like this! Other than that, looks good to me 👍

@DanGonite57 (Contributor, Author) replied:

Sorry for the delay, I have added an explanatory comment.

@brisvag brisvag added the performance Relates to performance label May 17, 2024
@brisvag brisvag added this to the 0.5.0 milestone May 17, 2024
@brisvag brisvag added the ready to merge Last chance for comments! Will be merged in ~24h label May 17, 2024
@psobolewskiPhD (Member) commented:

One more small thing: we use the PR description in the commit message -- could you update the PR description? I believe it's out of date now that intersection_update is used.

@jni (Member) commented May 27, 2024

Given the dramatic speedup of clear in psygnal, is it still true that intersection_update is preferable?

@jni (Member) left a comment

Indeed, actually, I think this should be closed. (a) TIL about hyperfine, it's so good! But (b), with psygnal 0.11.1, which includes @DanGonite57's psygnal speedup, I get a ~15% slowdown with this PR:

main:

$ hyperfine --warmup 1 'python examples/dev/bench-points-selection.py'
Benchmark 1: python examples/dev/bench-points-selection.py
  Time (mean ± σ):      1.946 s ±  0.015 s    [User: 1.691 s, System: 0.240 s]
  Range (min … max):    1.927 s …  1.979 s    10 runs

this PR:

$ hyperfine --warmup 1 'python examples/dev/bench-points-selection.py'
Benchmark 1: python examples/dev/bench-points-selection.py
  Time (mean ± σ):      2.259 s ±  0.069 s    [User: 1.983 s, System: 0.252 s]
  Range (min … max):    2.191 s …  2.349 s    10 runs

@DanGonite57 do you agree?

@jni jni removed the ready to merge Last chance for comments! Will be merged in ~24h label May 27, 2024
@Czaki Czaki added run-benchmarks Add this label to trigger a full benchmark run in a PR labels May 27, 2024
github-actions bot:

The Qt benchmark run requested by PR #6895 (176d3e2 vs a9c25e6) has finished with status 'failure'. See the CI logs and artifacts for further details.
The non-Qt benchmark run requested by PR #6895 (176d3e2 vs a9c25e6) has finished with status 'success'. See the CI logs and artifacts for further details.

@github-actions github-actions bot removed the run-benchmarks Add this label to trigger a full benchmark run in a PR label May 27, 2024
github-actions bot:

The Qt benchmark run requested by PR #6895 (176d3e2 vs a9c25e6) has finished with status 'success'. See the CI logs and artifacts for further details.
The non-Qt benchmark run requested by PR #6895 (176d3e2 vs a9c25e6) has finished with status 'success'. See the CI logs and artifacts for further details.

@Czaki Czaki added the run-benchmarks Add this label to trigger a full benchmark run in a PR label May 27, 2024
@DanGonite57 (Contributor, Author) replied:

Yes, I think the slowdown is understandable, given that intersection_update operates on each item individually while clear handles them all in bulk. The main question now is whether the behaviour of intersection_update is more sensible than that of clear: clear-then-update emits an event for the removal of all points and then for the re-addition of all selections, whereas intersection_update-then-update only emits events for the removal of points that are no longer selected and for the addition of newly selected points.

E.g.:

import numpy as np

from napari.layers.points import Points, _points_key_bindings as key_bindings

rng = np.random.default_rng(seed=0)
data = rng.random(size=(4, 2)) * 1000
layer = Points(data, size=1)
layer.mode = 'select'
layer._set_view_slice()
layer.selected_data.events.connect(print)

key_bindings.select_all_in_slice(layer)  # Selection 1
layer.selected_data = set(layer._indices_view[1:4])  # Selection 2
layer.selected_data = set(layer._indices_view[0:3])  # Selection 3

clear:

EmissionInfo(signal=<SignalInstance '_current' on <SignalGroup 'SelectionEvents' with 3 signals>>, args=(0,))
...SignalInstance 'active' on <...>>, args=(0,))
...SignalInstance 'active' on <...>>, args=(None,))
...SignalInstance 'items_changed' on <...>>, args=((0, 1, 2, 3), ()))  # Selection 1 update
...SignalInstance '_current' on <...>>, args=(3,))
...SignalInstance 'active' on <...>>, args=(3,))
...SignalInstance 'active' on <...>>, args=(None,))
...SignalInstance 'items_changed' on <...>>, args=((), (0, 1, 2, 3)))  # Selection 2 clear
...SignalInstance '_current' on <...>>, args=(1,))
...SignalInstance 'active' on <...>>, args=(1,))
...SignalInstance 'active' on <...>>, args=(None,))
...SignalInstance 'items_changed' on <...>>, args=((1, 2, 3), ()))  # Selection 2 update
...SignalInstance '_current' on <...>>, args=(3,))
...SignalInstance 'active' on <...>>, args=(3,))
...SignalInstance 'active' on <...>>, args=(None,))
...SignalInstance 'items_changed' on <...>>, args=((), (1, 2, 3)))  # Selection 3 clear
...SignalInstance '_current' on <...>>, args=(0,))
...SignalInstance 'active' on <...>>, args=(0,))
...SignalInstance 'active' on <...>>, args=(None,))
...SignalInstance 'items_changed' on <...>>, args=((0, 1, 2), ()))  # Selection 3 update

intersection_update:

EmissionInfo(signal=<SignalInstance '_current' on <SignalGroup 'SelectionEvents' with 3 signals>>, args=(0,))
...SignalInstance 'active' on <...>, args=(0,))
...SignalInstance 'active' on <...>, args=(None,))
...SignalInstance 'items_changed' on <...>, args=((0, 1, 2, 3), ()))  # Selection 1 update
...SignalInstance 'items_changed' on <...>, args=((), (0,)))  # Selection 2 intersection
...SignalInstance 'items_changed' on <...>, args=((), (3,)))  # Selection 3 intersection
...SignalInstance 'items_changed' on <...>, args=((0,), ()))  # Selection 3 update

I haven't personally used napari events, so I don't know which is desirable. I would be happy for this PR to be closed if this change is not necessary.

p.s. in the case of intersection_update, is the fact the '_current' and 'active' selections don't change an issue? (For example, when point 0 is deselected it is still the active and _current point?)
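The difference in the items_changed payloads above can be sketched by computing both event sequences for the Selection 2 to Selection 3 transition, {1, 2, 3} to {0, 1, 2}, using plain (added, removed) tuples in place of psygnal's actual event objects:

```python
def clear_update_events(old: set, new: set):
    # clear() reports everything removed, then update() reports
    # everything added, even for elements selected in both states.
    return [((), tuple(sorted(old))), (tuple(sorted(new)), ())]


def intersection_update_events(old: set, new: set):
    # Only the genuine differences are reported.
    removed = tuple(sorted(old - new))
    added = tuple(sorted(new - old))
    events = []
    if removed:
        events.append(((), removed))
    if added:
        events.append((added, ()))
    return events


old, new = {1, 2, 3}, {0, 1, 2}
assert clear_update_events(old, new) == [((), (1, 2, 3)), ((0, 1, 2), ())]
assert intersection_update_events(old, new) == [((), (3,)), ((0,), ())]
```

The asserted payloads match the items_changed args in the two logs above: the clear path churns through points 1 and 2 even though they stay selected, while the intersection path touches only points 0 and 3.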

github-actions bot:

The Qt benchmark run requested by PR #6895 (27d4a9b vs 87e14c7) has finished with status 'success'. See the CI logs and artifacts for further details.
The non-Qt benchmark run requested by PR #6895 (27d4a9b vs 87e14c7) has finished with status 'success'. See the CI logs and artifacts for further details.

@github-actions github-actions bot removed the run-benchmarks Add this label to trigger a full benchmark run in a PR label May 27, 2024
@brisvag (Contributor) commented May 28, 2024

I think you're right, the behaviour in this PR is better!

p.s. in the case of intersection_update, is the fact the '_current' and 'active' selections don't change an issue? (For example, when point 0 is deselected it is still the active and _current point?)

This does sound like an issue; should we perhaps manually update current/active to point to the last selected point in that case, and fire the corresponding event?

Labels
performance Relates to performance

5 participants