-
-
Notifications
You must be signed in to change notification settings - Fork 410
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf: reallocate instead of clearing and repopulating set of selected points #6895
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @DanGonite57, thanks for the PR and the thorough benchmarking! Unfortunately, I think this cannot be merged.
The reason we do update
here is because Selection
is an evented object, which users are expected to connect to. For example:
points_layer.selected_data.events.connect(print)
This PR will break that use, because as soon as the Selection
object changes, previous connection will stop receiving updates.
Thanks for the feedback and the example, I knew I'd missed something! I noticed that replacing It may also be desirable to use > hyperfine --warmup 1 .\bench.py
Benchmark 1: .\bench.py
Time (mean ± σ): 4.045 s ± 0.042 s [User: 1.686 s, System: 0.477 s]
Range (min … max): 4.006 s … 4.136 s 10 runs |
Awesome! And thanks for upstreaming to psygnal :) |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #6895 +/- ##
==========================================
- Coverage 92.47% 92.42% -0.06%
==========================================
Files 612 612
Lines 55159 55159
==========================================
- Hits 51009 50980 -29
- Misses 4150 4179 +29 ☔ View full report in Codecov by Sentry. |
self._selected_data.intersection_update(selected_data) | ||
self._selected_data.update(selected_data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be nice to add a comment here to quickly explain why these 2 lines are like this! Other than that, looks good to me 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delay, I have added an explanatory comment
One more small thing: we use the PR description in the commit message -- could you update the PR description? I believe it's out of date for the use of |
Given the dramatic speedup of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, actually, I think this should be closed. (a) til about hyperfine, it's so good!, but (b), with psygnal 0.11.1, which includes @DanGonite57's psygnal speedup, I get a ~15% slowdown with this PR:
main:
$ hyperfine --warmup 1 'python examples/dev/bench-points-selection.py'
Benchmark 1: python examples/dev/bench-points-selection.py
Time (mean ± σ): 1.946 s ± 0.015 s [User: 1.691 s, System: 0.240 s]
Range (min … max): 1.927 s … 1.979 s 10 runs
this PR:
$ hyperfine --warmup 1 'python examples/dev/bench-points-selection.py'
Benchmark 1: python examples/dev/bench-points-selection.py
Time (mean ± σ): 2.259 s ± 0.069 s [User: 1.983 s, System: 0.252 s]
Range (min … max): 2.191 s … 2.349 s 10 runs
@DanGonite57 do you agree?
The Qt benchmark run requested by PR #6895 (176d3e2 vs a9c25e6) has finished with status 'failure'. See the CI logs and artifacts for further details. |
The Qt benchmark run requested by PR #6895 (176d3e2 vs a9c25e6) has finished with status 'success'. See the CI logs and artifacts for further details. |
Yes, I think the slowdown is understandable given that E.g.: import numpy as np
from napari.layers.points import Points, _points_key_bindings as key_bindings
rng = np.random.default_rng(seed=0)
data = rng.random(size=(4, 2)) * 1000
layer = Points(data, size=1)
layer.mode = 'select'
layer._set_view_slice()
layer.selected_data.events.connect(print)
key_bindings.select_all_in_slice(layer) # Selection 1
layer.selected_data = set(layer._indices_view[1:4]) # Selection 2
layer.selected_data = set(layer._indices_view[0:3]) # Selection 3 clear: EmissionInfo(signal=<SignalInstance '_current' on <SignalGroup 'SelectionEvents' with 3 signals>>, args=(0,))
...SignalInstance 'active' on <...>>, args=(0,))
...SignalInstance 'active' on <...>>, args=(None,))
...SignalInstance 'items_changed' on <...>>, args=((0, 1, 2, 3), ())) # Selection 1 update
...SignalInstance '_current' on <...>>, args=(3,))
...SignalInstance 'active' on <...>>, args=(3,))
...SignalInstance 'active' on <...>>, args=(None,))
...SignalInstance 'items_changed' on <...>>, args=((), (0, 1, 2, 3))) # Selection 2 clear
...SignalInstance '_current' on <...>>, args=(1,))
...SignalInstance 'active' on <...>>, args=(1,))
...SignalInstance 'active' on <...>>, args=(None,))
...SignalInstance 'items_changed' on <...>>, args=((1, 2, 3), ())) # Selection 2 update
...SignalInstance '_current' on <...>>, args=(3,))
...SignalInstance 'active' on <...>>, args=(3,))
...SignalInstance 'active' on <...>>, args=(None,))
...SignalInstance 'items_changed' on <...>>, args=((), (1, 2, 3))) # Selection 3 clear
...SignalInstance '_current' on <...>>, args=(0,))
...SignalInstance 'active' on <...>>, args=(0,))
...SignalInstance 'active' on <...>>, args=(None,))
...SignalInstance 'items_changed' on <...>>, args=((0, 1, 2), ())) # Selection 3 update intersection_update: EmissionInfo(signal=<SignalInstance '_current' on <SignalGroup 'SelectionEvents' with 3 signals>>, args=(0,))
...SignalInstance 'active' on <...>, args=(0,))
...SignalInstance 'active' on <...>, args=(None,))
...SignalInstance 'items_changed' on <...>, args=((0, 1, 2, 3), ())) # Selection 1 update
...SignalInstance 'items_changed' on <...>, args=((), (0,))) # Selection 2 intersection
...SignalInstance 'items_changed' on <...>, args=((), (3,))) # Selection 3 intersection
...SignalInstance 'items_changed' on <...>, args=((0,), ())) # Selection 3 update I haven't personally used napari events, so I don't know which is desirable. I would be happy for this PR to be closed if this change is not necessary. p.s. in the case of intersection_update, is the fact the '_current' and 'active' selections don't change an issue? (For example, when point 0 is deselected it is still the active and _current point?) |
The Qt benchmark run requested by PR #6895 (27d4a9b vs 87e14c7) has finished with status 'success'. See the CI logs and artifacts for further details. |
I think you're right, the behaviour in this PR is better!
This does sound like an issue; should we maybe manually update the current/active to point to the last selected point in that case? and fire the event? |
References and relevant issues
Contributes towards #6746
Description
Selection.clear()
andSelection.update()
are both O(n) time operations, which means that clearing (and repopulating) an initially large selection set can take a significant amount of time. Reallocating as a new set skips needing to clear in O(n) time, achieving rough parity with.update()
by itself and relatively immediate deselection of large numbers of points in usage.bench.py
Before:
After:
I did note #5691 (comment), and I am not sure whether maintaining a reference to the same set happens to still be important anywhere. As far as I have seen, at least internally,
.selected_data
is always re-requested from the class object, so it shouldn't matter whether it is the same set object or a new one.Repeatedly reallocating also does not appear to have an impact on memory usage.
test_mem.py
cmd