ENH: Optimize performance of np.atleast_1d #26130

Merged: 4 commits merged on May 6, 2024

eendebakpt
Contributor

This PR improves the performance of np.atleast_1d for the single-argument case (and slightly improves the other cases).
np.atleast_1d is used in many other functions as part of their input pre-processing (see a GitHub search for np.atleast_1d).
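
A minimal sketch of what a single-argument fast path could look like, assuming the optimization amounts to handling `len(arys) == 1` before the general loop. The name `atleast_1d_sketch` is hypothetical, and this is an illustration rather than the code merged in this PR:

```python
import numpy as np


def atleast_1d_sketch(*arys):
    """Illustrative stand-in for np.atleast_1d (hypothetical name)."""
    # Fast path: handle the common single-argument call first, so it
    # skips the collection building needed for the general case.
    if len(arys) == 1:
        result = np.asanyarray(arys[0])
        if result.ndim == 0:
            # Promote scalars and 0-d arrays to shape (1,).
            result = result.reshape(1)
        return result
    # General path: convert each input and return a tuple of arrays.
    return tuple(atleast_1d_sketch(ary) for ary in arys)
```

With this shape, a call such as `atleast_1d_sketch(x)` on an existing 1-d array returns after one conversion and one `ndim` check, which is the case the benchmark below targets.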

Benchmark

atleast_1d(1): Mean +- std dev: [main] 489 ns +- 16 ns -> [pr] 422 ns +- 9 ns: 1.16x faster
atleast_1d(x): Mean +- std dev: [main] 224 ns +- 9 ns -> [pr] 159 ns +- 7 ns: 1.41x faster
atleast_1d(1, x ): Mean +- std dev: [main] 592 ns +- 26 ns -> [pr] 605 ns +- 24 ns: 1.02x slower
polyadd: Mean +- std dev: [main] 2.50 us +- 0.11 us -> [pr] 2.26 us +- 0.08 us: 1.10x faster

Geometric mean: 1.15x faster
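
As a quick check of the summary figure, the geometric mean can be recomputed from the mean timings listed above. This sketch uses only the rounded means shown in the output, so it may differ slightly from what pyperf computes internally; the variable names are illustrative:

```python
import math

# (main, pr) mean timings in nanoseconds, copied from the results above.
timings_ns = {
    "atleast_1d(1)": (489.0, 422.0),
    "atleast_1d(x)": (224.0, 159.0),
    "atleast_1d(1, x )": (592.0, 605.0),
    "polyadd": (2500.0, 2260.0),
}

# Speedup > 1 means the PR is faster, < 1 means it is slower.
speedups = [main / pr for main, pr in timings_ns.values()]

# Geometric mean: the n-th root of the product of the n speedups.
geometric_mean = math.prod(speedups) ** (1 / len(speedups))
print(f"Geometric mean: {geometric_mean:.2f}x faster")
```

The result agrees with the reported geometric mean to two decimals, with the slightly slower two-argument case pulling the average down.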
Benchmark script
import pyperf

runner = pyperf.Runner()

setup="""
import numpy as np
from numpy.polynomial import Polynomial

x = np.array([2, 3, 4])
y = np.array([2, 3])

p = Polynomial([1, 2, 3.])
"""
# Plain strings suffice here: none of the arguments interpolate values.
runner.timeit(name="atleast_1d(1)", stmt="np.atleast_1d(1)", setup=setup)
runner.timeit(name="atleast_1d(x)", stmt="np.atleast_1d(x)", setup=setup)
runner.timeit(name="atleast_1d(1, x )", stmt="np.atleast_1d(1, x )", setup=setup)
runner.timeit(name="polyadd", stmt="np.polyadd(x, y)", setup=setup)

@ngoldbaum added the "triage review" label (Issue/PR to be discussed at the next triage meeting) on Apr 17, 2024
@ngoldbaum
Member

We looked at this at the triage meeting and think this needs asv benchmarks. It would be good to see that the slowdown for the list of arrays case is not significant. It would also be good to add a comment explaining that the single-element case is done first for performance reasons.

@ngoldbaum added the "triaged" label (Issue/PR that was discussed in a triage meeting) and removed the "triage review" label (Issue/PR to be discussed at the next triage meeting) on May 1, 2024
@eendebakpt
Contributor Author

> We looked at this at the triage meeting and think this needs asv benchmarks. It would be good to see that the slowdown for the list of arrays case is not significant. It would also be good to add a comment explaining that the single-element case is done first for performance reasons.

I added three asv benchmark functions. On my system no slowdown is visible for the list-of-arrays case, but the asv runs are noisy: the difference between main and this PR is typically smaller than the variation between runs. @ngoldbaum, could you run the benchmarks independently?
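
For readers unfamiliar with asv, a benchmark class of this kind might look roughly like the sketch below. Only the names `AtLeast1D` and `time_atleast_1d_single_argument` appear in this thread; the other two method names and the input data are assumptions made for illustration, not the benchmarks added in the PR:

```python
import numpy as np


class AtLeast1D:
    """asv-style benchmarks for np.atleast_1d (illustrative sketch)."""

    def setup(self):
        # asv calls setup() before timing; the inputs here are assumed.
        self.x = np.array([1, 2, 3])
        self.zero_d = np.float64(1.0)

    def time_atleast_1d_single_argument(self):
        # The fast path under discussion: one array argument.
        np.atleast_1d(self.x)

    def time_atleast_1d_multiple_arguments(self):
        # Hypothetical name: the list-of-arrays case.
        np.atleast_1d(self.x, self.x, self.x)

    def time_atleast_1d_zero_d(self):
        # Hypothetical name: 0-d inputs that need reshaping.
        np.atleast_1d(self.zero_d, self.zero_d, self.zero_d)
```

asv discovers methods whose names start with `time_` and reports their runtime, so covering both the single-argument and multi-argument calls makes any regression in the latter visible.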

@ngoldbaum
Member

Here's what I see running the new benchmarks on my laptop:

| Change   | Before [b2960879] <main>   | After [f6cb2a92] <atleast_1d>   |   Ratio | Benchmark (Parameter)                                      |
|----------|----------------------------|---------------------------------|---------|------------------------------------------------------------|
| -        | 206±10ns                   | 167±5ns                         |    0.81 | bench_shape_base.AtLeast1D.time_atleast_1d_single_argument |

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.

So on my machine the single-argument case is about 20% faster, and any decrease in performance for the list-of-arrays case is less than 5%. Seems worth it to me!
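
The "about 20%" figure follows from the table's Ratio column, which is the "after" time divided by the "before" time. A short worked calculation using the mean values from the table (the variable names are illustrative):

```python
# Mean timings from the asv table above, in nanoseconds.
before_ns = 206.0  # main
after_ns = 167.0   # this PR

# asv's ratio is after / before, so values below 1 mean faster.
ratio = after_ns / before_ns
time_saved_pct = (1 - ratio) * 100
speedup = before_ns / after_ns

print(f"ratio: {ratio:.2f}")
print(f"time saved: {time_saved_pct:.0f}%")
print(f"speedup: {speedup:.2f}x")
```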

@ngoldbaum merged commit 0310e17 into numpy:main on May 6, 2024
65 checks passed
@ngoldbaum
Member

Thanks @eendebakpt!

Labels: 01 - Enhancement; triaged (Issue/PR that was discussed in a triage meeting)

2 participants