Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: spatial: Rotation.apply is slow - replace np.einsum with @? #20532

Open
cosama opened this issue Apr 19, 2024 · 5 comments
Open

BUG: spatial: Rotation.apply is slow - replace np.einsum with @? #20532

cosama opened this issue Apr 19, 2024 · 5 comments
Labels
defect A clear bug or issue that prevents SciPy from being installed or used as expected scipy.spatial

Comments

@cosama
Copy link

cosama commented Apr 19, 2024

Describe your issue.

I just did a bit of benchmarking and scipy.spatial.transform.Rotation.apply is quite slow. It turned out the reason for this is that numpy.einsum is quite a bit slower than direct matrix multiplication using the @ operator. This seems to be a known issue: https://stackoverflow.com/questions/20149201/why-is-numpys-einsum-slower-than-numpys-built-in-functions.

Replacing the np.einsum('ijk,ik->ij', matrix[None, :, :], vector) call with ((matrix[None, :, :] @ vector.T))[0].T, should fix this. By the way turning on and off optimization with the optimize=True flag to numpy.einsum did not change anything for me. I can submit a pull request, just wanted to see if others can reproduce this.

Here is the faulty line: https://github.com/scipy/scipy/blob/main/scipy/spatial/transform/_rotation.pyx#L2537. This would require some thought about the inverse kwarg, but self.inverse().as_matrix(), would be one way of addressing that.

Reproducing Code Example

import sys
import numpy as np
import scipy

print("numpy", np.__version__)
print("scipy", scipy.__version__)
print("python", sys.version)

q = np.random.default_rng().random(4, dtype=float)
q /= np.linalg.norm(q)

vector = np.random.default_rng().random((100000, 3), dtype=float)
rot = scipy.spatial.transform.Rotation.from_quat(q)
matrix = rot.as_matrix()

%timeit rot.apply(vector)
rv = rot.apply(vector)

%timeit np.einsum('ijk,ik->ij', matrix[None, :, :], vector, optimize=True)
mv1 = np.einsum('ijk,ik->ij', matrix[None, :, :], vector, optimize=True)

%timeit (matrix @ vector.T).T
mv2 = (matrix @ vector.T).T

%timeit ((matrix[None, :, :] @ vector.T))[0].T
mv3 = ((matrix[None, :, :] @ vector.T))[0].T

print(np.allclose(rv, mv1))
print(np.allclose(rv, mv2))
print(np.allclose(rv, mv3))

Message

numpy 1.24.4
scipy 1.10.1
python 3.8.10 (default, Nov 22 2023, 10:22:35) 
[GCC 9.4.0]
1.66 ms ± 58.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
1.65 ms ± 14.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
101 µs ± 12.8 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
109 µs ± 15.9 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
True
True
True

SciPy/NumPy/Python version and system information

numpy 1.24.4
scipy 1.10.1
python 3.8.10 (default, Nov 22 2023, 10:22:35) 
[GCC 9.4.0]
@cosama cosama added the defect A clear bug or issue that prevents SciPy from being installed or used as expected label Apr 19, 2024
@lucascolley lucascolley changed the title BUG: Rotation.apply is slow BUG: spatial: Rotation.apply is slow - replace np.einsum with @? Apr 19, 2024
@scottshambaugh
Copy link
Contributor

scottshambaugh commented May 6, 2024

Can confirm that the matrix multiplication approaches are faster on my machine

@cosama updated comment with timing:

numpy 1.26.4
scipy 1.13.0
python 3.10.12 (main, Jul  5 2023, 18:54:27) [GCC 11.2.0]
2.01 ms ± 335 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
1.93 ms ± 183 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
227 µs ± 41.5 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
250 µs ± 80.4 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
True
True
True

@lucascolley
Copy link
Member

feel free to submit a PR @cosama !

@cosama
Copy link
Author

cosama commented May 6, 2024

@scottshambaugh @lucascolley Thanks for the feedback: I had recently a quick look into this and I am not sure if this actually hits some catch somewhere.

def create_xyz():
    return np.random.default_rng().random((100000, 3), dtype=float) * 100

%timeit create_xyz()

def mat():
    xyz = create_xyz()
    return (matrix @ xyz.T).T

%timeit mat()

def mat():
    xyz = create_xyz()
    return rot.apply(xyz)

%timeit mat()

no speedup here:

1.16 ms ± 28.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
2.14 ms ± 129 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
2.93 ms ± 86.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

@scottshambaugh
Copy link
Contributor

scottshambaugh commented May 6, 2024

I think that the similar timing in your comment is largely in the create_xyz() function? If you subtract off that 1.16 ms from your later times, the difference in the two methods is more dramatic.

Here's what I get if I pull out create_xyz():

xyz = np.random.default_rng().random((100000, 3), dtype=float) * 100

def mat1():
    return (matrix @ xyz.T).T
%timeit mat1()

def mat2():
    return rot.apply(xyz)
%timeit mat2()
136 µs ± 16.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
1.8 ms ± 140 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

@cosama
Copy link
Author

cosama commented May 6, 2024

@scottshambaugh Thanks for the reply. I just can't wrap my head around this, have been playing with it over the last 30min again.
If you guys think it is worth it I am happy to submit a PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
defect A clear bug or issue that prevents SciPy from being installed or used as expected scipy.spatial
Projects
None yet
Development

No branches or pull requests

4 participants
@cosama @scottshambaugh @lucascolley and others