Eigenvalue solvers to add for comparisons #1

Open · lobpcg opened this issue May 25, 2021 · 8 comments

lobpcg commented May 25, 2021

https://en.wikipedia.org/wiki/LOBPCG#General_software_implementations lists multiple implementations, including many for GPUs.
https://en.wikipedia.org/wiki/SLEPc has tons of various solvers, most of which should work on GPU; see https://www.mcs.anl.gov/petsc/documentation/installation.html#CUDA

rfeinman (Owner) commented:

@lobpcg Thank you for pointing these out! I am installing some of the GPU solvers now on my Linux machine and will report benchmarks ASAP.

lobpcg (Author) commented May 25, 2021

Also, lobpcg convergence and total timing may improve greatly if you run it for several eigenvalues rather than your single one, even if only one eigenvalue is actually needed.
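For illustration, a minimal sketch of that idea with SciPy's lobpcg: iterate a block of m > 1 vectors even though only the top eigenpair is wanted (the matrix and block size below are illustrative assumptions, not from the thread):

import numpy as np
from scipy.sparse import linalg as splinalg

n, m = 512, 5                          # block of m vectors, though we only need 1 eigenvalue
A = np.random.randn(n, n)
A = (A + A.T) / 2                      # illustrative dense symmetric test matrix
X = np.random.randn(n, m)              # m random starting vectors
vals, vecs = splinalg.lobpcg(A, X, largest=True, tol=1e-4, maxiter=200)
i = np.argmax(vals)                    # keep only the single eigenpair we actually need
top_val, top_vec = vals[i], vecs[:, i]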

rfeinman (Owner) commented May 26, 2021

@lobpcg Interesting - I wasn't aware of that.

In my simple test case (random, dense matrix with n=512) I'm not seeing a speedup from calling LOBPCG with several eigenvalues vs. one. And Lanczos-based ARPACK (eigsh) still appears to be considerably faster.

Code:

import torch
from torch.utils import benchmark
from scipy.sparse import linalg as splinalg

def sample_symmetric(n, dtype=torch.float64):
    # assumed helper (not defined in the thread): random dense symmetric matrix
    A = torch.randn(n, n, dtype=dtype)
    return (A + A.T) / 2

def eigen_solve(A, mode, largest=True, k=1, **kwargs):
    """solve for eigenpairs using a specified method"""
    tol = kwargs.pop('tol', 1e-4)
    if mode == 'torch_lobpcg':
        X = A.new_empty(A.size(0), k).normal_()
        return torch.lobpcg(A, X=X, largest=largest, tol=tol, **kwargs)
    elif mode == 'scipy_lobpcg':
        X = A.new_empty(A.size(0), k).normal_()
        return splinalg.lobpcg(A.numpy(), X.numpy(), largest=largest, tol=tol, **kwargs)
    elif mode == 'scipy_eigsh':
        which = 'LA' if largest else 'SA'
        return splinalg.eigsh(A.numpy(), k=k, which=which, tol=tol, **kwargs)
    else:
        raise ValueError(f"unknown mode: {mode}")

torch.manual_seed(1994)
A = sample_symmetric(512, dtype=torch.float64)
num_threads = torch.get_num_threads()
results = []
for k in [1, 3, 5, 7]:
    for mode in ['torch_lobpcg', 'scipy_lobpcg', 'scipy_eigsh']:
        res = benchmark.Timer(
            stmt="eigen_solve(A, mode, k=k)",
            setup="from __main__ import eigen_solve",
            globals=dict(A=A, mode=mode, k=k),
            num_threads=num_threads,
            label='eigensolvers',
            sub_label=mode,
            description='k={:d}'.format(k),
        ).blocked_autorange(min_run_time=1.)
        results.append(res)

compare = benchmark.Compare(results)
compare.print()

Result:

[------------------------- eigensolvers ------------------------]
                    |   k=1    |    k=3    |    k=5    |    k=7  
6 threads: ------------------------------------------------------
      torch_lobpcg  |  9032.6  |  28897.9  |  28547.6  |  26702.4
      scipy_lobpcg  |  7596.2  |   9626.7  |  10786.1  |  12516.1
      scipy_eigsh   |   852.6  |   2584.0  |   2860.6  |   3196.3

Times are in microseconds (us).

I will try again with a sparse matrix.

rfeinman (Owner) commented:

Here is the result with a very sparse symmetric matrix (using sparse BLAS). I am still not seeing the speedup from increasing the number of eigenpairs. Am I doing something wrong?

Code:

import numpy as np
import torch  # used below for torch.get_num_threads()
from torch.utils import benchmark
from scipy import sparse
from scipy.sparse import linalg as splinalg

def sample_tridiag(n):
    """sample a random symmetric tridiagonal (hence very sparse) matrix"""
    d = np.random.randn(n)
    e = np.random.randn(n-1)
    return sparse.diags([d, e, e], [0, 1, -1])

def eigen_solve(A, mode, largest=True, k=1, **kwargs):
    """solve for eigenpairs using a specified method"""
    tol = kwargs.pop('tol', 1e-4)
    if mode == 'scipy_lobpcg':
        X = np.random.randn(A.shape[0], k)
        return splinalg.lobpcg(A, X, largest=largest, tol=tol, **kwargs)
    elif mode == 'scipy_eigsh':
        which = 'LA' if largest else 'SA'
        return splinalg.eigsh(A, k=k, which=which, tol=tol, **kwargs)
    else:
        raise ValueError(f"unknown mode: {mode}")

np.random.seed(49)
A = sample_tridiag(512)
num_threads = torch.get_num_threads()
results = []
for k in [1, 3, 5, 7]:
    for mode in ['scipy_lobpcg', 'scipy_eigsh']:
        res = benchmark.Timer(
            stmt="eigen_solve(A, mode, k=k)",
            setup="from __main__ import eigen_solve",
            globals=dict(A=A, mode=mode, k=k),
            num_threads=num_threads,
            label='eigensolvers',
            sub_label=mode,
            description='k={:d}'.format(k),
        ).blocked_autorange(min_run_time=1.)
        results.append(res)

compare = benchmark.Compare(results)
compare.print()

Result:

[------------------------- eigensolvers ------------------------]
                    |   k=1    |    k=3    |    k=5    |    k=7  
6 threads: ------------------------------------------------------
      scipy_lobpcg  |  5963.7  |  12666.5  |  14763.2  |  16570.4
      scipy_eigsh   |   719.0  |   1463.4  |   2112.4  |   2298.3

Times are in microseconds (us).

lobpcg (Author) commented May 26, 2021

These results look plausible to me.
Unfortunately, lobpcg does not implement the ability to iterate a block of several vectors but stop as soon as a few of them reach the tolerance.

I have never tested lobpcg performance myself on small matrices, trying to outperform the full eigendecomposition.

Keep in mind that scipy_lobpcg is pure Python, while ARPACK, your code, and the full eigendecomposition are all compiled binaries. If you want a fair comparison, maybe try the LOBPCG implementation in Rust, which I remember is much faster than scipy_lobpcg or even my old C implementation; see rust-ndarray/ndarray-linalg#184 (comment)
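For reference, the full-eigendecomposition baseline mentioned above is a one-liner in NumPy; a rough timing sketch (the matrix here is an illustrative stand-in):

import time
import numpy as np

n = 512
A = np.random.randn(n, n)
A = (A + A.T) / 2                      # dense symmetric test matrix

t0 = time.perf_counter()
w, v = np.linalg.eigh(A)               # computes all n eigenpairs at once
elapsed = time.perf_counter() - t0
print(f"full eigh on n={n}: {elapsed * 1e3:.1f} ms, largest eigenvalue {w[-1]:.4f}")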

lobpcg (Author) commented May 26, 2021

> Here is the result with a very sparse symmetric matrix (using sparse BLAS). I am still not seeing the speedup from increasing the number of eigenpairs. Am I doing something wrong?

Hard to say, since you do not report the number of iterations. Your code looks OK to me.

rfeinman (Owner) commented May 26, 2021

Unfortunately the number of iterations is not returned by scipy lobpcg/eigsh (although it is in my own code). I will take a look at the compiled libraries that you mentioned, thanks.

I've been hoping to see compiled sparse eigensolvers in PyTorch, but nothing yet :(. There appears to be a LOBPCG implementation in MAGMA, which is installed automatically with PyTorch.

lobpcg (Author) commented May 26, 2021

> Unfortunately the number of iterations is not returned by scipy lobpcg/eigsh (although it is in my own code). I will take a look at the compiled libraries that you mentioned, thanks.

Per https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.linalg.lobpcg.html, lobpcg does return rnorms (a list of ndarrays, the history of residual norms) if retResidualNormsHistory is True; its length tells you the number of iterations.

You probably also need to set your own maxiter (the maximum number of iterations; the default in LOBPCG is maxiter = 20). There is a similar parameter in ARPACK.
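Concretely, a quick way to recover the LOBPCG iteration count from SciPy (the matrix setup below is illustrative):

import numpy as np
from scipy.sparse import linalg as splinalg

n, k = 512, 3
A = np.random.randn(n, n)
A = (A + A.T) / 2                      # illustrative symmetric test matrix
X = np.random.randn(n, k)

vals, vecs, rnorms = splinalg.lobpcg(
    A, X, largest=True, tol=1e-4, maxiter=200,
    retResidualNormsHistory=True,      # also return the residual-norm history
)
print(f"LOBPCG ran for {len(rnorms)} iterations")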
