Eigenvalue solvers to add for comparisons #1

Open · lobpcg opened this issue May 25, 2021 · 8 comments

lobpcg commented May 25, 2021

https://en.wikipedia.org/wiki/LOBPCG#General_software_implementations lists multiple implementations, including many for GPUs.
https://en.wikipedia.org/wiki/SLEPc has tons of various solvers, most of which should work on GPU; see https://www.mcs.anl.gov/petsc/documentation/installation.html#CUDA

rfeinman (Owner) commented:

@lobpcg Thank you for pointing these out! I am installing some of the GPU solvers now on my Linux machine and will report benchmarks ASAP.

lobpcg (Author) commented May 25, 2021

Also, lobpcg convergence and total timing may improve greatly if you run it for several eigenvalues rather than your single one, even if only one eigenvalue is actually needed.
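For illustration, a minimal sketch of that idea with SciPy's lobpcg: iterate a block of m > 1 vectors even though only the top eigenpair is wanted (the matrix and block size below are illustrative assumptions, not from the thread):

import numpy as np
from scipy.sparse import linalg as splinalg

n, m = 512, 5                          # block of m vectors, though we only need 1 eigenvalue
A = np.random.randn(n, n)
A = (A + A.T) / 2                      # illustrative dense symmetric test matrix
X = np.random.randn(n, m)              # m random starting vectors
vals, vecs = splinalg.lobpcg(A, X, largest=True, tol=1e-4, maxiter=200)
i = np.argmax(vals)                    # keep only the single eigenpair we actually need
top_val, top_vec = vals[i], vecs[:, i]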

rfeinman (Owner) commented May 26, 2021

@lobpcg Interesting - I wasn't aware of that.

In my simple test case (random, dense matrix with n=512) I'm not seeing a speedup from calling LOBPCG with several eigenvalues vs. one. And Lanczos-based ARPACK (eigsh) still appears to be considerably faster.

Code:

import torch
from torch.utils import benchmark
from scipy.sparse import linalg as splinalg

def sample_symmetric(n, dtype=torch.float64):
    # assumed helper (not defined in the thread): random dense symmetric matrix
    A = torch.randn(n, n, dtype=dtype)
    return (A + A.T) / 2

def eigen_solve(A, mode, largest=True, k=1, **kwargs):
    """solve for eigenpairs using a specified method"""
    tol = kwargs.pop('tol', 1e-4)
    if mode == 'torch_lobpcg':
        X = A.new_empty(A.size(0), k).normal_()
        return torch.lobpcg(A, X=X, largest=largest, tol=tol, **kwargs)
    elif mode == 'scipy_lobpcg':
        X = A.new_empty(A.size(0), k).normal_()
        return splinalg.lobpcg(A.numpy(), X.numpy(), largest=largest, tol=tol, **kwargs)
    elif mode == 'scipy_eigsh':
        which = 'LA' if largest else 'SA'
        return splinalg.eigsh(A.numpy(), k=k, which=which, tol=tol, **kwargs)
    else:
        raise ValueError(f"unknown mode: {mode}")

torch.manual_seed(1994)
A = sample_symmetric(512, dtype=torch.float64)
num_threads = torch.get_num_threads()
results = []
for k in [1, 3, 5, 7]:
    for mode in ['torch_lobpcg', 'scipy_lobpcg', 'scipy_eigsh']:
        res = benchmark.Timer(
            stmt="eigen_solve(A, mode, k=k)",
            setup="from __main__ import eigen_solve",
            globals=dict(A=A, mode=mode, k=k),
            num_threads=num_threads,
            label='eigensolvers',
            sub_label=mode,
            description='k={:d}'.format(k),
        ).blocked_autorange(min_run_time=1.)
        results.append(res)

compare = benchmark.Compare(results)
compare.print()

Result:

[------------------------- eigensolvers ------------------------]
                    |   k=1    |    k=3    |    k=5    |    k=7  
6 threads: ------------------------------------------------------
      torch_lobpcg  |  9032.6  |  28897.9  |  28547.6  |  26702.4
      scipy_lobpcg  |  7596.2  |   9626.7  |  10786.1  |  12516.1
      scipy_eigsh   |   852.6  |   2584.0  |   2860.6  |   3196.3

Times are in microseconds (us).

I will try again with a sparse matrix.

rfeinman (Owner) commented:

Here is the result with a very sparse symmetric matrix (using sparse BLAS). I am still not seeing the speedup from increasing the number of eigenpairs. Am I doing something wrong?

Code:

import numpy as np
import torch  # used below for torch.get_num_threads()
from torch.utils import benchmark
from scipy import sparse
from scipy.sparse import linalg as splinalg

def sample_tridiag(n):
    """sample a random symmetric tridiagonal (hence very sparse) matrix"""
    d = np.random.randn(n)
    e = np.random.randn(n-1)
    return sparse.diags([d, e, e], [0, 1, -1])

def eigen_solve(A, mode, largest=True, k=1, **kwargs):
    """solve for eigenpairs using a specified method"""
    tol = kwargs.pop('tol', 1e-4)
    if mode == 'scipy_lobpcg':
        X = np.random.randn(A.shape[0], k)
        return splinalg.lobpcg(A, X, largest=largest, tol=tol, **kwargs)
    elif mode == 'scipy_eigsh':
        which = 'LA' if largest else 'SA'
        return splinalg.eigsh(A, k=k, which=which, tol=tol, **kwargs)
    else:
        raise ValueError(f"unknown mode: {mode}")

np.random.seed(49)
A = sample_tridiag(512)
num_threads = torch.get_num_threads()
results = []
for k in [1, 3, 5, 7]:
    for mode in ['scipy_lobpcg', 'scipy_eigsh']:
        res = benchmark.Timer(
            stmt="eigen_solve(A, mode, k=k)",
            setup="from __main__ import eigen_solve",
            globals=dict(A=A, mode=mode, k=k),
            num_threads=num_threads,
            label='eigensolvers',
            sub_label=mode,
            description='k={:d}'.format(k),
        ).blocked_autorange(min_run_time=1.)
        results.append(res)

compare = benchmark.Compare(results)
compare.print()

Result:

[------------------------- eigensolvers ------------------------]
                    |   k=1    |    k=3    |    k=5    |    k=7  
6 threads: ------------------------------------------------------
      scipy_lobpcg  |  5963.7  |  12666.5  |  14763.2  |  16570.4
      scipy_eigsh   |   719.0  |   1463.4  |   2112.4  |   2298.3

Times are in microseconds (us).

lobpcg (Author) commented May 26, 2021

These results look plausible to me.
Unfortunately, lobpcg does not implement the ability to iterate a block of several vectors but stop as soon as a few of them reach the tolerance.

I have never tested lobpcg performance myself on small matrices, trying to outperform the full eigendecomposition.

Keep in mind that scipy_lobpcg is pure Python, while ARPACK, your code, and the full eigendecomposition are all compiled binaries. If you want a fair comparison, maybe try the LOBPCG implementation in Rust, which I remember is much faster than scipy_lobpcg or even my old C implementation; see rust-ndarray/ndarray-linalg#184 (comment)
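For reference, the full-eigendecomposition baseline mentioned above is a one-liner in NumPy; a rough timing sketch (the matrix here is an illustrative stand-in):

import time
import numpy as np

n = 512
A = np.random.randn(n, n)
A = (A + A.T) / 2                      # dense symmetric test matrix

t0 = time.perf_counter()
w, v = np.linalg.eigh(A)               # computes all n eigenpairs at once
elapsed = time.perf_counter() - t0
print(f"full eigh on n={n}: {elapsed * 1e3:.1f} ms, largest eigenvalue {w[-1]:.4f}")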

lobpcg (Author) commented May 26, 2021

> Here is the result with a very sparse symmetric matrix (using sparse BLAS). I am still not seeing the speedup from increasing the number of eigenpairs. Am I doing something wrong?

Hard to say, since you do not report the number of iterations. Your code looks OK to me.

rfeinman (Owner) commented May 26, 2021

Unfortunately the number of iterations is not returned by scipy lobpcg/eigsh (although it is in my own code). I will take a look at the compiled libraries that you mentioned, thanks.

I've been hoping to see compiled sparse eigensolvers in PyTorch, but nothing yet :(. There appears to be a LOBPCG implementation in MAGMA, which is installed automatically with PyTorch.

lobpcg (Author) commented May 26, 2021

> Unfortunately the number of iterations is not returned by scipy lobpcg/eigsh (although it is in my own code). I will take a look at the compiled libraries that you mentioned, thanks.

Per https://docs.scipy.org/doc/scipy/reference/generated/scipy.sparse.linalg.lobpcg.html, lobpcg does return rnorms (a list of ndarrays, the history of residual norms) if retResidualNormsHistory is True; its length tells you the number of iterations.

You probably also need to set your own maxiter (the maximum number of iterations; the default in LOBPCG is maxiter = 20). There is a similar parameter in ARPACK.
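Concretely, a quick way to recover the LOBPCG iteration count from SciPy (the matrix setup below is illustrative):

import numpy as np
from scipy.sparse import linalg as splinalg

n, k = 512, 3
A = np.random.randn(n, n)
A = (A + A.T) / 2                      # illustrative symmetric test matrix
X = np.random.randn(n, k)

vals, vecs, rnorms = splinalg.lobpcg(
    A, X, largest=True, tol=1e-4, maxiter=200,
    retResidualNormsHistory=True,      # also return the residual-norm history
)
print(f"LOBPCG ran for {len(rnorms)} iterations")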
