Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: stats.gaussian_kde: replace use of inv_cov in logpdf #16987

Merged
merged 2 commits into from
Sep 20, 2022

Conversation

mdhaber
Copy link
Contributor

@mdhaber mdhaber commented Sep 8, 2022

Reference issue

gh-16692

What does this implement/fix?

gh-16692 used Cholesky decomposition to avoid the inversion of the covariance matrix in gaussian_kde.pdf.
We wanted to wait for gh-15493 to finish before implementing that change in gaussian_kde.logpdf.
Now that gh-15493 is done, this makes the change to gaussian_kde.logpdf.

It also simplifies what we did in gh-16692. In that, we found a way to use Cholesky decomposition to compute $L^T x$, where $L L^T = \Sigma^{-1}$ (that is, $L$ is a Cholesky factor of the precision matrix / inverse covariance matrix).
Ultimately, we don't need $L^T x$; it was just one way to get to $x^T \Sigma^{-1} x$. There is a simpler way to get that while still avoiding the matrix inversion: $x^T \Sigma^{-1} x = y^T y$, where $C y = x$ and $CC^T = \Sigma$ (that is, $C$ is a Cholesky factor of the original covariance matrix rather than the precision matrix). The short calculation is shown here.

@steppi I think someone else can review this since it's simpler than gh-16692, but I thought you might find it interesting that those permutations weren't necessary to get the end result.

@mdhaber mdhaber added scipy.stats enhancement A new feature or improvement labels Sep 8, 2022
@mdhaber
Copy link
Contributor Author

mdhaber commented Sep 8, 2022

Failures appear to be unrelated.

@steppi
Copy link
Contributor

steppi commented Sep 16, 2022

@steppi I think someone else can review this since it's simpler than gh-16692, but I thought you might find it interesting that those permutations weren't necessary to get the end result.

Oh right, that makes perfect sense. If you start with the cholesky decomposition of the covariance matrix then you end up with something that isn’t quite a cholesky decomposition of the precision matrix because the left factor is upper triangular instead of lower. This factorization is still workable though because you can just call solve_triangular with lower=True in the end.

Copy link
Contributor

@steppi steppi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

@mdhaber
Copy link
Contributor Author

mdhaber commented Sep 16, 2022

Thanks @steppi.

@jjerphan does this look good to you from a Cython perspective? To avoid too many changes at once, I didn't release the GIL. That can be a separate PR, if desired.

@jjerphan
Copy link
Contributor

@jjerphan does this look good to you from a Cython perspective? To avoid too many changes at once, I didn't release the GIL. That can be a separate PR, if desired.

This LGTM from a Cython perspective -- I haven't looked at the maths, yet. Doing another PR for improving this implementation w.r.t Cython technicalities is sensible to me. 👍

Let me know if you need another review for this PR generally.

@mdhaber
Copy link
Contributor Author

mdhaber commented Sep 19, 2022

@steppi would you like @jjerphan to review the math, too, or are you comfortable merging?

@steppi
Copy link
Contributor

steppi commented Sep 19, 2022

@steppi would you like @jjerphan to review the math, too, or are you comfortable merging?

The math looks good to me. I’m comfortable with merging.

@steppi steppi merged commit b5a8052 into scipy:main Sep 20, 2022
@mdhaber mdhaber added this to the 1.10.0 milestone Nov 29, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement A new feature or improvement scipy.stats
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants