
ENH: interpolate: add RBFInterpolator #13595

Merged: 109 commits, May 24, 2021

Conversation

@treverhines (Contributor) commented Feb 22, 2021

Reference issue

Addresses issues/requests mentioned in #9904 and #5180

What does this implement/fix?

Two new classes have been added: RBFInterpolator and KNearestRBFInterpolator.

RBFInterpolator is a replacement for Rbf. The main differences are that 1) RBFInterpolator has usage that more closely follows the other scattered data interpolation classes, 2) RBFInterpolator includes polynomial terms in the interpolant, and 3) the signs of the smoothing parameter and of some RBFs are corrected, which addresses erroneous smoothing behavior like that reported in #4790 (comment).

KNearestRBFInterpolator performs RBF interpolation using only the k nearest observations to each interpolation point. This class can be a faster and more memory-efficient alternative to RBFInterpolator when there are many observations (>10,000).
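A minimal usage sketch of RBFInterpolator (the data and values here are illustrative, not from the PR):

    import numpy as np
    from scipy.interpolate import RBFInterpolator

    rng = np.random.default_rng(0)
    y = rng.uniform(-1, 1, (100, 2))   # observation points, shape (n, ndim)
    d = np.sum(y**2, axis=1)           # values observed at those points
    x = rng.uniform(-1, 1, (50, 2))    # evaluation points

    interp = RBFInterpolator(y, d)     # thin plate spline kernel by default
    values = interp(x)                 # interpolated values at x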

Additional information

The new classes deviate from the Rbf class in a few other ways that are worth noting:

  • The default RBF is the thin plate spline, whereas Rbf defaults to the multiquadric. I prefer using the thin plate spline over the multiquadric as a default because 1) it is relatively well-known, 2) the multiquadric is infinitely smooth, which can cause the interpolant to have oscillations similar to Runge's phenomenon, and 3) the thin plate spline is invariant to the shape parameter epsilon, meaning it should work well without any tuning.

  • epsilon must be specified if the chosen RBF is not scale invariant (i.e., when kernel is not "linear", "tps", "cubic", or "quintic"). My preference is to require the user to pick a shape parameter rather than default to a heuristic like using the average nearest neighbor distance.

  • There is no norm argument for the new classes; they only use the Euclidean norm. I made this choice because most of the literature on RBFs assumes a Euclidean norm, and I would prefer to limit the functionality of these classes to what is well understood.

  • epsilon scales the RBF input as r*epsilon rather than r/epsilon, which is consistent with the RBF literature (for example: https://www.sciencedirect.com/science/article/pii/S0898122107002210); see the sketch after this list.

  • The names of the RBFs differ from those used in Rbf. The new names are shorter and consistent with abbreviations used in the RBF literature (see the article above)
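
To illustrate the r*epsilon convention mentioned above, a minimal sketch (a hypothetical helper for illustration, not the PR's internal code):

    import numpy as np

    def gaussian_rbf(r, epsilon):
        # phi(r) = exp(-(r*epsilon)**2): a larger epsilon gives a narrower
        # kernel, the opposite of Rbf's r/epsilon convention.
        return np.exp(-(r * epsilon)**2)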

edit: corrected misspelling of multiquadric

Commits pushed during review (messages as shown in the timeline, some truncated):

  • … for the Rbf class and a class for RBF interpolation with the k nearest neighbors.
  • ENH: The monomials are now also scaled by epsilon, which should make the interpolant more numerically stable when the domain is very large/small.
  • STY: warnings and errors now have quotes around kernel names
  • ENH: Assert that interpolation points have the same number of dimensions as the observation points
  • ENH: inf can be given for `k` in `KNearestRBFInterpolator` to use all observations
  • … using RBFInterpolator with the k nearest observations
  • ENH: Do not warn about polynomial degrees less than -1 since all negative degrees behave the same
  • DOC: Added more to the description of `epsilon`
  • DOC: added that kernel can be callable in the documentation
  • …scaled differently to improve numerical stability
  • ENH: an error is raised in __init__ for KNearestRBFInterpolator if there are too few observations
  • MAINT: created a function to sanitize the arguments for RBFInterpolator and KNearestRBFInterpolator
@tylerjereddy added the labels "enhancement" (A new feature or improvement) and "scipy.interpolate" on Feb 22, 2021
@tylerjereddy (Contributor) left a comment

Certainly looks quite thorough on the testing side of things. I'm not a domain expert, though, so I just added some superficial comments about the tests. Maybe best to wait for another reviewer before making changes in any case.

Two review threads on scipy/interpolate/tests/test_rbfinterp.py (outdated, resolved)
@stefanv (Member) commented Feb 26, 2021

Thank you @treverhines, this is great! I am excited to try the nearest neighbor implementation.

Could you explain the chunking concept? Does this apply to higher dimensions as well, and how should these be chosen? Is that the type of guideline that should appear in the docstring?

  • DOC: Added description of default behavior
  • DOC: Replaced "interpolation points" with "evaluation points" in a comment
@rgommers (Member) commented

Is this ready from your point of view @treverhines?

@treverhines (Contributor, Author) commented

I think this branch is in a good state, and there is nothing more that I want to add.

It looks like the CI failed due to an issue checking the Pythran version:

   File "/home/runner/.local/lib/python3.10/site-packages/Cython/Compiler/Pythran.py", line 12, in <module>
    pythran_is_pre_0_9_6 = tuple(map(int, pythran.__version__.split('.')[0:3])) < (0, 9, 6)
ValueError: invalid literal for int() with base 10: '12dev'

@rgommers (Member) commented

It looks like the CI failed due to an issue checking the Pythran version

I already submitted a fix for that on the Pythran repo. 0.9.11 is released, so we can also go back to the non-dev version now.

@rgommers (Member) commented

I was getting errors like these on running the benchmarks in this branch at first:

               For parameters: 50, 10, 'inverse_quadratic'
               malloc(): invalid size (unsorted)
               
               For parameters: 50, 10, 'gaussian'
               double free or corruption (out)
               
               For parameters: 50, 100, 'cubic'
               malloc(): unaligned tcache chunk detected
               
               For parameters: 50, 100, 'quintic'
               munmap_chunk(): invalid pointer

before realizing it was due to using a Pythran version without the necessary fix in it. Since that fix was only just released, this is going to happen to others unless we add in a guard - I'll push a fix for that.

@rgommers changed the title from "ENH: added RBFInterpolator and KNearestRBFInterpolator" to "ENH: interpolate: add RBFInterpolator" on May 24, 2021
@rgommers (Member) left a comment

Went through it one more time; this all looks great. Let's get it in for 1.7.0 :)

Thanks @treverhines, this is a very nice new feature! And thanks @stefanv and @tupui for reviewing.

@Jwink3101 (Contributor) commented

This looks really exciting, even if it is different from what I did (and abandoned) in #11212. Just skimming the source code, is there a reason some of the kernel evaluations are not vectorized?

Maybe I am missing it, but can you specify a total power for the polynomial? So a 2 would be (in 2D) x + x**2 + y + y**2 + xy?

Final comment that I have raised before: there are analytical ways to get a leave-one-out error from the Gram matrix. They are super useful, and it is a shame that other tools (including things like Gaussian processes in scikit-learn) do not compute them. But that may be a different discussion.

@treverhines (Contributor, Author) commented May 24, 2021

Just skimming the source code, is there a reason some of the kernel evaluations are not vectorized?

The kernel functions are not vectorized because, AFAIK, they cannot both be vectorized and operate on the distance matrix in-place (Pythran does not support an out argument for the NumPy ufuncs). I wanted to build the matrix being solved in-place because memory requirements are a limitation for RBFInterpolator. Keep in mind that _rbfinterp_pythran.py is compiled with Pythran, so those scalar-valued kernel functions are much faster than if they were run as Python functions.
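
To illustrate the in-place idea (a simplified sketch, not the actual _rbfinterp_pythran.py code):

    def apply_kernel_inplace(mat, kernel):
        # Overwrite each entry of the distance matrix with kernel(r), so no
        # second O(N**2) array is allocated; with Pythran-compiled scalar
        # kernels the explicit loops are cheap.
        nrows, ncols = mat.shape
        for i in range(nrows):
            for j in range(ncols):
                mat[i, j] = kernel(mat[i, j])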

Maybe I am missing it, but can you specify a total power for the polynomial? So a 2 would be (in 2D) x + x**2 + y + y**2 + xy?

If you specify degree=2 and the input is 2-dimensional, the polynomial would consist of the monomial basis functions {1, x, y, xy, x**2, y**2}. I think you have the same behavior in your branch.
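
Those basis functions can be enumerated as the multi-indices with total degree at most `degree` (an illustrative snippet, not the PR's internal code):

    from itertools import product

    degree, ndim = 2, 2
    powers = [p for p in product(range(degree + 1), repeat=ndim)
              if sum(p) <= degree]
    # [(0, 0), (0, 1), (0, 2), (1, 0), (1, 1), (2, 0)]
    # i.e., the monomial basis {1, y, y**2, x, x*y, x**2}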

Final comment that I have raised before: there are analytical ways to get a leave-one-out error from the Gram matrix. They are super useful, and it is a shame that other tools (including things like Gaussian processes in scikit-learn) do not compute them. But that may be a different discussion.

I have thought a bit about adding generalized cross validation (and also generalized maximum likelihood) as a static method to this class. So the usage would look something like

    from scipy.optimize import minimize_scalar

    optimal_smoothing = minimize_scalar(
        lambda s: RBFInterpolator.gcv(x, y, smoothing=s)
    ).x
    interp = RBFInterpolator(x, y, smoothing=optimal_smoothing)

I think this would be useful, but I figured it would be best to save it for another PR.

@stefanv (Member) commented May 25, 2021

@treverhines Thank you for this fantastic contribution. This is a major step forward for RBFs in SciPy! I enjoyed reviewing your work, and loved seeing how this PR transformed during the review process.

@tupui (Member) commented May 25, 2021

Thanks @treverhines for this great PR!

I have thought a bit about adding generalized cross validation (and also generalized maximum likelihood) as a static method to this class. So the usage would look something like

    from scipy.optimize import minimize_scalar

    optimal_smoothing = minimize_scalar(
        lambda s: RBFInterpolator.gcv(x, y, smoothing=s)
    ).x
    interp = RBFInterpolator(x, y, smoothing=optimal_smoothing)

I think this would be useful, but I figured it would be best to save it for another PR.

This would be a good follow-up PR. I am using LOOCV to resample, and having a function to get this would be good.

@Jwink3101 (Contributor) commented

Just to be clear, the LOOCV I was referring to does not need to recompute anything or loop. It is based directly on the inverse of the Gram matrix.

Either way, it would be pretty cool to have.

@Jwink3101 (Contributor) commented

Just to further elaborate (now that I am at a computer), the formula I am talking about is 17.1 in G. E. Fasshauer, Meshfree Approximation Methods with MATLAB, volume 6 of Interdisciplinary Mathematical Sciences, World Scientific, 2007, where they reference S. Rippa, "An algorithm for selecting a good value for the parameter c in radial basis function interpolation," Advances in Computational Mathematics, 11(2-3):193–210, 1999.
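
For reference, that formula gives all the leave-one-out errors from a single inverse of the interpolation matrix. A minimal sketch, assuming the RBF system is A @ c = d (the helper name is illustrative):

    import numpy as np

    def rippa_loocv_errors(A, d):
        # e_i = c_i / (A^{-1})_{ii}, per Rippa (1999) / Fasshauer (2007),
        # eq. 17.1, where A is the Gram matrix and c the coefficient vector.
        A_inv = np.linalg.inv(A)
        c = A_inv @ d
        return c / np.diag(A_inv)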

@treverhines (Contributor, Author) commented

@Jwink3101 thanks for pointing that out! The algorithm from Rippa 1999 seems to be complementary to generalized cross validation. Both are O(N^3) methods for computing LOOCV error, but Rippa 1999 seems to be for the case when there is no smoothing (correct me if I am wrong here), and GCV can only be used with a non-zero smoothing parameter.

@Jwink3101 (Contributor) commented May 26, 2021

@treverhines

I would almost certainly need help with the mathematics but I think it can be shown that this works with smoothing parameters too.

To be fair, I actually originally thought of all of this with respect to Gaussian process regression (aka kriging), which, at least for the mean function, uses the same mathematics for some positive definite kernels (including the polynomial, though that is usually justified via the Best Linear Unbiased Predictor argument). Also, kriging usually has anisotropic kernels, but RBFs could too.

As such, another reference that covers this nicely is the following (specifically Section 2, eqs. 19-22):

J. D. Martin and T. W. Simpson. Use of Kriging Models to Approximate Deterministic Computer Models. AIAA journal, 43(4):853–863, 2005.

Anyway, the operation is on the Gram matrix (well, its inverse), so it shouldn't be affected (other than conditioning) by the inclusion of the error smoothing term, which only appears on the diagonal.

Now, whether this can be proven for conditionally positive definite kernels (aka splines), I don't know, but they don't mention that restriction in Fasshauer 2007.

A quick look (so I may have missed it) shows that Rippa and Fasshauer do not talk about the polynomial augmenting the RBF, but Martin 2005 (above) does include it in the unnumbered equation block below eq. (19).

So that is all to say, I think there are references to say it is sound. And I can tell you through brute-force checking that it gets it right to ~0.1% (as an aside, polynomial least-squares fitting has the same type of calculation, often called the PRESS, and brute-force checking that one agrees to machine precision).

And of course, brute-force confirmation does not a proof make. But those papers serve as references.

Both are O(N^3) methods for computing LOOCV error

This is true, but you can reuse some work depending on how you solve the original equations.

In my personal implementation of this (I don't recall if it is in my PR), I solve the system with a Cholesky decomposition. If you store that factorization, you can reuse it to compute the inverse. That only works for positive definite kernels, so for things like the TPS you would need to rebuild and invert the matrix (though I suspect someone better at linear algebra than me could figure out an efficient way to do it; I just don't want to directly compute the inverse for solving the original equations, and use scipy.linalg.solve instead).
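
For concreteness, a minimal sketch of that factor-reuse idea (assuming a positive definite Gram matrix A and right-hand side d; illustrative, not code from either PR):

    import numpy as np
    from scipy.linalg import cho_factor, cho_solve

    def solve_with_loocv(A, d):
        # Factor once; reuse the factorization for both the interpolation
        # coefficients and the diagonal of A^{-1} needed by the LOOCV formula.
        factor = cho_factor(A)
        c = cho_solve(factor, d)
        A_inv = cho_solve(factor, np.eye(A.shape[0]))
        return c, c / np.diag(A_inv)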

But my implementation was not designed with the memory footprint in mind the way yours is; it instead prioritized vector operations (since I didn't do any kind of Pythran optimization). So you may have to redo the inversions. Personally, the value in having the LOOCV is so great that I don't mind, though I don't compute it unless asked.

I hope this helps. Sorry I couldn't be more rigorous.

@treverhines (Contributor, Author) commented

@Jwink3101 I spent some time playing with the Rippa 1999 formula for LOOCV, and I can attest that it is accurately giving me the LOOCV error regardless of whether I use non-zero smoothing, CPD kernels, or augmenting polynomials. Thanks again for pointing this out.

I also want to correct something I said above: GCV is motivated by LOOCV, but it is not equivalent to LOOCV. So the equation from Rippa 1999 and GCV are two distinct objective functions.

@lbgr0 commented Aug 20, 2021

@treverhines really appreciate your work here!
Any reason why you did not include the option to calculate the derivatives directly when calling the interpolant (__call__), as you did in https://github.com/treverhines/RBF?

Do you recommend still using the code from your other project, or should I switch to the scipy implementation? How do I get the derivatives here?

Thanks!

@treverhines (Contributor, Author) commented

Any reason why you did not include the option to calculate the derivatives directly when calling the interpolant (__call__), as you did in https://github.com/treverhines/RBF?

In order to compute derivatives of the interpolant, you need a function to compute the derivatives of the kernel used for the interpolant. I use sympy in my RBF package to symbolically differentiate the kernels and then compile the symbolic expressions into numerical functions. It would be difficult to use sympy for this scipy implementation of RBF interpolation because 1) sympy is not currently a dependency of scipy, and 2) we have taken the route of optimizing the implementation with Pythran, which cannot use the functions generated by sympy. So if we want the ability to differentiate the interpolant, we would need to explicitly code up some reasonable number of kernel derivatives in Pythran (say, all first and second order derivatives for each kernel). It is doable, but more work than I am willing to commit to right now.
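
As an illustration of the sympy route described above (a generic sketch of the approach, not code from the RBF package):

    import sympy as sp

    r, eps = sp.symbols('r epsilon', positive=True)
    tps = (eps * r)**2 * sp.log(eps * r)        # thin plate spline kernel
    dtps = sp.diff(tps, r)                      # symbolic first derivative
    dtps_num = sp.lambdify((r, eps), dtps)      # compile to a numeric function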

Do you recommend still using the code from your other project, or should I switch to the scipy implementation? How do I get the derivatives here?

I would recommend continuing to use my RBF package if you want to compute analytical derivatives of the interpolant.
