Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: add cumulative_simpson integration to scipy.integrate #18151

Merged
merged 51 commits into from
Dec 6, 2023

Conversation

nprav
Copy link
Contributor

@nprav nprav commented Mar 15, 2023

Reference issue

N/A

What does this implement/fix?

The cumulative integration and differentiation of numerical samples is a common operation in the engineering domains.
Scipy currently only provides the integrate.cumulative_trapezoid function, but this is insufficiently accurate for engineering purposes. The integrate.simpson method is available but only as a complete integral, not a cumulative integral. This PR adds an implementation for the cumulative integration of an array of samples using the Simpson's 1/3 rule (wiki link).

CAE softwares such as ANSYS/LS-DYNA provide simpson's integration as a default processing option. Adding this to Scipy.integrate will help expand usage in the engineering domains.

Additional information

Refer to this jupyter notebook: Cumulative Simpson derivation for further information.

@j-bowhay j-bowhay added enhancement A new feature or improvement scipy.integrate needs-decision Items that need further discussion before they are merged or closed labels Mar 15, 2023
Copy link
Member

@tupui tupui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR @nprav!

Before we start any review, we need to get the approval for inclusion in SciPy. The process involves sending an email to the dev mailing list. Could you do that?

https://mail.python.org/mailman3/lists/scipy-dev.python.org/

@nprav
Copy link
Contributor Author

nprav commented Mar 15, 2023

Hi @tupui , I have emailed scipy-dev-owner@python.org and scipy-dev@python.org.
My email to scipy-dev@python.org got rejected since I still has a subscription request pending to be a list member.
Thanks!

@tupui
Copy link
Member

tupui commented Mar 15, 2023

@nprav thank you. The first time you send an email to the list, an admin needs to validate. Once this is done your email should go through.

@tupui
Copy link
Member

tupui commented Mar 23, 2023

@MatteoRaso Thank you for starting a review but please wait until there is a decision from a maintainer. We don't want people to spend time on something that could in the end be rejected.

EDIT: we now got an approval 😃

@tupui tupui removed the needs-decision Items that need further discussion before they are merged or closed label Mar 23, 2023
Copy link
Contributor

@lorentzenchr lorentzenchr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some review comments. IMHO, it is already looking quite good and a very useful addition to scipy.

scipy/integrate/_quadrature.py Outdated Show resolved Hide resolved
scipy/integrate/_quadrature.py Outdated Show resolved Hide resolved
scipy/integrate/_quadrature.py Outdated Show resolved Hide resolved
scipy/integrate/_quadrature.py Outdated Show resolved Hide resolved
scipy/integrate/_quadrature.py Outdated Show resolved Hide resolved
scipy/integrate/tests/test_quadrature.py Outdated Show resolved Hide resolved
@andyfaff
Copy link
Contributor

A comment from the sidelines. I would expect such a function to have the same functionality as the existing integrate.simpson function, and for its behaviour to eventually reduce to the existing function.

i.e. it should handle end points similarly, and the behaviour of overlapping keywords should be the same, etc.

@nprav
Copy link
Contributor Author

nprav commented Mar 27, 2023

Thank you for the comment @andyfaff ! This is an important discussion point because there are currently some differences in the function signatures. Please find the differences and justifications below and let me know what you think.

Current function signatures:

  • simpson(y, x=None, dx=1.0, axis=-1, even='avg')
  • cumulative_trapezoid(y, x=None, dx=1.0, axis=-1, initial=None)
  • cumulative_simpson(y, x=None, dx=1.0, axis=-1, initial=None)

Comparisons

Integrate.simpson(y, x=None, dx=1.0, axis=-1, even="avg")

  • y, x, dx, axis work the same way
  • even is not used in cumulative_simpson

The 'even' keyword relates to what happens if there are an odd number of intervals. Integrate.simpson will necessarily apply a trapezoidal rule for the odd interval. However, this approach violates the assumption of a quadratic relationship. In reality, the final interval can be approximated with a modified formula while maintaining the quadratic assumption. This is described in the wikipedia article.
image

It is also depicted with figures here: Cumulative Simpson Derivation.

Because of the above, I had implemented cumulative_simpson without the trapezoidal rule logic. This also ensures that when applied to samples with a quadratic relationship, cumulative_simpson will always give the exact solution. On the other hand, integrate.simpson will be inaccurate if there are an odd number of intervals.

I also compared accuracy when integrating oscillatory functions and found cumulative_simpson to be exhibit equal or better accuracy than integrate.simpson (above notebook link).

cumulative_trapezoid(y, x=None, dx=1.0, axis=-1, initial=None)

  • y, x, dx, axis work the same way.
  • Initial is slightly different
    • In cumulative_trapezoid, initial is simply prepended to the result array.
    • In cumulative_simpson, initial is prepended to the result array AND summed to every value in the result

The second approach is more 'correct' because the 'initial' value is analogous to a constant of integration that must be added to every point in the cumulative integral. Consider the following practical example:

  • Let y be sampled velocity data of an oscillating object. We wish to cumulatively integrate to attain the displacement of the object. We know the starting position (initial value) is 10m. Without the initial value input, the displacement data will show oscillations around 0m. With the initial value, the displacement should show oscillations around 10m.
  • But, with the cumulative_trapezoid implementation, the first value would be 10m, and the object would then be oscillating around 0m, which is incorrect. The summing approach would fix this.

@andyfaff
Copy link
Contributor

I don't have time to go through this PR myself (plus I'm not an expert in this area of the codebase). However, I should reiterate that behaviour for the proposed PR should tally exactly with existing behaviour for simpson and cumulative_trapezoid.

For example:

-Initial is slightly different
-In cumulative_trapezoid, initial is simply prepended to the result array.
-In cumulative_simpson, initial is prepended to the result array AND summed to every value in the result

The cumulative_simpson would not keep with existing behaviour of cumulative_trapezoid because it adds initial to every value in the result. This would create confusion with users because of the differing behaviour. See #125 for why cumulative_trapezoid has that behaviour.

Similarly I would expect the exact same behaviour for simpson + cumulative_simpson, down to the behaviour of the even keyword being included, which is not what you outlined here. Different behaviour leads to confusion.

However, I do see the benefit of using the Cartwright2017 approach, as you linked to on Wikipedia. It may be that there's benefit to changing the existing code in simpson to take that into account, but that would have to be discussed more thoroughly as it has to take into account back-compatibility.

@mdhaber
Copy link
Contributor

mdhaber commented Mar 28, 2023

Crossref gh-5618 and gh-9871. Please consider linking gh-5618.

@andyfaff
Copy link
Contributor

andyfaff commented Mar 28, 2023

@mdhaber I'ce prepped a fix for #5618 in https://github.com/andyfaff/scipy/tree/simpson. The mechanics are there but the API needs a bit of discussion. E.g.. how do we deal with back compat, do we ignore even kwd, add an extra option to even, what's that option called, is that option default, etc.

@mdhaber
Copy link
Contributor

mdhaber commented Mar 28, 2023

Yup, your recent email reminded me of those issues.

@nprav
Copy link
Contributor Author

nprav commented Dec 3, 2023

Thanks @nprav. I think we'll finish this before we branch. I just want to make sure it's all expressed in a way that can easily be maintained. It looks like you've been working on this for a long time and have gone through several reviewers, so I don't want to push your patience. If there are other things (like the input validation and corresponding tests) that are technically correct but can be expressed more clearly (it seems that you agreed that the changes were improvements), are you willing to help with that?

Yes @mdhaber, I'm willing to help with any specific comments.

Copy link
Contributor

@mdhaber mdhaber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here are some thoughts on the actual implementation to make the code easier to verify and improve performance.

scipy/integrate/_quadrature.py Outdated Show resolved Hide resolved
scipy/integrate/_quadrature.py Outdated Show resolved Hide resolved
@mdhaber
Copy link
Contributor

mdhaber commented Dec 5, 2023

FAILED scipy/integrate/tests/test_quadrature.py::TestCumulativeSimpson::test_simpson_exceptions[`axis=4` is not valid for `y` with `y.ndim=1`-kwarg_update4] - AttributeError: module 'numpy' has no attribute 'AxisError'

Not sure why this is failing on just one platform. Maybe it does need to be numpy.exceptions.AxisError, not just numpy.AxisError?

Maybe except IndexError (which it subclasses) if we can't use AxisError.

@mdhaber
Copy link
Contributor

mdhaber commented Dec 6, 2023

After #18151 (review) and nprav#2 are considered, I think this is good to go. Only merge them if you agree; LMK either way. Thanks for your patience @nprav!

nprav and others added 3 commits December 5, 2023 19:56
Co-authored-by: Matt Haberland <mhaberla@calpoly.edu>
TST: integrate.cumulative_simpson: simplify and strengthen tests
Copy link
Contributor

@mdhaber mdhaber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some details to improve, but let's get this in. Thanks @nprav!

@mdhaber mdhaber merged commit 7f397e0 into scipy:main Dec 6, 2023
28 of 29 checks passed
@nprav
Copy link
Contributor Author

nprav commented Dec 6, 2023

Thank you all for helping to add this to SciPy! @mdhaber @tupui @andyfaff @lucascolley @lorentzenchr @MatteoRaso @j-bowhay

@lucascolley
Copy link
Member

thanks @nprav and congratulations on your first contribution to SciPy! Keep them coming :)

Copy link
Contributor

@mdhaber mdhaber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nprav Here are some things to consider for follow-up work. If you are willing to do this soon, we can get it in the version of the function that is released in SciPy 1.12.

There are other things about documentation that I'd change in all the related function, but I've avoided commenting about those sorts of things here if they were just copied over.

scipy/integrate/_quadrature.py Show resolved Hide resolved
scipy/integrate/_quadrature.py Show resolved Hide resolved
scipy/integrate/_quadrature.py Show resolved Hide resolved
scipy/integrate/_quadrature.py Show resolved Hide resolved
scipy/integrate/_quadrature.py Show resolved Hide resolved
scipy/integrate/tests/test_quadrature.py Show resolved Hide resolved
@nprav
Copy link
Contributor Author

nprav commented Dec 10, 2023

@mdhaber , should i add the follow up changes in a separate PR?

@mdhaber
Copy link
Contributor

mdhaber commented Dec 10, 2023

Yes, thanks!

@nprav
Copy link
Contributor Author

nprav commented Dec 17, 2023

@mdhaber
Working on these comments in: https://github.com/nprav/scipy/tree/cumulative_simpson-clean-up
Will add a PR to the maintenance branch soon.

Regarding the case of "... elements of x identical"
If all elements of x are identical, it is easy: integral is 0.

But if only some elements are identical, it is tricky. Eg: x = [0, 0, 1, 1, 2, 2]
This causes ZeroDivisionErrors in the internal functions.
In Simpson, this is treated by just skipping the ZeroDivisionError points and just taking them as 0: https://github.com/scipy/scipy/blob/main/scipy/integrate/_quadrature.py#L560

We can do the same in cumulative_simpson, but the results may not match with simpson because cumulative_simpson explicitly calculates the intermediate points. More importantly, this does not make much mathematical sense. I think it should be up to the user to determine how to clean/handle their inputs. At the same time, I understand we want to keep the API consistent.

Do you have any thoughts on this?

@lucascolley
Copy link
Member

Will add a PR to the maintenance branch soon.

All PRs should be to main, then the release manager can backport.

@nprav
Copy link
Contributor Author

nprav commented Dec 19, 2023

@mdhaber
Added #19709 for the follow-up comments.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement A new feature or improvement scipy.integrate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants