ENH: Add relativistic Breit-Wigner Distribution #17505
Conversation
Hi @steppi, this is looking good, but the docstring needs a pass. It is not the center-of-mass energy; M is just the mass of the particle, which is invariant, i.e. the same in all reference frames. We sometimes call it "invariant mass" to stress that, although strictly speaking that is a tautology. You may have learned the false concept of the "moving mass" or similar (not sure what it is called in English), which is taught in school and which is not invariant. When you study proper physics you have to unlearn that.
Thanks! Here, center-of-mass energy was supposed to refer to M. Yes, I learned about rest mass vs. relativistic mass. It's funny: this set of lecture notes I found when looking up the term center-of-mass energy discusses invariant mass and says that rest mass isn't a very useful concept, but I didn't take notice of that until you mentioned the same thing just now. Is center-of-mass energy an acceptable name for M?
Thanks for this @steppi! I took a quick look at documentation and tests. The main comment about tests is that generic tests that apply to all distributions are taken care of in test_continuous_basic. If there is a hole in those tests, please help fill it in a separate PR, but here, we could use tests that show that you are implementing the right distribution. Are there any published values you can find, or is there a reference implementation you can compare against? Could you test against a one-line implementation that you include right in the test function? This usually only makes sense if there is some simpler way to write the functions. If there isn't one, I could write implementations that you could use as independent checks. I'll probably be checking the math with SymPy in my next pass, so I could generate reference values from that.
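The kind of one-line reference implementation suggested here could look like the following sketch. It transcribes the textbook relativistic Breit-Wigner density in the $\rho = M/\Gamma$ parametrization; the helper name `rel_bw_pdf` and the written-out normalizing constant are my own assumptions, not code from this PR.

```python
import numpy as np
from scipy import integrate

def rel_bw_pdf(x, rho):
    # Hypothetical reference implementation of the relativistic
    # Breit-Wigner density with shape rho = M / Gamma and scale 1.
    k = (2 * np.sqrt(2) * rho**2 * np.sqrt(rho**2 + 1)
         / (np.pi * np.sqrt(rho**2 + rho * np.sqrt(rho**2 + 1))))
    return k / ((x**2 - rho**2)**2 + rho**2)

# Sanity check: the density should integrate to 1 over its support [0, inf).
# Splitting the integral at the peak (x = rho) keeps quad well-behaved.
rho = 36.5
total = (integrate.quad(rel_bw_pdf, 0, rho, args=(rho,))[0]
         + integrate.quad(rel_bw_pdf, rho, np.inf, args=(rho,))[0])
```

Values from such a helper could then be compared against the PR's implementation at a handful of points in the test.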
```python
        (96292.3076923077, 0.0013),  # Higgs Boson
    ]
)
def test_pdf_against_cdf(self, rho, gamma):
```
There is a PDF-CDF consistency check in test_continuous_basic.py, IIRC.
```python
def _stats(self, rho):
    # Returning None from stats makes public stats use _munp.
    return None, None, np.nan, np.nan
```
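To illustrate that convention, here is a hypothetical toy distribution (not part of the PR): returning `None` for the mean and variance from `_stats` makes the public `stats()` fall back to `_munp` for those moments.

```python
import numpy as np
from math import factorial
from scipy import stats

class ExponToy(stats.rv_continuous):
    # Toy standard-exponential distribution, used only to show the fallback.
    def _pdf(self, x):
        return np.exp(-x)

    def _munp(self, n):
        # Raw moments of the standard exponential: E[X**n] = n!
        return float(factorial(int(n)))

    def _stats(self):
        # None for mean/variance defers to _munp; nan for skew/kurtosis.
        return None, None, np.nan, np.nan

dist = ExponToy(a=0, name='expontoy')
mean, var = dist.stats(moments='mv')  # both computed via _munp
```

For the standard exponential both the mean and the variance come out as 1, confirming the `_munp` path was taken.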
I haven't checked, but do these need to be np.inf, or does np.inf need to be np.nan above? Or does it end up returning the same thing either way?
I had tried both and I think what happened is I ended up leaving nans because the results ended up being the same. I’ll check again though.
Ah, I still need to address this one. They should be inf, not nan: the raw and central moments of order greater than 2 are infinite, not undefined.
These still report nan.
There was a test failure when I changed them to np.inf, and I resorted to keeping them as np.nan instead of investigating further. I can look into it again if you like. I noticed that halfcauchy also has nan for these values, despite them actually being infinite, so maybe np.nan is reasonable?

```python
def _stats(self):
    return np.inf, np.inf, np.nan, np.nan
```
@mdhaber, for checking against known values, I wasn't able to find any. The formula for the pdf (except for the constant
Yes, I agree that the relationships between the functions make it unlikely for there to be a problem as long as the generic tests pass and the PDF is correct. And the PDF is quite easily checked manually, but I think it's good to add at least some automatic check of the PDF: something to permanently tie the suite of tests to the intended distribution. I will suggest something when I review again.
Thanks @mdhaber! I'm waiting for the docs to rebuild to check locally before pushing the improvements you suggested for those. I'll look into the
Ok, if it turns out that this is a fight against the distribution infrastructure, no need to agonize over it. We can leave these as NaNs and I can add it to my meta-issue, which I'll work on this summer.
@mdhaber, I updated the docs. The failing test involves check_skew_expect:

```python
def check_skew_expect(distfn, arg, m, v, s, msg):
    if np.isfinite(s):
        m3e = distfn.expect(lambda x: np.power(x-m, 3), arg)
        npt.assert_almost_equal(m3e, s * np.power(v, 1.5),
                                decimal=5, err_msg=msg + ' - skew')
    else:
        npt.assert_(np.isnan(s))
```
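To spell out why returning `np.inf` for the skew trips this check: `np.isfinite` is False for infinities, so the test takes the `else` branch, whose `np.isnan(s)` assertion then fails for `inf`. A standalone sketch of just that branch logic:

```python
import numpy as np

s = np.inf  # hypothetical skew value returned by _stats
# np.isfinite(inf) is False, so check_skew_expect takes the else branch...
branch_is_else = not np.isfinite(s)
# ...where npt.assert_(np.isnan(s)) would fail, since inf is not nan.
nan_check_passes = np.isnan(s)
```

So allowing `inf` here would require changing the test's `else` branch itself.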
I'm having trouble finding a reliable source that distinguishes between moments being undefined and infinite. Can you find an example, or maybe a software implementation that makes a distinction? I can see how the distinction could be useful, but I hesitate to make a change that would tread new ground when it comes to subtleties like this. If we can't find precedent, let's stick with NaN. If we can, you can change the test.
Let's stick with np.nan, then. Should I take out the
Edit: Nevermind. This only explains why these are. @mdhaber, wait,
Sure, changing all at once (separately) sounds good to me. @WarrenWeckesser, it looked like this was ready for you to merge. The whitespace issue you noticed has been fixed.
I'll take another look at this (and probably merge) on Sunday.
Great. I'll resolve the conflicts tonight. CI failure,
I added a question in-line about the distribution being skipped in cases_test_fit_mle in test_fit.py. I also found that if I run the stats tests with SCIPY_XSLOW enabled, i.e.

```shell
SCIPY_XSLOW=1 python dev.py test -m full -j 10 -s stats
```

I get this failure:
```
============================================================= FAILURES ==============================================================
________________________ TestNumericalInversePolynomial.test_basic_all_scipy_dists[rel_breitwigner-params91] ________________________
[gw3] linux -- Python 3.10.8 /home/warren/py3.10.8/bin/python3
scipy/stats/tests/test_sampling.py:832: in test_basic_all_scipy_dists
    check_cont_samples(rng, dist, [dist.mean(), dist.var()])
        dist     = <scipy.stats._distn_infrastructure.rv_continuous_frozen object at 0x7f0f15c3eb30>
        distname = 'rel_breitwigner'
        params   = (36.545206797050334,)
        rng      = <scipy.stats._unuran.unuran_wrapper.NumericalInversePolynomial object at 0x7f0f15ae5240>
        self     = <scipy.stats.tests.test_sampling.TestNumericalInversePolynomial object at 0x7f0f15fb9300>
        sup      = <numpy.testing._private.utils.suppress_warnings object at 0x7f0f15bd6f50>
scipy/stats/tests/test_sampling.py:300: in check_cont_samples
    assert_allclose(mv, mv_ex, rtol=1e-7, atol=1e-1)
        dist  = <scipy.stats._distn_infrastructure.rv_continuous_frozen object at 0x7f0f15c3eb30>
        mv    = (36.227478105538346, 20.51502968775722)
        mv_ex = [36.23714553411454, 22.921329819350603]
        rng   = <scipy.stats._unuran.unuran_wrapper.NumericalInversePolynomial object at 0x7f0f15ae5240>
        rvs   = array([36.30467119, 39.08801556, 36.94529237, ..., 36.81422772,
       36.51706143, 36.33370805])
/home/warren/py3.10.8/lib/python3.10/contextlib.py:79: in inner
    return func(*args, **kwds)
E   AssertionError:
E   Not equal to tolerance rtol=1e-07, atol=0.1
E
E   Mismatched elements: 1 / 2 (50%)
E   Max absolute difference: 2.40630013
E   Max relative difference: 0.10498083
E    x: array([36.227478, 20.51503 ])
E    y: array([36.237146, 22.92133 ])
        args = (<function assert_allclose.<locals>.compare at 0x7f0f15a52320>, array([36.22747811, 20.51502969]), array([36.23714553, 22.92132982]))
        func = <function assert_array_compare at 0x7f0fe26f3400>
        kwds = {'equal_nan': True, 'err_msg': '', 'header': 'Not equal to tolerance rtol=1e-07, atol=0.1', 'verbose': True}
        self = <contextlib._GeneratorContextManager object at 0x7f0fe2709060>
====================================================== short test summary info ======================================================
FAILED scipy/stats/tests/test_sampling.py::TestNumericalInversePolynomial::test_basic_all_scipy_dists[rel_breitwigner-params91] - ...
=============================== 1 failed, 9104 passed, 165 skipped, 133 xfailed in 198.86s (0:03:18) ================================
```
@tirthasheshpatel, could you take a look at this?
scipy/stats/tests/test_fit.py (outdated)

```diff
@@ -218,7 +220,8 @@ def test_nnlf_and_related_methods(dist, params):
 def cases_test_fit_mle():
     # These fail default test or hang
     skip_basic_fit = {'argus', 'foldnorm', 'truncpareto', 'truncweibull_min',
-                      'ksone', 'levy_stable', 'studentized_range', 'kstwo'}
+                      'ksone', 'levy_stable', 'studentized_range', 'kstwo',
+                      'rel_breitwigner'}
```
If I remove rel_breitwigner from this list, the test passes. Did you actually get a failure for this test?
Good find. It was failing originally but I didn’t retry after making the updates you’d suggested.
I'll resolve merge conflicts and mark rel_breitwigner xslow in the MLE and MSE tests.
I don't think the test failure in question indicates a problem with the distribution itself:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(23079058616272161)
dist = stats.rel_breitwigner(36.545206797050334)
rvs = dist.rvs(100000)  # wait a long time
print(rvs.mean(), dist.mean())  # 36.24488859010438 36.23714553411454
print(rvs.var(), dist.var())    # 41.712043010121754 22.921329819350603
```

The sample variance really doesn't match the variance returned by the distribution.

A histogram of the sample produced above matches the PDF pretty closely:

```python
import matplotlib.pyplot as plt

a, b = 30, 42.5
x = np.linspace(a, b, 200)
y = dist.pdf(x)
plt.hist(rvs, density=True, bins=np.linspace(a, b, 100))
plt.plot(x, y)
plt.title("RVS Histogram vs PDF")
plt.show()
```

But the sample variance is dominated by the few largest observations:

```python
rvs_sorted = np.sort(rvs)
np.var(rvs_sorted)        # 41.712043010121754
np.var(rvs_sorted[:-1])   # 29.482895293876066
np.var(rvs_sorted[:-2])   # 25.174836558478603
```

This could be throwing off the sample variance. The bootstrap CI is quite wide:

```python
stats.bootstrap((rvs,), np.var, method='percentile', batch=100).confidence_interval
# ConfidenceInterval(low=21.29429430508737, high=72.76858132195494)
```

@WarrenWeckesser With this in mind, do you expect this test to pass? I'm skipping it in the meantime. (More specifically, UNURAN samples probably have systematically lower sample variance than the distribution because they truncate the right tail, but overall, this probably tends to decrease the difference between sample and true variance.)
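This sensitivity is generic to heavy-tailed samples. As a quick illustration (my own example, not from the PR), draws from a Lomax/Pareto distribution with tail index 2.5 have finite variance but an infinite fourth moment, like the Breit-Wigner, and the largest order statistic dominates the sample variance:

```python
import numpy as np

rng = np.random.default_rng(0)
# Lomax(2.5): survival ~ x**-2.5, so variance is finite (index > 2)
# but the fourth moment is infinite (index < 4), as for the BW tail.
rvs = rng.pareto(2.5, size=100_000)

rvs_sorted = np.sort(rvs)
v_all = rvs_sorted.var()           # dominated by the largest few draws
v_trimmed = rvs_sorted[:-1].var()  # drop only the single largest value
```

Dropping just one point out of 100,000 visibly shrinks the sample variance, which is exactly the behavior seen in the trimmed `np.var` calls above.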
```diff
@@ -829,6 +836,8 @@ def test_basic_all_scipy_dists(self, distname, params):
         with suppress_warnings() as sup:
             sup.filter(RuntimeWarning)
             rng = NumericalInversePolynomial(dist, random_state=42)
+        if distname in skip_sample_moment_check:
+            return
         check_cont_samples(rng, dist, [dist.mean(), dist.var()])
```
While we're at it, this is probably preferable:

```diff
-        check_cont_samples(rng, dist, [dist.mean(), dist.var()])
+        check_cont_samples(rng, dist, [dist.mean(), dist.var(ddof=1)])
```
```python
very_slow_dists = ['anglit', 'gausshyper', 'kappa4',
                   'ksone', 'kstwo', 'levy_l',
                   'levy_stable', 'studentized_range',
                   'trapezoid', 'triang', 'vonmises']
```
I moved these from above because they are only used in test_basic_all_scipy_dists. I alphabetized them per the reviewer's preference.
Thanks @mdhaber for picking up the slack while I've been too busy to work on this. For the sample variance test: the distribution of the sample variance is fat-tailed with finite mean and infinite variance, so it's no surprise this can fail. The pdf decays like $x^{-4}$ for large $x$, so the fourth moment, which governs the variance of the sample variance, is infinite.
One final style adjustment. I'll commit the suggestion and merge.
* main: (51 commits)
  - ENH: stats.ttest_ind: add degrees of freedom and confidence interval (scipy#18210)
  - DOC fix link in release notes v1.7 (scipy#18258)
  - BLD: fix missing build dependency on cython signature .txt files
  - DOC: Clarify executable tutorials and MyST/Jupytext
  - ENH: stats: Add relativistic Breit-Wigner Distribution (scipy#17505)
  - MAINT: setup.sh as executable file
  - DOC: remove content related to `setup.py` usage from the docs (scipy#18245)
  - DOC: orthogonal_procrustes fix date of reference paper and DOI (scipy#18251)
  - BLD: implement version check for minimum Cython version
  - ci: touch up wheel build action
  - DOC: link to info about Gitpod.
  - ENH: ndimage.gaussian_filter: add `axes` keyword-only argument (scipy#18016)
  - BUG: sparse: Fix LIL full-matrix assignment (scipy#18211)
  - STY: Remove unnecessary semicolon
  - DOC: Pin docutils and filter additional sphinx warnings
  - BUG: vq.kmeans() compares signed diff to a threshold. (scipy#8727)
  - MAINT: stats: consistently return NumPy numbers (scipy#18217)
  - TST: stats.dunnett: fix seed and filter warnings in test_shapes
  - DOC: add info to use codespaces.
  - MAINT: add devcontainer configuration
  - ...
Reference issue
Closes gh-6414
What does this implement/fix?
This PR adds the Relativistic Breit-Wigner Distribution, a distribution used in high energy physics to model resonances, aka unstable particles.
Additional information
Distribution and moments
Typically, this distribution is parametrized in terms of the mass $M$ and decay width $\Gamma$ of the resonance of interest. We're using an alternative loc-scale parametrization given by @WarrenWeckesser in the discussion for gh-6414.
In the loc-scale parametrization, the pdf of the distribution is

$$f(x; \rho) = \frac{k}{(x^2 - \rho^2)^2 + \rho^2}$$

where

$$k = \frac{2\sqrt{2}\,\rho^2\sqrt{\rho^2 + 1}}{\pi\sqrt{\rho^2 + \rho\sqrt{\rho^2 + 1}}}.$$
The shape parameter $\rho$ is equal to $M/\Gamma$, the mass of the resonance divided by its decay width, and $\Gamma$ is the scale parameter.
There is a closed form for the CDF (now ignoring the scale parameter):

$$F(x; \rho) = \frac{k}{2i\rho}\left[\frac{1}{c}\arctan\frac{x}{c} - \frac{1}{\bar{c}}\arctan\frac{x}{\bar{c}}\right]$$

where $c = \sqrt{-\rho(\rho + i)}$ and $\bar{c}$ is its complex conjugate.
This can be found by using the factorization

$$(x^2 - \rho^2)^2 + \rho^2 = (x^2 + c^2)(x^2 + \bar{c}^2),$$

expanding the expression for the pdf with partial fractions, and making use of the well-known integral

$$\int \frac{dx}{x^2 + a^2} = \frac{1}{a}\arctan\frac{x}{a} + C.$$
There are also closed formulas for the first and second moments, which have been used. My son woke up, so I can't TeX these up now, but I can add them later. Higher moments don't exist.
Assumption for rho
I checked some actual $M$, $\Gamma$ values for resonances (these are included in the tests), and found that typically $\rho$ is never small and can be very large. I tried to implement things to work well for large $\rho$. That's why the formulas look a little different from what's stated above.
Fitting
I've overridden `rv_continuous`'s default `fit` method to better handle the case when `floc` is set. In this case we can take advantage of $\Gamma$ being approximately the interquartile range and $\rho\Gamma$ being approximately the median to get good starting values (shifting appropriately if `floc` is set to something other than zero).

Complex step derivatives

The complex step derivative test (gh-4979) fails. The results are close, but fail to meet the `1e-5` tolerance. I've added this distribution to the `fails_cmplx` list. Perhaps a tweak could improve the results enough to meet the `1e-5` tolerance.

Naming

Provisionally, I've gone with the name `relativistic_bw`, but I'm not committed to this if anyone has a better suggestion.

cc
@mdhaber, please take a look when you have time. @HDembinski, if time permits, let me know if the docstring looks OK. I took some graduate physics classes many years ago, but I'm not a physicist and may have written something that's off.
Since it's an enhancement, I'll make a post on the mailing list within a day or so.