Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: improve precision of folded normal distribution cdf #18207

Merged
merged 2 commits into from
Mar 28, 2023

Conversation

dschmitz89
Copy link
Contributor

@dschmitz89 dschmitz89 commented Mar 28, 2023

Reference issue

No directly related issue.

What does this implement/fix?

Changes the implementation of foldnorm CDF using the formulation from Wikipedia to avoid subtracting 1. This improves precision and range in the left tail.

Additional information

Comparison of main branch, PR and mpmath

image

from scipy import stats
from mpmath import mp
import numpy as np
import matplotlib.pyplot as plt
from scipy import special as sc

mp.dps = 50

@np.vectorize
def foldnorm_cdf_mpmath(x, c):
    x = mp.mpf(x)
    c = mp.mpf(c)
    return mp.mpf(0.5) * (mp.erf((x - c)/mp.sqrt(2)) + mp.erf((x + c)/mp.sqrt(2)))

a, b = 1e-8, 1e100
c = 1e-8
x = np.logspace(-50, 10, 10000)
plt.loglog(x, stats.foldnorm.cdf(x, c), label="main", ls='dotted')
plt.loglog(x, 0.5 * (sc.erf((x - c)/np.sqrt(2)) + sc.erf((x + c)/np.sqrt(2))), label="PR", ls='dashdot')
plt.loglog(x, np.array(foldnorm_cdf_mpmath(x, c), np.float64), label="mpmath", ls='dashed')
plt.title(f"Foldnorm CDF: $c={c}$")
plt.legend()
plt.show()

Relative error

The relative error is very low for $>c$. Below it also increases with the new formulation but is still smaller than the main branch.

image

image

c = 1e-4
x = np.logspace(-20, 3, 1000)
mp.dps = 100

ref = np.array(foldnorm_cdf_mpmath(x, c), np.float64)

plt.loglog(x, np.abs((stats.foldnorm.cdf(x, c) - ref)/ref), label="main", alpha=0.5)
plt.loglog(x, np.abs((0.5 * (sc.erf((x - c)/np.sqrt(2)) + sc.erf((x + c)/np.sqrt(2))) - ref)/ref), label="PR", alpha=0.5)
plt.axvline(c, 0, 1, c='k', ls='dashed', label="$c$")
plt.legend()
plt.title(f"Foldnormal CDF relative error: $c={c}$")
plt.show()

@j-bowhay j-bowhay added scipy.stats enhancement A new feature or improvement labels Mar 28, 2023
@mdhaber mdhaber merged commit 02fb953 into scipy:main Mar 28, 2023
# reference values were computed with mpmath with 50 digits of precision
# from mpmath import mp
# mp.dps = 50
# mp.mpf(0.5) * (mp.erf((x - c)/mp.sqrt(2)) + mp.erf((x + c)/mp.sqrt(2)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could (in some cases) be important to ensure that x and c are mpf objects before doing arithmetic between them, but it doesn't happen to affect the results here.

@j-bowhay j-bowhay added this to the 1.11.0 milestone Mar 28, 2023
@dschmitz89 dschmitz89 deleted the foldednormal_cdf branch March 29, 2023 20:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement A new feature or improvement scipy.stats
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants