Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MAINT: stats.monte_carlo_test: used biased estimate of p-value #16721

Merged
merged 1 commit into from Jul 28, 2022
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
13 changes: 10 additions & 3 deletions scipy/stats/_resampling.py
Expand Up @@ -674,6 +674,13 @@ def monte_carlo_test(sample, rvs, statistic, *, vectorized=None,
null_distribution : ndarray
The values of the test statistic generated under the null hypothesis.

References
----------

.. [1] B. Phipson and G. K. Smyth. "Permutation P-values Should Never Be
Zero: Calculating Exact P-values When Permutations Are Randomly Drawn."
Statistical Applications in Genetics and Molecular Biology 9.1 (2010).

Examples
--------

Expand Down Expand Up @@ -709,7 +716,7 @@ def monte_carlo_test(sample, rvs, statistic, *, vectorized=None,
>>> print(res.statistic)
0.12457412450240658
>>> print(res.pvalue)
0.701070107010701
0.7012

The probability of obtaining a test statistic less than or equal to the
observed value under the null hypothesis is ~70%. This is greater than
Expand Down Expand Up @@ -762,12 +769,12 @@ def monte_carlo_test(sample, rvs, statistic, *, vectorized=None,

def less(null_distribution, observed):
cmps = null_distribution <= observed
pvalues = cmps.sum(axis=0) / n_resamples
pvalues = (cmps.sum(axis=0) + 1) / (n_resamples + 1) # see [1]
return pvalues

def greater(null_distribution, observed):
cmps = null_distribution >= observed
pvalues = cmps.sum(axis=0) / n_resamples
pvalues = (cmps.sum(axis=0) + 1) / (n_resamples + 1) # see [1]
return pvalues

def two_sided(null_distribution, observed):
Expand Down
9 changes: 9 additions & 0 deletions scipy/stats/tests/test_resampling.py
Expand Up @@ -763,6 +763,15 @@ def statistic1d(x):
assert_allclose(res.statistic, expected_stat)
assert_allclose(res.pvalue, expected_p, atol=2*self.atol)

def test_p_never_zero(self):
# Use biased estimate of p-value to ensure that p-value is never zero
# per monte_carlo_test reference [1]
rng = np.random.default_rng(2190176673029737545)
x = np.zeros(100)
res = monte_carlo_test(x, rng.random, np.mean,
vectorized=True, alternative='less')
assert res.pvalue == 0.0001


class TestPermutationTest:

Expand Down