ENH: stats.ttest_ind: add degrees of freedom and confidence interval #18210
Conversation
Some self-review to help the reviewer.
@@ -7118,50 +7118,8 @@ def ttest_ind_from_stats(mean1, std1, nobs1, mean2, std2, nobs2,
     return Ttest_indResult(*res)


 def _ttest_nans(a, b, axis, namedtuple_type):
This is no longer used, and the `_axis_nan_policy` approach makes it obsolete, so it can be removed. TBH, it looks like `_broadcast_shapes_with_dropped_axis` and `_shape_with_dropped_axis` can be removed, too.
I agree. Feel free to remove those.
I removed `_broadcast_shapes` and replaced it with `np.broadcast_shapes`, too. (`np.broadcast_shapes` was new in NumPy 1.20.0, so we didn't have that to work with.)
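For illustration, a quick sketch of how `np.broadcast_shapes` behaves (this example is mine, not part of the diff):

```python
import numpy as np

# np.broadcast_shapes (NumPy >= 1.20.0) computes the shape that
# broadcasting the given shapes together would produce, without
# allocating any arrays.
shape = np.broadcast_shapes((3, 1), (1, 4))
print(shape)  # (3, 4)

# Incompatible shapes raise ValueError, matching real broadcasting:
try:
    np.broadcast_shapes((3,), (4,))
    compatible = True
except ValueError:
    compatible = False
print(compatible)  # False
```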
scipy/stats/_stats_py.py
Outdated
a, b, axis = _chk2_asarray(a, b, axis)

# check both a and b
cna, npa = _contains_nan(a, nan_policy)
cnb, npb = _contains_nan(b, nan_policy)
contains_nan = cna or cnb
if npa == 'omit' or npb == 'omit':
    nan_policy = 'omit'

if contains_nan and nan_policy == 'omit':
    if permutations or trim != 0:
        raise ValueError("nan-containing/masked inputs with "
                         "nan_policy='omit' are currently not "
                         "supported by permutation tests or "
                         "trimmed tests.")
    a = ma.masked_invalid(a)
    b = ma.masked_invalid(b)
    return mstats_basic.ttest_ind(a, b, axis, equal_var, alternative)

if a.size == 0 or b.size == 0:
    return _ttest_nans(a, b, axis, Ttest_indResult)
This is no longer needed due to the addition of the `_axis_nan_policy` decorator.
# when nan_policy='omit', `df` can be different for different axis-slices
df = np.broadcast_to(df, t.shape)[()]
Copied from `ttest_rel`. If it was needed there, it is needed here.
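To illustrate what that line does (my own sketch, not code from the PR): `np.broadcast_to` expands `df` to the shape of the statistic, and the trailing `[()]` unwraps a 0-d array into a scalar while leaving higher-dimensional results untouched.

```python
import numpy as np

# n-d case: a scalar df is expanded to match the statistic's shape.
t = np.array([1.0, 2.0, 3.0])
df = np.broadcast_to(5.0, t.shape)[()]   # [()] is a no-op for ndim >= 1
print(df.shape)  # (3,)

# 0-d case: [()] extracts the scalar, so callers get a plain number
# instead of a read-only 0-d array.
t0 = np.float64(2.0)
df0 = np.broadcast_to(5.0, t0.shape)[()]
print(isinstance(df0, np.ndarray))  # False
```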
@@ -4804,7 +4804,7 @@ def test_ttest_ind_permutation_check_inputs(self):
             stats.ttest_ind(self.a2, self.b2, permutations=1.5)
         with assert_raises(ValueError, match="'hello' cannot be used"):
             stats.ttest_ind(self.a, self.b, permutations=1,
-                            random_state='hello')
+                            random_state='hello', axis=1)
The default `axis=0` is invalid for `self.a` and `self.b`, but the errors were raised in a different order before, so we didn't notice.
#            0.3873582, 0.35187468, 0.21731811)
# yuen.t.test(a, b, tr=0, conf.level = 0.9, alternative = 'l')
#
# equal_var=True reference values computed with R multicon yuenContrast:
Oops, forgot to include this:
# equal_var=True reference values computed with R multicon yuenContrast:
#
# library(multicon)
# options(digits=16)
# a <- c(0.88236329, 0.97318744, 0.4549262, 0.97893335, 0.0606677,
#        0.44013366, 0.55806018, 0.40151434, 0.14453315, 0.25860601,
#        0.20202162)
# b <- c(0.93455277, 0.42680603, 0.49751939, 0.14152846, 0.711435,
#        0.77669667, 0.20507578, 0.78702772, 0.94691855, 0.32464958,
#        0.3873582, 0.35187468, 0.21731811)
# dv = c(a, b)
# iv = c(rep('a', length(a)), rep('b', length(b)))
# yuenContrast(dv~iv, EQVAR = FALSE, alternative = 'unequal', tr = 0.2)
Done, even though it's not being recognized as outdated.
scipy/stats/tests/test_stats.py
Outdated
assert_allclose(res.statistic, statistic)
assert_allclose(res.df, df)
assert_allclose(res.pvalue, pvalue)
if not equal_var:
if not equal_var:  # CI not available when `equal_var is True`
r = np.empty(shape=(3, 2, 2, 5))
r[0, 0, 0] = [-0.2314607, 19.894435, 0.8193209, -0.247220294, 0.188729943]
I didn't set out thinking that I would populate the array this way. `r` started out as a dictionary, and I went through a lot of iterations to figure out how to store the data compactly within the PEP 8 line limit. If the reviewer would prefer for this to be refactored, please make a specific suggestion, but I would prefer to do so under the following constraints:
- One line per case
- Include sufficient precision to use default `assert_allclose` tolerances

One thought is that I could eliminate the statistic and p-value, since in theory those are already validated elsewhere. But had I done that, I would not have caught the `equal_var` discrepancy, so I think it's good to leave them in.
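For context on the precision constraint (my own illustration, not from the diff): `numpy.testing.assert_allclose` defaults to `rtol=1e-7`, `atol=0`, so stored reference values need roughly eight significant digits to pass without loosening tolerances.

```python
from numpy.testing import assert_allclose

# ~8 matching significant digits: passes at the default rtol=1e-7.
assert_allclose(0.81932090, 0.8193209)

# Only ~4 significant digits: relative error ~2.6e-5 exceeds rtol=1e-7.
try:
    assert_allclose(0.8193, 0.8193209)
    coarse_value_passed = True
except AssertionError:
    coarse_value_passed = False
print(coarse_value_passed)  # False
```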
The formatting looks good to me, especially given the comment above about what each slice of the array is and how it's computed.
@@ -5023,6 +5020,64 @@ def test_trim_bounds_error(self, trim):
             stats.ttest_ind([1, 2], [2, 1], trim=trim)


 class Test_ttest_CI:
In retrospect, there was not really a need for this to be in a separate class. If the reviewer wants to see this elsewhere, let me know where.
Looking at other tests, it seems like the precedent is to create a new class for testing a new argument, so I don't think we need to change this. Maybe later we can merge all these classes to avoid creating a new one every time something new is added.
CI failures are unrelated.
Thanks for the PR @mdhaber! Looks incredibly clean, didn't find anything else other than your comments and suggestions. I also confirmed all the reference values in the test with R. Feel free to add the commits you have suggested and we can merge!
Looks good now! Thanks @mdhaber!
* main: (51 commits)
  ENH: stats.ttest_ind: add degrees of freedom and confidence interval (scipy#18210)
  DOC fix link in release notes v1.7 (scipy#18258)
  BLD: fix missing build dependency on cython signature .txt files
  DOC: Clarify executable tutorials and MyST/Jupytext
  ENH: stats: Add relativistic Breit-Wigner Distribution (scipy#17505)
  MAINT: setup.sh as executable file
  DOC: remove content related to `setup.py` usage from the docs (scipy#18245)
  DOC: orthogonal_procrustes fix date of reference paper and DOI (scipy#18251)
  BLD: implement version check for minimum Cython version
  ci: touch up wheel build action
  DOC: link to info about Gitpod.
  ENH: ndimage.gaussian_filter: add `axes` keyword-only argument (scipy#18016)
  BUG: sparse: Fix LIL full-matrix assignment (scipy#18211)
  STY: Remove unnecessary semicolon
  DOC: Pin docutils and filter additional sphinx warnings
  BUG: vq.kmeans() compares signed diff to a threshold. (scipy#8727)
  MAINT: stats: consistently return NumPy numbers (scipy#18217)
  TST: stats.dunnett: fix seed and filter warnings in test_shapes
  DOC: add info to use codespaces.
  MAINT: add devcontainer configuration
  ...
Reference issue
Closes gh-15906
Follow-up of gh-16902, gh-16835
What does this implement/fix?
This adds a `df` attribute and a `confidence_interval` method to the result object returned by `ttest_ind`. It also wraps `ttest_ind` with the `_axis_nan_policy` decorator, adding support for masked arrays and axis tuples, cleaning up edge cases, etc.

Additional information
Adding `df` and `confidence_interval` for a permutation t-test is possible but not in scope. Theoretically, adding `df` is easy because it is the same as for a regular t-test, but because the calculation of `df` is buried deep within a different branch when doing a permutation t-test, returning it without recomputing it would require a lot of plumbing. Rather than burden the reviewer with additional complexity, I've documented the limitation. Adding a permutation test confidence interval is a valid feature request but best left up to the gh-18067 effort.

While writing this PR, I found two potential bugs. Investigating/fixing these is not in scope of this PR because they seem to be bugs in main (but I will open issues to track them once I get a better understanding of them):

- Our `equal_var=True` trimmed t-test with unequal sample sizes does not seem to match the results of the R `multicon` library's `yuenContrast` function. More specifically, the degrees of freedom added by this PR do match, but the statistic and p-value that we already report in main don't match. I've xfailed the relevant test. (Update: I'm no longer concerned by this. See ENH: support trimming in ttest_ind #13696 (comment).)
- I tried to run `ttest_ind`'s `confidence_interval` method (that is, a wrapper like `lambda x, y, *args, **kwargs: stats.ttest_ind(x, y, *args, **kwargs).confidence_interval()`) through the `_axis_nan_policy` suite of tests, but I got some failures with `nan_policy='propagate'`. I think this is a problem with `TTestResult`, which is already in `main`, not this PR, because I get the same sorts of failures with `ttest_rel`. (Update: fixed by MAINT: stats.TTestResult: fix NaN bug in ttest confidence intervals #18222, and I successfully added a test here.)