
ENH: stats.ttest_ind: add degrees of freedom and confidence interval #18210

Merged
merged 9 commits into from
Apr 7, 2023

Conversation

@mdhaber (Contributor) commented Mar 29, 2023

Reference issue

Closes gh-15906
Followup of gh-16902, gh-16835

What does this implement/fix?

This adds a df attribute and confidence_interval method to the result object returned by ttest_ind.
It also wraps ttest_ind with the _axis_nan_policy decorator, adding support for masked arrays and axis tuples, cleaning up edge cases, etc.

Additional information

Adding df and confidence_interval for a permutation t-test is possible but not in scope. Theoretically, adding df is easy because it is the same as for a regular t-test, but because the calculation of df is buried deep within a different branch when doing a permutation t-test, returning it without recomputing it would require a lot of plumbing. Rather than burden the reviewer with additional complexity, I've documented the limitation. Adding a permutation test confidence interval is a valid feature request but best left up to the gh-18067 effort.

While writing this PR, I found two potential bugs. Investigating/fixing these is not in scope of this PR because they seem to be bugs in main (but I will open issues to track them once I get a better understanding of them):

  • Our equal_var=True trimmed t-test with unequal sample sizes does not seem to match the results of the R multicon library yuenContrast function. More specifically, the degrees of freedom added by this PR do match, but the statistic and p-value that we already report in main don't match. I've xfailed the relevant test. Update: I'm no longer concerned by this. See ENH: support trimming in ttest_ind #13696 (comment).
  • I tried to run ttest_ind's confidence_interval method (that is, a wrapper like lambda x, y, *args, **kwargs: stats.ttest_ind(x, y, *args, **kwargs).confidence_interval()) through the _axis_nan_policy suite of tests, but I got some failures with nan_policy='propagate'. I think this is a problem with TTestResult, which is already in main, not this PR, because I get the same sorts of failures with ttest_rel. Update: fixed by MAINT: stats.TTestResult: fix NaN bug in ttest confidence intervals #18222 and successfully added test here.

@mdhaber mdhaber added scipy.stats enhancement A new feature or improvement labels Mar 29, 2023
@mdhaber (Author) left a comment:

Some self-review to help the reviewer.

@@ -7118,50 +7118,8 @@ def ttest_ind_from_stats(mean1, std1, nobs1, mean2, std2, nobs2,
return Ttest_indResult(*res)


def _ttest_nans(a, b, axis, namedtuple_type):
mdhaber (Author):

This is no longer used, and the _axis_nan_policy approach makes it obsolete, so it can be removed. TBH, it looks like _broadcast_shapes_with_dropped_axis and _shape_with_dropped_axis can be removed, too.

Member:

I agree. Feel free to remove those.

mdhaber (Author):

I removed _broadcast_shapes and replaced it with np.broadcast_shapes, too. (np.broadcast_shapes was new in NumPy 1.20.0, so we didn't have that to work with.)
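For reference, np.broadcast_shapes operates directly on shape tuples, which is what makes it a drop-in replacement for the private helper:

```python
import numpy as np

# Available since NumPy 1.20: computes the broadcast result shape
# from shape tuples alone, without allocating any arrays.
print(np.broadcast_shapes((3, 1, 5), (4, 5)))  # (3, 4, 5)
```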

Comment on lines 7395 to 7411
a, b, axis = _chk2_asarray(a, b, axis)

# check both a and b
cna, npa = _contains_nan(a, nan_policy)
cnb, npb = _contains_nan(b, nan_policy)
contains_nan = cna or cnb
if npa == 'omit' or npb == 'omit':
nan_policy = 'omit'

if contains_nan and nan_policy == 'omit':
if permutations or trim != 0:
raise ValueError("nan-containing/masked inputs with "
"nan_policy='omit' are currently not "
"supported by permutation tests or "
"trimmed tests.")
a = ma.masked_invalid(a)
b = ma.masked_invalid(b)
return mstats_basic.ttest_ind(a, b, axis, equal_var, alternative)

if a.size == 0 or b.size == 0:
return _ttest_nans(a, b, axis, Ttest_indResult)
mdhaber (Author):

This is no longer needed due to the addition of the _axis_nan_policy decorator.

Comment on lines +7417 to +7418
# when nan_policy='omit', `df` can be different for different axis-slices
df = np.broadcast_to(df, t.shape)[()]
mdhaber (Author):

Copied from ttest_rel. If it was needed there, it is needed here.
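The copied idiom broadcasts df up to the statistic's shape, and the trailing [()] converts a 0-d array back to a scalar (a minimal illustration of the NumPy pattern, not the PR's exact code):

```python
import numpy as np

t = np.zeros((2, 3))        # statistic for a 2-d batch of axis-slices
df = np.float64(5.0)        # scalar degrees of freedom
df_b = np.broadcast_to(df, t.shape)[()]
print(df_b.shape)           # (2, 3): one df per axis-slice

# In the 0-d case, [()] unwraps the 0-d array into a plain scalar.
df0 = np.broadcast_to(df, ())[()]
print(type(df0))            # numpy.float64, not a 0-d ndarray
```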

@@ -4804,7 +4804,7 @@ def test_ttest_ind_permutation_check_inputs(self):
stats.ttest_ind(self.a2, self.b2, permutations=1.5)
with assert_raises(ValueError, match="'hello' cannot be used"):
stats.ttest_ind(self.a, self.b, permutations=1,
random_state='hello')
random_state='hello', axis=1)
mdhaber (Author):

The default axis=0 is invalid for self.a and self.b, but the errors were raised in a different order before, so we didn't notice.

# 0.3873582, 0.35187468, 0.21731811)
# yuen.t.test(a, b, tr=0, conf.level = 0.9, alternative = 'l')
#
# equal_var=True reference values computed with R multicon yuenContrast:
mdhaber (Author):

Oops, forgot to include this:

Suggested change
# equal_var=True reference values computed with R multicon yuenContrast:
# equal_var=True reference values computed with R multicon yuenContrast:
#
# library(multicon)
# options(digits=16)
# a <- c(0.88236329, 0.97318744, 0.4549262, 0.97893335, 0.0606677,
# 0.44013366, 0.55806018, 0.40151434, 0.14453315, 0.25860601,
# 0.20202162)
# b <- c(0.93455277, 0.42680603, 0.49751939, 0.14152846, 0.711435,
# 0.77669667, 0.20507578, 0.78702772, 0.94691855, 0.32464958,
# 0.3873582, 0.35187468, 0.21731811)
# dv = c(a, b)
# iv = c(rep('a', length(a)), rep('b', length(b)))
# yuenContrast(dv~iv, EQVAR = FALSE, alternative = 'unequal', tr = 0.2)

mdhaber (Author):

Done, even though GitHub isn't recognizing this suggestion as outdated.

assert_allclose(res.statistic, statistic)
assert_allclose(res.df, df)
assert_allclose(res.pvalue, pvalue)
if not equal_var:
mdhaber (Author):

Suggested change
if not equal_var:
if not equal_var: # CI not available when `equal_var is True`

Comment on lines +5041 to +5042
r = np.empty(shape=(3, 2, 2, 5))
r[0, 0, 0] = [-0.2314607, 19.894435, 0.8193209, -0.247220294, 0.188729943]
mdhaber (Author):

I didn't set out thinking that I would populate the array this way. r started out as a dictionary, and I went through a lot of iterations to figure out how to store the data compactly within the PEP8 line limit. If the reviewer would prefer for this to be refactored, please make a specific suggestion, but I would prefer to do so under the following constraints:

  • One line per case
  • Include sufficient precision to use default assert_allclose tolerances.

One thought is that I could eliminate the statistic and p-value, since in theory those are already validated elsewhere. But had I done that, I would not have caught the equal_var discrepancy, so I think it's good to leave them in.

Member:

The formatting looks good to me, especially given the comment above about what each slice of the array is and how it's computed.

@@ -5023,6 +5020,64 @@ def test_trim_bounds_error(self, trim):
stats.ttest_ind([1, 2], [2, 1], trim=trim)


class Test_ttest_CI:
mdhaber (Author):

In retrospect, there was not really a need for this to be in a separate class. If the reviewer wants to see this elsewhere, let me know where.

Member:

Looking at other tests, it seems like the precedent is to create a new class for testing a new argument. So, I don't think we need to change this. Maybe later we can merge all these classes to avoid creating a new one every time something new is added.

@mdhaber (Author) commented Apr 3, 2023

CI failures are unrelated.

@tirthasheshpatel (Member) left a comment:

Thanks for the PR @mdhaber! Looks incredibly clean; I didn't find anything other than your own comments and suggestions. I also confirmed all the reference values in the tests with R. Feel free to add the commits you have suggested and we can merge!

@tirthasheshpatel (Member):

Ah, more of these:

[screenshot: additional unresolved review comments]

Let's just resolve them here.

(four review threads on scipy/stats/_stats_py.py, since resolved)
@tirthasheshpatel (Member) left a comment:

Looks good now! Thanks @mdhaber!

@tirthasheshpatel tirthasheshpatel merged commit 98c51a6 into scipy:main Apr 7, 2023
24 checks passed
doronbehar added a commit to doronbehar/scipy that referenced this pull request Apr 8, 2023
* main: (51 commits)
  ENH: stats.ttest_ind: add degrees of freedom and confidence interval (scipy#18210)
  DOC fix link in release notes v1.7 (scipy#18258)
  BLD: fix missing build dependency on cython signature .txt files
  DOC: Clarify executable tutorials and MyST/Jupytext
  ENH: stats: Add relativistic Breit-Wigner Distribution (scipy#17505)
  MAINT: setup.sh as executable file
  DOC: remove content related to `setup.py` usage from the docs (scipy#18245)
  DOC: orthogonal_procrustes fix date of reference paper and DOI (scipy#18251)
  BLD: implement version check for minimum Cython version
  ci: touch up wheel build action
  DOC: link to info about Gitpod.
  ENH: ndimage.gaussian_filter: add `axes` keyword-only argument (scipy#18016)
  BUG: sparse: Fix LIL full-matrix assignment (scipy#18211)
  STY: Remove unnecessary semicolon
  DOC: Pin docutils and filter additional sphinx warnings
  BUG: vq.kmeans() compares signed diff to a threshold. (scipy#8727)
  MAINT: stats: consistently return NumPy numbers (scipy#18217)
  TST: stats.dunnett: fix seed and filter warnings in test_shapes
  DOC: add info to use codespaces.
  MAINT: add devcontainer configuration
  ...
@mdhaber mdhaber added this to the 1.11.0 milestone May 27, 2023
@mdhaber mdhaber mentioned this pull request May 27, 2023
5 tasks
Labels
enhancement A new feature or improvement scipy.stats

Successfully merging this pull request may close these issues.

Missing degree of freedom parameter in return value from stats.ttest_ind
2 participants