-
-
Notifications
You must be signed in to change notification settings - Fork 5.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: add CI and str to Dunnett's test #18150
Conversation
[skip circle]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we've discovered, testing with multiple problems is important when we're dealing with this noisy function and can't use tight tolerances. Please add a parameterized test over the four test cases we used for p-values.
Co-authored-by: Matt Haberland <mhaberla@calpoly.edu>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Matt, I will add the parametrisation 👍
I've done some investigation of alternatives, and using Even better is a variant of the noisy bisection algorithm given in that blog post using a secant instead of pure bisection. One of the real benefits of this algorithm is that it is self-adaptive to the noise level of function and will converge as soon as the random noise violates the monotonicity constraint. This is a sensible place to stop, so it doesn't need extra checking after the fact. If a typical use of There are a few things that we can bring from the Bornkamp paper even if we don't implement the local-linear regression root-finder. I do have an implementation of it, and it can lower the error a fair bit (rigorously), but it does so at the cost of additional function evaluations, at least for the convergence tolerances used in the paper and One thing we can bring over is the use of the The second thing is to use the bracketing interval Bornkamp derives. This helps place put us in the right region rather than having to search for one using the defaults. Specifically, if we are looking for I am happy to clean up my investigation notebook if there is interest. |
@rkern thank you for investigating. What I could do here is to use For the rest, I would suggest leaving it for a follow up if you are interesting in doing that. |
Ok if I am not mistaking, I addressed all suggestions. |
I would not care about this lint error. It's a false positive because we name this differently in the docstrings. But if you want I can add a
|
Yes, definitely would be interested in a PR! The PPF transform idea and use of analytical bounds with an existing bracketing root finder sound like very simple improvements to this. I'd be curious to see how much the tolerances in the tests can be tightened with that alone (and I understand that it would also improve speed). After that, noisy bisection/secant ideas also sound good. I've thought before that |
OK, added the test. Please correct the one-sided confidence interval direction and adjust the tests (I flipped them so that I could check that the rest of the test was correct). Then I'll come back and compare the math against the reference and check the pretty-printing. |
scipy/stats/_multicomp.py
Outdated
@@ -24,10 +27,167 @@ | |||
class DunnettResult: | |||
statistic: np.ndarray | |||
pvalue: np.ndarray | |||
_alternative: Literal['two-sided', 'less', 'greater'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These private attributes are showing up in __repr__
. I'd suggest hiding them as in EmpiricalDistributionFunction
'.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mmm are you sure? If we say __repr__
should allow to build back the object, then we would need (at least some of) the private attributes.
I don't have strong feelings on this one, I am not sure since we have a pretty __str__
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we say repr should allow to build back the object
I think of that as the ideal distinction between __repr__
and __str__
if there is to be one. I would prefer to break that rule than to expose private attributes, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fine by me. I personally very rarely use this behaviour to recreate objects. And I am not sure people would want to do that in this case. We can try what you propose and can re-evaluate if we get user requests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They can't re-create the object anyway because it's private. Er... doing so is not supported.
scipy/stats/_multicomp.py
Outdated
"\nOne-sided alternative (less): sample i's mean exceed " | ||
"the control's mean by an amount at least Lower CI" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure whether I'm a fan of this. R prints something like it, right? My concern is that it purports to interpret what a confidence interval is, but it doesn't really give the detail that is needed for a correct understanding. You could ask me to get over it, or we could wordsmith this to death, or we could just remove it : )
If we keep it, I would prefer "the difference between the mean of sample i
and the mean of the control is..." I think that the order of subtraction is unambiguous, but it is easier to parse a negative difference than a negative amount for "exceeds".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually used the wording from Dunnett's article. I am always struggling with what alternative means and thought it could help users to be "more confident" about what they are doing. I've seen something a bit similar in Stata (or something like that) where the output of a test is written in a plain text fashion. You could argue though that we are targeting more integrators than end users and this goes a bit beyond our scope.
I am fine with both removing or adjusting the messaging. Can you make a code suggestion (and you can push it as I am ok with both options here)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
His wording was fine, but you would then need to add a caveat like he does every time he makes statements like these: "The joint statement consisting of the ... conclusions has a confidence coefficient of 95%." Without this, it looks like a statement of 100% certainty. And I'm not sure if we want to write that caveat verbatim, so we would need to wordsmith that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Matt! I will make the change for the repr and CI in the doc.
scipy/stats/_multicomp.py
Outdated
@@ -24,10 +27,167 @@ | |||
class DunnettResult: | |||
statistic: np.ndarray | |||
pvalue: np.ndarray | |||
_alternative: Literal['two-sided', 'less', 'greater'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fine by me. I personally very rarely use this behaviour to recreate objects. And I am not sure people would want to do that in this case. We can try what you propose and can re-evaluate if we get user requests.
scipy/stats/_multicomp.py
Outdated
"\nOne-sided alternative (less): sample i's mean exceed " | ||
"the control's mean by an amount at least Lower CI" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually used the wording from Dunnett's article. I am always struggling with what alternative means and thought it could help users to be "more confident" about what they are doing. I've seen something a bit similar in Stata (or something like that) where the output of a test is written in a plain text fashion. You could argue though that we are targeting more integrators than end users and this goes a bit beyond our scope.
I am fine with both removing or adjusting the messaging. Can you make a code suggestion (and you can push it as I am ok with both options here)?
[skip cirrus] [skip azp]
[skip actions] [skip azp] [skip circle]
[skip actions] [skip azp] [skip circle]
[skip actions] [skip azp] [skip circle]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With a few adjustments, this looks good. I'll push an update that implements the suggestions here, and when CI is finished I can merge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @tupui. Please add documentation of |
Thanks Matt, I will make the follow up. For the optimization part, I would set a lower priority vs other topics since the current solution is reasonable. |
Re: #18150 (comment) I got this working (at least for @rkern any chance that
is supposed to be
? That's how I got a valid bracket in my (sloppy) tests 6449a15. Could you point out the part of the paper where this is derived? What tolerance did you use with the bracketing rootfinders? (If you linked to your notebook, it got buried.) |
For what I wrote, There's no formal derivation written out per se, but one can think through the two extreme cases. When all dimensions are completely correlated, the In the completely independent case, the |
Ah ok, thanks. Makes sense. I'll play with that a bit more. |
But it would still be great if you'd submit a PR with those few changes (PPF transform and use of analytical bounds with an existing bracketing root). |
Follow up of #18107
Adds confidence interval around the mean based on the "allowance" and also provide a cleaner
str
.