
Adding unit test to cover ties/duplicate x values in Isotonic Regression... #4185

Conversation

@mjbommar (Contributor)

Unit test to highlight regression in issue #4184

@mjbommar (Contributor Author)

Small note: Travis failures are intended, as this is meant to cover the issue highlighted in #4184.

@amueller amueller added the Bug label Jan 30, 2015
@amueller amueller added this to the 0.16 milestone Jan 30, 2015
@amueller (Member)

I think we should maybe hard-code what the expected result is. In master, neither fit_transform nor fit followed by transform gives the correct one, right?

@mjbommar (Contributor Author)

Hi @amueller , yes, great suggestion. I used the examples from the R isotone package and the de Leeuw et al. JSS paper as a base, and have committed expanded unit tests based on them.

  1. fit_transform matches the results of gpava(ties="primary"); fit followed by transform does not (a small consistency sketch follows this list).
  2. gpava() lets the user choose a tie-handling method; should we dock to one of these options, e.g. "primary", until we decide whether to let the user specify it in sklearn as well? See p. 14 in the JSS paper linked above.
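A minimal sketch (with hypothetical data, not the JSS example) of the consistency property these tests exercise: on inputs with duplicate x values, fit_transform and fit followed by transform should agree, and on the versions discussed here the assertion below fails.

import numpy as np
from numpy.testing import assert_array_almost_equal
from sklearn.isotonic import IsotonicRegression

x = np.array([1, 2, 2, 2, 3, 4], dtype=float)   # ties at x == 2
y = np.array([1, 5, 3, 2, 4, 6], dtype=float)

ir = IsotonicRegression()
y_fit_transform = ir.fit_transform(x, y)
y_fit_then_transform = ir.fit(x, y).transform(x)

# The two code paths should produce the same fitted values on the training x.
assert_array_almost_equal(y_fit_transform, y_fit_then_transform)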


def test_isotonic_regression_ties_primary_fit_transform():
    """
    Test isotonic regression fit_transform against the "primary" ties method
Review comment (Member):

2 spaces after transform, but besides that LGTM if travis is happy :)

thanks @mjbommar

Reply (Contributor Author):

oops, fixed now, thanks.

travis was happy other than these 3 failures:

======================================================================
FAIL: sklearn.tests.test_isotonic.test_isotonic_regression_ties_min
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/data/workspace/scikit-learn/sklearn/tests/test_isotonic.py", line 92, in test_isotonic_regression_ties_min
    assert_array_equal(ir.fit(x, y).transform(x), ir.fit_transform(x, y))
  File "/usr/local/lib/python2.7/dist-packages/numpy/testing/utils.py", line 739, in assert_array_equal
    verbose=verbose, header='Arrays are not equal')
  File "/usr/local/lib/python2.7/dist-packages/numpy/testing/utils.py", line 665, in assert_array_compare
    raise AssertionError(msg)
AssertionError: 
Arrays are not equal

(mismatch 28.5714285714%)
 x: array([ 0.,  0.,  0.,  3.,  4.,  5.,  6.])
 y: array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.])

======================================================================
FAIL: sklearn.tests.test_isotonic.test_isotonic_regression_ties_max
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/data/workspace/scikit-learn/sklearn/tests/test_isotonic.py", line 103, in test_isotonic_regression_ties_max
    assert_array_equal(ir.fit(x, y).transform(x), ir.fit_transform(x, y))
  File "/usr/local/lib/python2.7/dist-packages/numpy/testing/utils.py", line 739, in assert_array_equal
    verbose=verbose, header='Arrays are not equal')
  File "/usr/local/lib/python2.7/dist-packages/numpy/testing/utils.py", line 665, in assert_array_compare
    raise AssertionError(msg)
AssertionError: 
Arrays are not equal

(mismatch 33.3333333333%)
 x: array([ 1.,  2.,  3.,  4.,  0.,  0.])
 y: array([ 1.,  2.,  3.,  4.,  5.,  6.])

======================================================================
FAIL: Test isotonic regression fit, transform against the "primary" ties method
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/nose/case.py", line 197, in runTest
    self.test(*self.arg)
  File "/data/workspace/scikit-learn/sklearn/tests/test_isotonic.py", line 134, in test_isotonic_regression_ties_primary_fit
    assert_array_equal(ir.transform(x), y_true)
  File "/usr/local/lib/python2.7/dist-packages/numpy/testing/utils.py", line 739, in assert_array_equal
    verbose=verbose, header='Arrays are not equal')
  File "/usr/local/lib/python2.7/dist-packages/numpy/testing/utils.py", line 665, in assert_array_compare
    raise AssertionError(msg)
AssertionError: 
Arrays are not equal

(mismatch 100.0%)
 x: array([ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.])
 y: array([ 21.   ,  22.375,  22.375,  22.375,  22.375,  22.375,  22.375,
        22.375,  22.375,  23.5  ,  25.   ])

----------------------------------------------------------------------
Ran 24 tests in 0.041s

FAILED (failures=3)

@agramfort (Member)

travis still complains :(

@mjbommar (Contributor Author) commented Feb 1, 2015

@agramfort , the purpose of these unit tests is to highlight a current issue that exists in 0.15.2 and 0.16-dev. The three failures that are occurring in travis are intended to fail :)

assert_array_equal(ir.transform(x), y_true)


def test_isotonic_regression_ties_primary_fit_transform():
Review comment (Member):

I would put the two in the same test, I think.

Reply (Contributor Author):

Done!

@amueller (Member) commented Feb 1, 2015

Maybe @NelleV has time to have a look. I'll have a look next week. Specifying the tie-breaking mechanism with as backward compatible a default as possible would be nice.
We might need to refactor a bit to fix #2507, too. (Currently multiple zero sample weights in a row give an infinite loop).

@agramfort (Member)

> @agramfort , the purpose of these unit tests is to highlight a current issue that exists in 0.15.2 and 0.16-dev. The three failures that are occurring in travis are intended to fail :)

arrfff ... sorry for the noise.

maybe @fabianp can have a look too.

@GaelVaroquaux (Member)

Our policy is not to commit failing tests to master without the fix. The reason is to always be able to keep master green. If master is not green, it has a detrimental psychological effect on quality.

@mjbommar (Contributor Author) commented Feb 1, 2015

@GaelVaroquaux , understood; just trying to help whoever picks up issue #4184.

@amueller (Member) commented Feb 3, 2015

I think we should add a ties= option and maybe just use a naive implementation of fit_predict, that is, just remove the method. That would also make masking zero-sample-weight points super easy.

@mjbommar (Contributor Author) commented Feb 3, 2015

@amueller, +1 from my perspective. In my experience, the "secondary" and "tertiary" options described in the JSS paper are more useful than "primary" given that "primary" does not necessarily produce bijective mappings.

Small chance I might be able to do this in the next week or two. Any opposition to reworking the cython source along these lines:
https://github.com/cran/isotone/blob/master/R/gpava.R#L36

@amueller (Member) commented Feb 3, 2015

I have no detailed knowledge and I think @NelleV or @agramfort might need to weigh in.

@agramfort (Member)

no opposition

@NelleV (Member) commented Feb 4, 2015

I think it is a good idea to have different ways to deal with ties. Go for it!

@mjbommar (Contributor Author) commented Feb 4, 2015

OK, here are some thoughts; want to make sure that my particular use cases are not leading me astray.

  1. From a syntax perspective, how do we distinguish between case A, the monotonic optimization of a specific sample (X, y), and case B, the use of a monotonic regression function constructed from that optimization result? For example, fit_transform need not concern itself with an interpolant, but transform on out-of-sample data does (and predict is simply a call to transform, but we have no fit_predict, so fit_transform and fit_predict may not return the same result; see point 2 below). A sketch after this list illustrates the distinction.
  2. For case B above (i.e., predict), the simplest thing to do is to construct a piece-wise constant interpolant from the optimization of the specific sample used in fit. However, we've been using "linear" or "slinear" calls to interp1d, resulting in a piece-wise linear interpolant; depending on what fit()'s X and y look like (i.e., what self.X_ and self.y_ are), these may give radically different or "broken" answers (as we are seeing for some tied values at the left or right endpoint).
  3. My gut instinct at this point is that we will "break" things further from a backward-compatibility perspective in order to "fix" these issues. How do we want to go about handling this? Another question might be: what is the regression? That fit_transform and fit -> transform are different, or that our syntax is not consistent with the concepts?
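A small sketch (hypothetical data, not a proposal for the final API) of the case A / case B distinction from point 1: case A is the in-sample optimization returned by fit_transform, with no interpolation involved; case B goes through the fitted interpolant, so the choice of piece-wise constant vs. linear interpolation only matters there.

import numpy as np
from sklearn.isotonic import IsotonicRegression

x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([1, 3, 2, 5, 4], dtype=float)

ir = IsotonicRegression()
y_insample = ir.fit_transform(x, y)          # case A: optimization of this sample only
x_new = np.array([1.5, 3.5], dtype=float)
y_out_of_sample = ir.predict(x_new)          # case B: evaluated via the interpolant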

@amueller amueller mentioned this pull request Feb 4, 2015
@mjbommar (Contributor Author)

@NelleV or @agramfort, if you have any thoughts about the question above, I would have some time to put towards a fix this week.

@agramfort (Member)

no bandwidth to look into it :(

@ogrisel (Member) commented Feb 26, 2015

I just ran those 3 tests against 0.15.2 and they also fail there: the first two tests (test_isotonic_regression_ties_min / max) yield the same failures, while the last one used to output NaNs, which has been fixed in scikit-learn master:

https://gist.github.com/ogrisel/676f7d582600036efa60

So it seems that the test_isotonic_regression_ties_min / max tests do not properly qualify the regression you initially reported, @mjbommar. Can you please confirm that you reproduce the same test failures on sklearn 0.15.2?

The last test, namely test_isotonic_regression_ties_primary_, highlights a bug in our code in my opinion: the output is all zeros on master instead of being in the range of 22-something... This is really weird.

@ogrisel (Member) commented Feb 26, 2015

Note that the fit_transform method returns the expected y_true. There is something wrong in our implementation of transform.

@amueller (Member)

Wait, when I last looked at it it was the other way around ... ?!

@amueller (Member)

Oh no, you are right, transform was the problem.

@amueller (Member)

I also want to highlight again issue #2507, which can cause infinite loops:
#2507 (comment)

@ogrisel (Member) commented Feb 26, 2015

> I also want to highlight again issue #2507, which can cause infinite loops:
> #2507 (comment)

I created a dedicated issue for this (#4297) as it does not seem to be the same bug as the original report in #2507.

@mjbommar (Contributor Author)

@ogrisel , the discussion here was unfortunately split between issue #4184 and this PR. You can see that we had confirmed the failures in 0.15.2 as well in #4184; in other words, this is a regression that goes back some ways.

While it's been a few weeks since I spent time thinking about it, I believe my comment here is the best synopsis of the ways forward: #4185 (comment)

@ogrisel (Member) commented Feb 26, 2015

Thanks.

> While it's been a few weeks since I spent time thinking about it, I believe my comment here is the best synopsis of the ways forward: #4185 (comment)

This does not explain the all-zeros output I get when running test_isotonic_regression_ties_primary_ with slinear, right? I still need to investigate further to understand.

@mjbommar (Contributor Author)

@ogrisel , agreed on 0's.

My line of inquiry led me to question what we meant by "expected" results in general. Our implementation is only a very narrow way of looking at the problem and may be too naive. These unit tests were meant to dock us to the corresponding R package released by the publication authors.

For example, why is "slinear" the default and not a piece-wise constant interpolant? Also, should transform and predict necessarily return the same result? I think we can agree that fit_transform and fit, transform should.
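For illustration only (hypothetical knot values, not a claim about what scikit-learn should default to), here is the difference between the two interpolant families being discussed, built directly with interp1d:

import numpy as np
from scipy.interpolate import interp1d

x_knots = np.array([8., 10., 12., 14.])
y_knots = np.array([21., 22.375, 22.375, 23.5])

f_step = interp1d(x_knots, y_knots, kind="zero")        # piece-wise constant (zeroth-order spline)
f_linear = interp1d(x_knots, y_knots, kind="slinear")   # piece-wise linear (first-order spline)

x_new = np.array([9., 11., 13.])
print(f_step(x_new))     # values held constant between knots
print(f_linear(x_new))   # values interpolated linearly between knots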

@amueller (Member)

Here is a notebook that illustrates everything that goes horribly wrong in interp1d:
http://nbviewer.ipython.org/gist/amueller/df9a8a7da67c1c4556cf

  1. when called at the exact input data, slinear gives zeros (looks like a scipy bug)
  2. on the lower tie, linear gives a NaN

I would be ok with having any valid tie-breaking strategy, but that doesn't seem to be reasonable.
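A minimal probe (independent of the notebook above, with hypothetical values) of the interp1d behaviour described in points 1 and 2; the exact outcome depends on the scipy version, and newer releases may reject non-unique x outright instead of returning odd values.

import numpy as np
from scipy.interpolate import interp1d

x = np.array([8., 8., 8., 10., 12., 14.])     # ties at the lower end
y = np.array([21., 22., 23., 22.375, 22.375, 23.5])

for kind in ("slinear", "linear"):
    try:
        f = interp1d(x, y, kind=kind, bounds_error=False)
        print(kind, f(x))                      # evaluate at the exact input data
    except ValueError as exc:                  # newer scipy may refuse duplicate x
        print(kind, "raised:", exc)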

@amueller (Member)

With just

x = [  8., 10.,  12.,  14.]
y = [ 21.   ,  22.375,  22.375,  23.5]

as input, everything looks good fyi.

@amueller (Member)

Proposed solution: report / ask scipy people, implement our own duplicate removal strategy.

@mjbommar I agree there is much room for different strategies. I think the sklearn developers don't have that many uses outside of calibration, where duplicate values are basically non-existent.
I also don't know about nearest vs linear interpolation. If you have insights, I think we are pretty open to improvements.
Actually, I am quite confused by the presence of both transform and predict.
I think this API is a bit weird, and I think the reason is that this is usually applied to the labels, not the data.

@amueller (Member)

@mjbommar did you already put any thought into how to implement the tie-breaking?

@mjbommar (Contributor Author)

@amueller , my perspective was that we should try to dock our approach to the publication authors'. While they implement three approaches in their CRAN package, we could simply pick one; some have a means of breaking ties in the input sample.

That said, my point above about the difference between transform and predict will remain. The sample optimization problem (i.e., fit_transform) is much simpler, as it doesn't involve a choice of interpolant, i.e., no need to use interp1d.

@amueller (Member)

I agree about using one of their approaches, I was more asking if you looked into implementing any of them. I'll have a closer look at the paper now...

@amueller (Member)

I think we need to implement the secondary approach. With the primary approach, we cannot get fit_transform to be the same as transform. The same is true for the tertiary approach, if I understand it correctly.

@amueller (Member)

Ugh, it looks like there is also an(other) issue with sample_weight: it does not seem to be reordered in https://github.com/scikit-learn/scikit-learn/blob/master/sklearn/isotonic.py#L253 ...

@amueller (Member)

@mjbommar what do you think about secondary tie-breaking as the default?

@amueller (Member)

Another question: What does it mean to predict on new data using the primary method? What would you predict for a point with ties?

@mjbommar (Contributor Author)

@amueller , I think the choice of tie-breaking is tied to whether we support predict or not. As you are seeing, if the resulting sequence is not one-to-one, then interpolants won't make sense.

Since the client uncovered this issue, my first approach had been to replace the interp1d interpolant with a left-hand piece-wise constant interpolant. This handles the ties OK. Even better from an application perspective was to take the mean/median of the tied values and substitute that into the original input (optionally updating the sample weights).

It sounds like we are seeing the same issues now :)
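A rough sketch of the "average the tied values" idea described above (collapse_ties is a hypothetical helper for illustration, not scikit-learn API and not the eventual implementation): collapse duplicate x values into a single point carrying the weighted mean of their y values and the sum of their sample weights.

import numpy as np

def collapse_ties(x, y, sample_weight=None):
    # Collapse duplicate x values: weighted mean of y, summed sample weights.
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.ones_like(y) if sample_weight is None else np.asarray(sample_weight, dtype=float)
    order = np.argsort(x, kind="mergesort")
    x, y, w = x[order], y[order], w[order]
    x_unique, inverse = np.unique(x, return_inverse=True)
    w_sum = np.bincount(inverse, weights=w)
    y_mean = np.bincount(inverse, weights=w * y) / w_sum
    return x_unique, y_mean, w_sum

x_u, y_u, w_u = collapse_ties([1., 2., 2., 3.], [1., 4., 2., 5.])
# x_u -> [1., 2., 3.], y_u -> [1., 3., 5.], w_u -> [1., 2., 1.]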

@amueller (Member)

Ok. The secondary strategy actually takes the mean and supports predict. I think that is a reasonable choice.

@mjbommar (Contributor Author)

@amueller , yes, perfect.

I would still be happy to expand the methods to support other interpolants and primary/tertiary from gpava, but perhaps we can push that discussion off until the regressions are resolved.

@amueller (Member)

I agree, it would be nice to have, and you are welcome to work on it.

@ogrisel (Member) commented Feb 28, 2015

@mjbommar please avoid merging master into the PR branch; instead, squash old uninformative commits to clean up the history and rebase on top of the current master.

@mjbommar (Contributor Author)

@ogrisel , sorry, but I did this since @amueller cherry-picked my commits for his fix PR #4302. I will close this PR for review and we can continue the conversation in the active PR, #4302.

@mjbommar mjbommar closed this Feb 28, 2015
@ogrisel (Member) commented Feb 28, 2015

Ok no pbm.
