GBM Prediction Interval #4210

Closed
AlexanderKUA opened this issue Feb 6, 2015 · 6 comments

Comments

@AlexanderKUA

I need to get a prediction interval for a GBM model (loss='ls'). I'm using this example as a basis:

http://scikit-learn.org/stable/auto_examples/ensemble/plot_gradient_boosting_quantile.html

My model parameters are chosen by GridSearch; I then train two more models with the same parameters for the lower and upper bounds.
But my prediction curve is sometimes outside the prediction interval, which looks odd. How can I solve this issue?

@amueller
Member

amueller commented Feb 6, 2015

ping @pprett

@raghavrv
Member

Could you post a minimal code example? If it's too long, use a gist.

@AlexanderKUA
Author

No, unfortunately. It's part of a complex system. I'll try to find an equivalent example.

@pprett
Member

pprett commented Feb 10, 2015

@AlexanderKUA you are saying that the prediction of your loss='ls' model is outside the prediction interval? There might be multiple reasons: a) the models are a poor approximation of the quantiles/mean, or b) your data is heavily skewed, in which case the mean might indeed lie outside the prediction interval. Can you show a histogram of your target variable?
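
To make reason (b) concrete, here is a minimal sketch using only the Python standard library. The data is made up for illustration (it is not the poster's data): 97 small values plus 3 values at 100, so it stays in the range [0, 100] but is heavily right-skewed. The 5%/95% levels are also an assumption.

```python
import statistics

# Hypothetical skewed target in [0, 100]: 97 values spread over [0, 1]
# plus 3 large values at the upper end of the range.
y = [i / 96 for i in range(97)] + [100.0, 100.0, 100.0]

mean = statistics.mean(y)

# quantiles(..., n=20) returns 19 cut points; the first and the last
# are the empirical 5% and 95% quantiles.
cuts = statistics.quantiles(y, n=20)
q05, q95 = cuts[0], cuts[-1]

# The few large values pull the mean above the 95% quantile.
mean_outside = not (q05 <= mean <= q95)
print(f"mean={mean:.3f}  5%={q05:.3f}  95%={q95:.3f}  outside={mean_outside}")
```

A least-squares model targets the (conditional) mean, while the two quantile models target the interval bounds, so with a target like this the mean prediction can sit outside the interval even if all three models fit well.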

@AlexanderKUA
Author

I made histograms of my target variable, along with examples of the corresponding prediction intervals (please note that the target is in the range [0, 100]).

case #2
[images: case_2_interval_truncated, case2]

case #3
[images: case_3_interval_truncated, case3]

case #6
[images: case_6_interval_truncated, case6]

As you can see, with case6 (the last one) everything looks OK, even though its target variable distribution is also skewed, as in the previous cases (case2, case3).

Thanks in advance

@lorentzenchr
Member

One possibility is to estimate quantiles conditional on X, implemented in #924. While this won't give true prediction intervals, since the estimation error is not accounted for, it at least gives some measure of uncertainty.
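
As a rough illustration of what "estimating a quantile" means, the sketch below minimizes the quantile (pinball) loss over a constant prediction, using only the standard library. The sample, the 0.25/0.75 levels, and the helper names `pinball_loss` and `best_constant` are assumptions for illustration; they are not scikit-learn APIs. A quantile regressor does the analogous minimization conditional on X.

```python
def pinball_loss(y_true, y_pred, alpha):
    """Mean quantile (pinball) loss for a quantile level alpha in (0, 1)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        # Under-predictions are weighted by alpha, over-predictions by 1 - alpha.
        total += alpha * diff if diff >= 0 else (alpha - 1.0) * diff
    return total / len(y_true)


# Hypothetical right-skewed sample (no X, so the "model" is a single constant).
y = [1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 5.0, 8.0, 20.0, 60.0]


def best_constant(alpha):
    # Search over the observed values; a minimizer of the pinball loss
    # can always be found at a data point.
    return min(y, key=lambda c: pinball_loss(y, [c] * len(y), alpha))


lower = best_constant(0.25)  # constant minimizing the loss at the 25% level
upper = best_constant(0.75)  # constant minimizing the loss at the 75% level
print(lower, upper)
```

The resulting interval describes the spread of the data only; as noted above, it says nothing about the error in estimating the quantiles themselves.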
