Reduced-rank regression #10796
Comments
I can share some Python code with you if you want.

I have code myself; I wanted to try my hand at working on sklearn with a simple project :)

Please do ;)
@kingjr I can't find my own function right now (I can look deeper if you need it), but it's easy to implement; see e.g. here: https://github.com/riscy/machine_learning_linear_models/blob/master/reduced_rank_regressor.py

It should be noted that this implementation just computes a low-rank approximation of the coefficient matrix. It's not the algorithm given in Mukherjee and Zhu: http://dept.stat.lsa.umich.edu/~jizhu/pubs/Mukherjee-SADM11.pdf
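For reference, that low-rank-approximation approach can be sketched roughly as follows. This is a minimal sketch of the idea, not the linked implementation verbatim; the function name and toy data are illustrative only.

```python
import numpy as np

def reduced_rank_regression(X, Y, rank):
    # Sketch of the "low-rank approximation of the coefficient matrix"
    # approach discussed above (not the Mukherjee & Zhu algorithm):
    # 1) fit ordinary least squares, 2) truncate the coefficient
    # matrix to the requested rank via SVD.
    B_ols, *_ = np.linalg.lstsq(X, Y, rcond=None)  # (n_features, n_targets)
    U, s, Vt = np.linalg.svd(B_ols, full_matrices=False)
    return U[:, :rank] @ np.diag(s[:rank]) @ Vt[:rank, :]

# Toy check: with noiseless data and the true rank, OLS plus
# truncation recovers the rank-deficient coefficient matrix.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5))
B_true = rng.standard_normal((5, 1)) @ rng.standard_normal((1, 4))  # rank 1
Y = X @ B_true
B_hat = reduced_rank_regression(X, Y, rank=1)
```

On noisy data the truncation step is where this estimator can fall short of the single-pass algorithm, which is the point raised later in the thread.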
Here's my attempt: |
I am surprised there is no official implementation in scikit-learn.
Seems this issue lost steam. I am working on a new feature branch for this now. |
@mdmelin I don't think the method you're implementing is going to find the optimal coefficients.
@krey Interesting, why do you say that? This implements the method described in Reinsel and Velu (1998), and it seems to be working as expected to me (when the rank of the regression equals the rank of Y, performance is perfect for non-noisy data). This method has been implemented in several other papers as well.
@mdmelin You could compare it to Mukherjee and Zhu (see my implementation above). Use noisy data and compare the MSE of the two estimators. My intuition is that you suffer estimation error when fitting the PCA to Y, and then again when fitting the regression. Whereas the other method does it in a single pass. |
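To make the comparison concrete, here is a sketch of the two-step "PCA on Y" estimator under discussion. Variable names, dimensions, and the noise level are illustrative assumptions, not taken from either implementation.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

# Two-step estimator: project Y onto its top principal components,
# regress the components on X, then map predictions back to Y-space.
# The concern raised above is that estimation error enters twice:
# once when fitting the PCA and once when fitting the regression.
rank = 2
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 6))
B_true = rng.standard_normal((6, rank)) @ rng.standard_normal((rank, 5))
Y = X @ B_true + 0.1 * rng.standard_normal((200, 5))

pca = PCA(n_components=rank).fit(Y)
Z = pca.transform(Y)                # reduced targets, shape (200, rank)
reg = LinearRegression().fit(X, Z)  # second estimation step
Y_hat = pca.inverse_transform(reg.predict(X))
mse = np.mean((Y_hat - Y) ** 2)
```

Running both this and the single-pass estimator on the same noisy draw and comparing MSEs would settle the question empirically.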
For the PCA based method, I wonder if you could do it with https://scikit-learn.org/stable/modules/generated/sklearn.compose.TransformedTargetRegressor.html |
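That could plausibly look like the following sketch (an assumption about how the pieces fit together, not a tested recipe; check_inverse=False is set because PCA's inverse transform is lossy and would otherwise trigger a warning):

```python
import numpy as np
from sklearn.compose import TransformedTargetRegressor
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression

# Rank-constrained regression via TransformedTargetRegressor: the PCA
# transformer reduces the targets before the regressor is fit, and its
# inverse_transform maps predictions back to the original target space.
rrr = TransformedTargetRegressor(
    regressor=LinearRegression(),
    transformer=PCA(n_components=2),
    check_inverse=False,  # PCA is lossy, so skip the round-trip check
)

# Toy data with exactly rank-2 targets
rng = np.random.default_rng(0)
X = rng.standard_normal((150, 6))
B_true = rng.standard_normal((6, 2)) @ rng.standard_normal((2, 4))
Y = X @ B_true

rrr.fit(X, Y)
Y_pred = rrr.predict(X)
```

This reproduces the two-step PCA-on-Y estimator inside the standard scikit-learn API, so it shares the same statistical caveat discussed above.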
Reduced-rank regression seems like a simple and fairly well-established technique. It could be implemented minimally by adding a rank={None, int} kwarg to Ridge with the svd solver.