
Initial support for multi-output regression. #7309

Closed
wants to merge 1 commit

Conversation

trivialfis
Member

@trivialfis trivialfis commented Oct 11, 2021

  • Implemented as 1 model per target.
  • Not available for classification yet.

I will follow up with R and dask implementations if this PR is approved.

Related: #2087 .

Todos:

  • Verify data types other than numpy array.
  • Have better specification for return shape.
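The "1 model per target" strategy can be illustrated outside XGBoost with plain numpy: fit an independent single-output model (here, ordinary least squares as a stand-in for a booster) on each target column, then stack the per-target predictions back into an `(n_samples, n_targets)` matrix. This is only a sketch of the strategy, not XGBoost's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
W_true = rng.normal(size=(4, 3))   # 3 targets
Y = X @ W_true                     # multi-output labels, shape (100, 3)

# One independent model per target column (least squares stands in for a booster).
coefs = [np.linalg.lstsq(X, Y[:, k], rcond=None)[0] for k in range(Y.shape[1])]

# Stack per-target predictions into an (n_samples, n_targets) matrix.
Y_hat = np.column_stack([X @ c for c in coefs])
print(Y_hat.shape)  # (100, 3)
```

Note that because each target is fit independently, this strategy ignores any correlation between targets; that is the trade-off of the per-target approach.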

@trivialfis
Member Author

Hmm... has datatable been updated?

@trivialfis trivialfis force-pushed the multi-target-trees branch 2 times, most recently from 4cac240 to c890e0f on October 11, 2021 at 20:35
@Craigacp
Contributor

That's exciting. We'd be interested in an XGBoost4J wrapper too, and I might be able to write one once you're happy with the native API. We currently fake this up by training a separate Booster for each output dimension.
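The workaround described here (a separate Booster per output dimension) is essentially an estimator-per-column wrapper. A minimal, dependency-free sketch of that pattern, using a trivial mean-predicting stand-in instead of a real Booster:

```python
import numpy as np

class MeanRegressor:
    """Trivial stand-in for a single-output booster: predicts the training mean."""
    def fit(self, X, y):
        self.mean_ = float(np.mean(y))
        return self

    def predict(self, X):
        return np.full(len(X), self.mean_)

class PerTargetWrapper:
    """Train one independent single-output model per target column."""
    def __init__(self, base_factory):
        self.base_factory = base_factory  # callable returning a fresh estimator

    def fit(self, X, Y):
        self.models_ = [self.base_factory().fit(X, Y[:, k]) for k in range(Y.shape[1])]
        return self

    def predict(self, X):
        # Stack each model's predictions into an (n_samples, n_targets) matrix.
        return np.column_stack([m.predict(X) for m in self.models_])

X = np.zeros((5, 2))
Y = np.array([[1.0, 10.0]] * 5)
wrapper = PerTargetWrapper(MeanRegressor).fit(X, Y)
print(wrapper.predict(X).shape)  # (5, 2); first row predicts 1.0 and 10.0
```

In practice the `base_factory` would construct a real single-output regressor; the wrapper logic stays the same.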

@trivialfis
Member Author

trivialfis commented Oct 12, 2021

Let me try to extend it to multi-class.

Update: I will implement it in a different PR. We need to extend the objective function and add more tests around prediction behavior. This will push XGBoost's tree grid into a cube and needs some refactoring, but the interface established in this PR (the num_target parameter) should stay the same.

@trivialfis
Member Author

trivialfis commented Oct 12, 2021

@Craigacp Thanks for the support. I will keep you posted on the progress. I think it's best to wait until we have a good solution for multi-class and other possible applications (like probabilistic forecasting). I will work on it in this release cycle.

@trivialfis
Member Author

Should I create a multi-output regressor and distinguish it from the normal regressor?

@trivialfis
Member Author

Need to have some utilities from #7331 first.

Commit:

  • Add multi-output regressor.
  • Change tests/demo.
  • Specialize multi-output reg.
  • Fixes.
@trivialfis
Member Author

Cross linking #7083 . Need to resolve the same issue with matrix meta info.

@trivialfis
Member Author

trivialfis commented Nov 2, 2021

TO-DO for initial support of multi-target model training.

Tests:

  • Boost from prediction.
  • Weighted metrics.
  • Accuracy.

@trivialfis trivialfis moved this from 1.6 TO DO to 1.6 In Progress in 2.0 Roadmap Nov 3, 2021
@trivialfis
Member Author

Most of the basic infrastructure is in place now. Classification needs more thought due to the potentially different number of classes per target.

@trivialfis
Member Author

The initial support is merged in #7514 .

@trivialfis
Member Author

trivialfis commented Dec 18, 2021

@Craigacp I think for multi-target classification models, xgboost needs a new interface and potentially lots of refactoring. I will focus on regression for now. Thank you for joining the discussion and feel free to test the new feature. ;-)

@aniruddhghatpande

@trivialfis Does this contain support for all multi-output regression? @Craigacp were you able to write the wrapper for XGBoost4J?

@Craigacp
Contributor

I haven't looked into it. I've had less time to work on XGBoost related things for the past couple of years.

3 participants