
Calculate base_score based on input labels. #8107

Merged
34 commits merged on Sep 20, 2022

Conversation

trivialfis
Member

@trivialfis trivialfis commented Jul 22, 2022

#4321 .

This PR calculates base_score from the labels for L1 regression and saves it to the output model. Other objectives will be handled in follow-up work.

  • Configure model parameter for base_score.
  • Add estimation function in objective.
  • Change base_score to an array. At the moment only one element is used; the change prepares for multi-class and multi-output once some legacy code in the binary model can be removed.

Multi-target and multi-class are not yet supported due to the binary model parameter.
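The estimation step amounts to choosing the constant prediction that minimizes the objective over the training labels; for L1 (absolute error) regression that constant is the median. A minimal NumPy sketch of the idea (the function name is illustrative, not the actual XGBoost API):

```python
import numpy as np

def estimate_base_score_l1(labels: np.ndarray) -> float:
    """For L1 loss, the constant c minimizing sum(|y - c|)
    is the median of the labels."""
    return float(np.median(labels))

print(estimate_base_score_l1(np.array([1.0, 2.0, 2.0, 10.0])))  # 2.0
```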

@trivialfis trivialfis marked this pull request as draft July 27, 2022 06:27
@trivialfis
Member Author

The old binary format and the gpu_id configuration are probably too difficult to work around in this PR.

@trivialfis trivialfis marked this pull request as ready for review August 23, 2022 09:01
@trivialfis
Member Author

> The old binary format and the gpu_id configuration are probably too difficult to work around in this PR.

I worked around it by limiting base_score to a single scalar for now.

// - model loaded from new binary or JSON.
// - model is created from scratch.
// - model is configured second time due to change of parameter
CHECK(obj_);
Member

This configuration is very fragile.

Member Author

I agree. That's why I really want to remove the old model format.

Resolved review threads: src/learner.cc (outdated), src/objective/objective.cc
@trivialfis trivialfis marked this pull request as draft September 13, 2022 21:02
@trivialfis trivialfis marked this pull request as ready for review September 14, 2022 09:49
@trivialfis
Copy link
Member Author

I removed the use of NaN as the base-score flag to avoid breaking changes in downstream libraries (like treelite). Instead, a new base_score_estimated variable is introduced; since it is not read during model load, its encoding does not need to stay stable.
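The intent can be sketched as follows: only base_score is part of the persisted model, while base_score_estimated is transient bookkeeping that prevents re-estimation from clobbering a configured or loaded value. Class and method names here are hypothetical, not the actual XGBoost internals:

```python
import json

class ModelConfig:
    """Hypothetical sketch of the base_score_estimated semantics."""

    def __init__(self, base_score: float = 0.5):
        self.base_score = base_score
        # Transient flag: records whether base_score was already
        # estimated, so re-configuration does not overwrite it.
        self.base_score_estimated = False

    def maybe_estimate(self, labels):
        if not self.base_score_estimated:
            # Median of the labels as the L1 estimate.
            self.base_score = float(sorted(labels)[len(labels) // 2])
            self.base_score_estimated = True

    def save(self) -> str:
        # Only base_score belongs to the stable model format.
        return json.dumps({"base_score": self.base_score})

    @classmethod
    def load(cls, payload: str) -> "ModelConfig":
        cfg = cls(json.loads(payload)["base_score"])
        cfg.base_score_estimated = True  # never re-estimate a loaded model
        return cfg
```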

// average base score across all valid workers
rabit::Allreduce<rabit::op::Sum>(out.Values().data(), out.Values().size());
std::transform(linalg::cbegin(out), linalg::cend(out), linalg::begin(out),
[world](float v) { return v / world; });
Member

Better than it was before. I wonder whether it can be made more robust with weighted averaging; the MSE version will also need a weighted average. Small example:
Worker 0 labels: 0 0 0
Worker 1 labels: 1000
True median: 0 (mean abs error: 250)
Estimated median, current method: 500 (mean abs error: 500)
Estimated median, weighted average: 250 (mean abs error: 375)
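The numbers above can be reproduced with a short sketch (plain NumPy on the pooled labels, not the distributed code path):

```python
import numpy as np

w0 = np.array([0.0, 0.0, 0.0])   # worker 0 labels
w1 = np.array([1000.0])          # worker 1 labels
all_labels = np.concatenate([w0, w1])

true_median = float(np.median(all_labels))                 # 0.0
simple_avg = (np.median(w0) + np.median(w1)) / 2           # 500.0
weighted_avg = (np.median(w0) * len(w0)
                + np.median(w1) * len(w1)) / len(all_labels)  # 250.0

def mae(pred):
    """Mean absolute error of a constant prediction."""
    return float(np.mean(np.abs(all_labels - pred)))

print(mae(true_median))   # 250.0
print(mae(simple_avg))    # 500.0
print(mae(weighted_avg))  # 375.0
```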

Member Author

Thank you for the suggestion; I changed it to the weighted average and adapted your example into a Python test.

@trivialfis trivialfis merged commit fffb1fc into dmlc:master Sep 20, 2022
@trivialfis trivialfis deleted the init-estimation branch September 20, 2022 12:53