Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move GHistIndex into DMatrix. #7064

Merged
merged 4 commits into from Jun 30, 2021
Merged

Conversation

trivialfis
Copy link
Member

@trivialfis trivialfis commented Jun 25, 2021

See #4354 for details. This PR moves GHistIndex, we also need to move column matrix in future PRs. Also related: #5143 , #6148

Few more bonus:

  • We can remove the is_gmat_initialized_ in hist builder.
  • We can implement external memory support if desired.
  • We can reuse it in approx tree method.

I will update with benchmarks later.

@SmirnovEgorRu @ShvetsKS

@trivialfis
Copy link
Member Author

trivialfis commented Jun 28, 2021

Here is a quick benchmark, I think there's no performance difference since I'm just moving the code around:

master PR
HIGGS 181.0964948719969 180.65688620665847
Covtype 70.7731420363416 70.98095073600416

The result is from running https://github.com/NVIDIA/gbm-bench 3 times for each dataset.

@ShvetsKS
Copy link
Contributor

@trivialfis sorry for long response.
I also remeasured benchmarks and didn't noticed any performance degradations.
Seems this PR simplifies the implementation of quantile cpu DMatrix :)

@trivialfis
Copy link
Member Author

Yeah, I'm planning categorical data support for CPU, hoping that I can do some cleanup first instead of piling new code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants