Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expand categorical node. #6028

Merged
merged 12 commits into from Aug 25, 2020
Merged

Expand categorical node. #6028

merged 12 commits into from Aug 25, 2020

Conversation

trivialfis
Copy link
Member

This PR includes expanding tree on categorical feature, saving/loading such trees. It's partly extracted from #5949 , with changes that now the saved tree model contains actual categories instead of bitsets.

@trivialfis
Copy link
Member Author

The test is quite limited as there's no way to build categorical splits right now except for manually expanding nodes. So I will leave better tests for later PRs.

@codecov-commenter
Copy link

codecov-commenter commented Aug 18, 2020

Codecov Report

Merging #6028 into master will not change coverage.
The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #6028   +/-   ##
=======================================
  Coverage   79.10%   79.10%           
=======================================
  Files          12       12           
  Lines        3044     3044           
=======================================
  Hits         2408     2408           
  Misses        636      636           
Impacted Files Coverage Δ
python-package/xgboost/sklearn.py 91.44% <ø> (ø)
python-package/xgboost/core.py 78.07% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 29b7fea...096e8f7. Read the comment docs.

Copy link
Collaborator

@hcho3 hcho3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Some minor comments.

R-package/tests/testthat/test_callbacks.R Show resolved Hide resolved
src/common/bitfield.h Show resolved Hide resolved
src/tree/tree_model.cc Show resolved Hide resolved
Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
@trivialfis trivialfis merged commit 20c95be into dmlc:master Aug 25, 2020
@trivialfis trivialfis deleted the expand-cat branch August 25, 2020 10:54
@trivialfis
Copy link
Member Author

This PR brought significant impact on performance as it adds 1 extra vector for each node. I will work on another PR to restore the performance as I can't revert it cleanly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants