Implement categorical prediction for CPU and GPU predict leaf. #7001

Merged: trivialfis merged 7 commits into dmlc:master from cat-cpu-predictor on Jun 11, 2021

trivialfis (Member) commented May 26, 2021

  • Implement categorical prediction for CPU prediction.
  • Implement categorical prediction for GPU predict leaf.
  • Refactor the prediction functions to have unified get next node.

Related: #6503.
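
To make the "unified get next node" idea concrete, here is a minimal sketch of one traversal step that handles missing values, categorical splits, and numerical splits in a single function. It is not the code from this PR: `ToyNode`, `CategoricalGoesLeft`, the 32-bit category set, and the convention that categories in the set go right are all assumptions chosen for illustration. Only the `GetLeafIndex<true, true>` call shape mirrors the diff.

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical toy node; this is not XGBoost's RegTree layout.
struct ToyNode {
  int left = -1;   // child indices; -1 marks a leaf
  int right = -1;
  int feature = 0;                  // index of the feature tested here
  float split_value = 0.0f;         // threshold for numerical splits
  bool default_left = true;         // direction taken for missing values
  bool is_categorical = false;
  std::uint32_t category_bits = 0;  // bit c set => category c is in the split set
  bool IsLeaf() const { return left == -1; }
};

// Assumed convention for this sketch: categories in the set go right.
// Values outside [0, 32), including NaN, are treated as not in the set.
inline bool CategoricalGoesLeft(std::uint32_t bits, float value) {
  if (!(value >= 0.0f && value < 32.0f)) {
    return true;
  }
  auto cat = static_cast<std::uint32_t>(value);
  return ((bits >> cat) & 1u) == 0u;
}

// One traversal step shared by every prediction path.
template <bool has_missing, bool has_categorical>
int GetNextNode(ToyNode const& node, float value) {
  if (has_missing && std::isnan(value)) {
    return node.default_left ? node.left : node.right;
  }
  if (has_categorical && node.is_categorical) {
    return CategoricalGoesLeft(node.category_bits, value) ? node.left : node.right;
  }
  return value < node.split_value ? node.left : node.right;
}

// Walk from the root (node 0) to a leaf and return the leaf's index.
template <bool has_missing, bool has_categorical>
int GetLeafIndex(std::vector<ToyNode> const& tree, std::vector<float> const& feats) {
  int nid = 0;
  while (!tree[nid].IsLeaf()) {
    ToyNode const& node = tree[nid];
    nid = GetNextNode<has_missing, has_categorical>(node, feats[node.feature]);
  }
  return nid;
}
```

Because every caller goes through the same `GetNextNode`, adding a new split type in this sketch only touches one function rather than each prediction routine.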

@trivialfis trivialfis mentioned this pull request May 26, 2021
67 tasks
codecov-commenter commented May 26, 2021

Codecov Report

Merging #7001 (21bb93d) into master (ee4f51a) will increase coverage by 0.00%.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master    #7001   +/-   ##
=======================================
  Coverage   81.71%   81.72%           
=======================================
  Files          13       13           
  Lines        3916     3917    +1     
=======================================
+ Hits         3200     3201    +1     
  Misses        716      716           
| Impacted Files | Coverage Δ |
| python-package/xgboost/dask.py | 81.35% <0.00%> (ø) |
| python-package/xgboost/sklearn.py | 89.51% <0.00%> (+0.02%) ⬆️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

- int tid = model.trees[j]->GetLeafIndex(feats);
+ auto const& tree = *model.trees[j];
+ auto const& cats = tree.GetCategoriesMatrix();
+ bst_node_t tid = GetLeafIndex<true, true>(tree, feats, cats);
Collaborator

Should we also add a check for has_categorical here? Or are we purposefully removing it as GetLeaf is not performance critical?

Contributor

Also, it's possible that feats has no missing values.

Member Author

Yes, I'm assuming it's not critical. Otherwise we would need a lot more specializations.

Member Author

This was the previous default, so I'm not slowing it down here. If optimization is needed (for example, if this turns out to be the computational bottleneck of some algorithms/models), we can come back to it in a separate PR that focuses on optimization.
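
The trade-off discussed in this thread can be sketched as follows. Each runtime flag that is lifted into a `bool` template parameter doubles the number of instantiations, and a dispatcher has to enumerate all of them. Always calling the most general `<true, true>` variant gives the same results and only pays for a couple of extra checks. The names `GoesLeft` and `GoesLeftDispatch`, and the toy categorical rule (a value equal to the split category goes right), are hypothetical and are not taken from XGBoost.

```cpp
#include <cmath>

// Hypothetical split decision, specialized at compile time on two flags.
template <bool has_missing, bool has_categorical>
bool GoesLeft(float value, float split, bool is_cat_node, bool default_left) {
  if (has_missing && std::isnan(value)) {
    return default_left;  // missing values follow the default direction
  }
  if (has_categorical && is_cat_node) {
    return value != split;  // toy rule: the matching category goes right
  }
  return value < split;  // numerical split
}

// Runtime dispatch: two flags already require four instantiations.
bool GoesLeftDispatch(bool has_missing, bool has_categorical, float value,
                      float split, bool is_cat_node, bool default_left) {
  if (has_missing) {
    return has_categorical
               ? GoesLeft<true, true>(value, split, is_cat_node, default_left)
               : GoesLeft<true, false>(value, split, is_cat_node, default_left);
  }
  return has_categorical
             ? GoesLeft<false, true>(value, split, is_cat_node, default_left)
             : GoesLeft<false, false>(value, split, is_cat_node, default_left);
}
```

In this sketch, `GoesLeft<true, true>` agrees with every specialized variant whenever the input has no missing values and the node is numerical, which is why using it unconditionally on a path that is not performance critical is a reasonable simplification.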

@trivialfis trivialfis requested a review from hcho3 June 9, 2021 17:48
@trivialfis trivialfis merged commit f79cc4a into dmlc:master Jun 11, 2021
@trivialfis trivialfis deleted the cat-cpu-predictor branch June 11, 2021 02:11
4 participants