Commit
This PR is meant to end the confusion around best_ntree_limit and unify model slicing. With multi-class models and random forests, asking users to understand how to set ntree_limit is difficult and error-prone.

* Implement the save_best option in early stopping.

Co-authored-by: Philip Hyunsu Cho <chohyu01@cs.washington.edu>
1 parent 29745c6, commit 2cc9662
Showing 19 changed files with 550 additions and 37 deletions.
@@ -0,0 +1,38 @@

#####
Model
#####

Slice tree model
----------------

When ``booster`` is set to ``gbtree`` or ``dart``, XGBoost builds a tree model, which is a
list of trees and can be sliced into multiple sub-models.

.. code-block:: python

    import xgboost as xgb
    from sklearn.datasets import make_classification

    num_classes = 3
    X, y = make_classification(n_samples=1000, n_informative=5,
                               n_classes=num_classes)
    dtrain = xgb.DMatrix(data=X, label=y)
    num_parallel_tree = 4
    num_boost_round = 16
    # The total number of built trees is num_parallel_tree * num_classes * num_boost_round.
    # We build a boosted random forest for classification here.
    booster = xgb.train({
        'num_parallel_tree': num_parallel_tree, 'subsample': 0.5, 'num_class': num_classes},
                        num_boost_round=num_boost_round, dtrain=dtrain)
    # The sliced model contains the forests from boosting rounds [3, 7).
    # A step is also supported, with some limitations: a negative step, for example, is invalid.
    sliced: xgb.Booster = booster[3:7]
    # Iterate over the booster to access individual tree layers.
    trees = [layer for layer in booster]
    assert len(trees) == num_boost_round

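The tree-count arithmetic in the comments above can be checked with a small toy model. This is only a sketch in plain Python, not XGBoost's implementation: ``ToyTreeModel`` and its fields are hypothetical names, and it assumes that each boosting round contributes one layer of ``num_parallel_tree * num_classes`` trees and that slicing selects whole layers.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ToyTreeModel:
    """Hypothetical stand-in for a tree model: a flat tuple of tree ids."""

    trees: Tuple[int, ...]
    trees_per_layer: int  # assumed to be num_parallel_tree * num_classes

    def num_layers(self) -> int:
        # One layer per boosting round.
        return len(self.trees) // self.trees_per_layer

    def __getitem__(self, rounds: slice) -> "ToyTreeModel":
        if not isinstance(rounds, slice):
            raise TypeError("this sketch only supports slice indexing")
        if rounds.step is not None and rounds.step <= 0:
            raise ValueError("step must be positive")
        picked = []
        # Select whole layers, then copy every tree of each selected layer.
        for layer in range(self.num_layers())[rounds]:
            begin = layer * self.trees_per_layer
            picked.extend(self.trees[begin:begin + self.trees_per_layer])
        return ToyTreeModel(tuple(picked), self.trees_per_layer)


num_parallel_tree, num_classes, num_boost_round = 4, 3, 16
trees_per_layer = num_parallel_tree * num_classes
model = ToyTreeModel(tuple(range(trees_per_layer * num_boost_round)), trees_per_layer)
sliced = model[3:7]
print(len(model.trees), sliced.num_layers(), len(sliced.trees))  # 192 4 48
```

Under these assumptions the full model holds 4 * 3 * 16 = 192 trees, and the ``[3:7]`` slice keeps 4 layers, or 48 trees, while the original tuple is left untouched.
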
The sliced model is a copy of the selected trees, which means the original model is not
modified by slicing. This feature is the basis of the ``save_best`` option in the early
stopping callback.
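To illustrate how slicing can support a save-best behaviour, here is a minimal sketch in plain Python. It is not the callback added by this commit: ``early_stop`` is a hypothetical helper, the validation errors are made-up numbers, and the metric is assumed to be lower-is-better. Training stops once the metric has not improved for ``rounds`` iterations, and the model is then cut back to the best iteration with an ordinary slice.

```python
from typing import List, Tuple


def early_stop(valid_errors: List[float], rounds: int) -> Tuple[int, int]:
    """Return (best_iteration, last_iteration) for a lower-is-better metric.

    Hypothetical helper: stops once `rounds` iterations pass without improvement.
    """
    best_iteration, best_score, last_iteration = 0, float("inf"), -1
    for i, err in enumerate(valid_errors):
        last_iteration = i
        if err < best_score:
            best_score, best_iteration = err, i
        elif i - best_iteration >= rounds:
            break
    return best_iteration, last_iteration


# Made-up validation errors, one per boosting round.
errors = [0.50, 0.40, 0.35, 0.36, 0.37, 0.38, 0.30]
best_iteration, last_iteration = early_stop(errors, rounds=3)

# One entry per trained layer; a list stands in for the tree model here.
trained_layers = list(range(last_iteration + 1))
# Keeping the best model is a slice up to and including the best round.
best_layers = trained_layers[: best_iteration + 1]
print(best_iteration, last_iteration, best_layers)  # 2 5 [0, 1, 2]
```

Because the slice produces a copy, the fully trained model stays available alongside the truncated one.
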