
Memory usage jumps to 50G when trying to predict #6659

Closed
honzasterba opened this issue Jan 29, 2021 · 6 comments · Fixed by #7255

Comments

@honzasterba
Contributor

I have a fairly small booster and dataset, but when I try to run prediction on this dataset the memory usage jumps to 50 GB.
Here is the code to reproduce:

import xgboost as xgb
dtest = xgb.DMatrix("dmatrix.bin")      # attached test matrix
bst1 = xgb.Booster()
bst1.load_model('booster.bin')          # attached model
ypred_h1 = bst1.predict(dtest)          # memory usage spikes to ~50 GB here

Data used to reproduce attached.

data.zip

@trivialfis
Member

Just confirming that the data loading is correct: does your data really have 3,781,180 columns?

@honzasterba
Contributor Author

The original training dataset has fewer than 100 columns, but some high-cardinality categoricals lead, via one-hot encoding, to this many columns in the XGBoost training set (see the sketch below).
It should also be noted that this is a regression since 1.3.0; with 1.2.0 I did not see this memory spike.
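
To illustrate the column blow-up (a minimal sketch with made-up column names and cardinalities, using scikit-learn's OneHotEncoder rather than our actual pipeline):

import numpy as np
import scipy.sparse as sp
import xgboost as xgb
from sklearn.preprocessing import OneHotEncoder

# Hypothetical data, not the attached dmatrix.bin: two categorical columns
# drawn from ~50k possible values each, plus three numeric columns.
rng = np.random.default_rng(0)
cats = rng.integers(0, 50_000, size=(1_000, 2)).astype(str)
nums = rng.normal(size=(1_000, 3))

# One-hot encoding creates one column per distinct category value, so a
# handful of input columns expands into a very wide, mostly-zero matrix.
onehot = OneHotEncoder().fit_transform(cats)            # scipy CSR matrix
X = sp.hstack([sp.csr_matrix(nums), onehot], format="csr")
print(X.shape)   # (1000, 3 + distinct values seen in each categorical column)

dtrain = xgb.DMatrix(X)   # the DMatrix keeps the full encoded width
print(dtrain.num_col())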

@trivialfis
Member

trivialfis commented Jan 29, 2021

@ShvetsKS Would you like to help take a look? I think the threading optimization spikes the memory usage. A better way to handle this might be to put some thought into extremely sparse datasets.

Right now you can try setting nthread to 1 explicitly, or use the GPU predictor.

As a side note, #6503 should help remove the one-hot encoding.
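
For reference, a rough sketch of what the categorical path could look like once it applies here; enable_categorical was still experimental around this time and the data and parameters below are illustrative assumptions, not a confirmed replacement for the one-hot pipeline:

import pandas as pd
import xgboost as xgb

# Hypothetical frame: the categorical column stays a single pandas category
# column instead of being expanded into millions of dummy columns.
df = pd.DataFrame({
    "num_a": [0.1, 0.2, 0.3, 0.4],
    "cat_a": pd.Categorical(["a", "b", "a", "c"]),
})
y = [0, 1, 0, 1]

dtrain = xgb.DMatrix(df, label=y, enable_categorical=True)
# Experimental categorical splits were initially tied to the GPU tree method.
bst = xgb.train({"tree_method": "gpu_hist", "objective": "binary:logistic"},
                dtrain, num_boost_round=10)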

@trivialfis
Member

trivialfis commented Jan 29, 2021

bst.set_param({"nthread": 1})
# or if you have a gpu at hand
bst.set_param({"predictor": "gpu_predictor"})

@honzasterba
Contributor Author

Setting nthread to 1 helped work around the issue.

@ShvetsKS
Contributor

@trivialfis Memory usage increased because we now process kBlockOfRowsSize observations per tree to keep cache locality (previously 1 observation was processed at a time); see the rough estimate after the list below.
I think there are at least three possible options:

  1. add the possibility to change kBlockOfRowsSize via a user-provided parameter, so a value other than the default of 64 can be set
  2. implement automatic L1/L2 cache fitting (for the current example kBlockOfRowsSize would be equal to 1)
  3. as you proposed, handle extremely sparse datasets with a dedicated prediction implementation
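
For a rough sense of scale (my own back-of-the-envelope estimate, assuming each thread densifies a block of kBlockOfRowsSize rows at roughly one 4-byte slot per feature; the thread count is illustrative):

num_features = 3_781_180   # column count reported above
block_of_rows = 64         # current kBlockOfRowsSize default
bytes_per_value = 4        # assumed float-sized slot per feature
nthreads = 48              # illustrative; depends on the machine

per_thread = num_features * block_of_rows * bytes_per_value
total = per_thread * nthreads
print(f"{per_thread / 2**30:.1f} GiB per thread, ~{total / 2**30:.0f} GiB total")
# ~0.9 GiB per thread and tens of GiB across many threads, which is in the
# same ballpark as the reported 50 GB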
