XGBoost is much slower on Ryzen 7 3700X than on Core i5-1135G7 (with the same performance rating) #9689

Open
ibobak opened this issue Oct 18, 2023 · 3 comments

Comments

@ibobak

ibobak commented Oct 18, 2023

I have XGBoost 2.0.0 installed on two machines:

  • one with a 4-core Intel Core i5-1135G7 processor
  • another with an 8-core AMD Ryzen 7 3700X processor

Both CPUs have almost the same single-thread rating (based on the PassMark website; see the links above), while the multi-thread rating of the Ryzen 7 is more than twice as high.

I am running the same code on the same data on both PCs. The code does a hyperparameter search using Optuna, and each iteration trains an XGBoost model. Optuna measures the time of every single iteration, so I could build a histogram of the model training times, and this is what I see:

[Image: ksnip_20231018-144230]

[Image: ksnip_20231018-144228]
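For readers who want to reproduce this kind of measurement, here is a minimal sketch of timing each iteration and binning the durations into a histogram. It is not the code used in this issue: Optuna records the timings itself there, while this sketch uses only the standard library. The names `time_iterations`, `bin_counts`, and `dummy_train` are hypothetical, and `dummy_train` is only a stand-in for one XGBoost training run.

```python
import time
from typing import Callable, List


def time_iterations(train_once: Callable[[], None], n_iterations: int) -> List[float]:
    """Run train_once repeatedly and return the wall-clock seconds of each run."""
    durations = []
    for _ in range(n_iterations):
        start = time.perf_counter()
        train_once()
        durations.append(time.perf_counter() - start)
    return durations


def bin_counts(durations: List[float], n_bins: int) -> List[int]:
    """Count how many durations fall into each of n_bins equal-width bins."""
    lo, hi = min(durations), max(durations)
    width = (hi - lo) / n_bins or 1.0  # avoid a zero width when all values are equal
    counts = [0] * n_bins
    for d in durations:
        # Clamp the index so that the maximum value lands in the last bin.
        index = min(int((d - lo) / width), n_bins - 1)
        counts[index] += 1
    return counts


def dummy_train() -> None:
    """Hypothetical stand-in for one XGBoost training run."""
    sum(i * i for i in range(50_000))


if __name__ == "__main__":
    durations = time_iterations(dummy_train, n_iterations=20)
    # Print one row of '#' per bin, from the fastest bin to the slowest.
    for count in bin_counts(durations, n_bins=5):
        print("#" * count)
```

Replacing `dummy_train` with a function that fits the real model on the real data, and running the script on both machines, should give two histograms that can be compared directly.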

It was not a surprise to me that Optuna could perform about 1500 iterations on both PCs: although the Ryzen has twice as many cores as the Core i5, XGBoost training on it is twice as slow, so we end up with the same number of iterations.
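The arithmetic behind that observation can be sketched as follows. This assumes that the search runs one training per core in parallel, which is not stated explicitly here, and the time budget and per-training durations are hypothetical numbers chosen only to illustrate the ratio. The function name `trials_completed` is made up for this example.

```python
def trials_completed(time_budget_s: float, parallel_workers: int,
                     seconds_per_trial: float) -> float:
    """Trials finished within the budget when each worker runs one trial at a time."""
    return time_budget_s / seconds_per_trial * parallel_workers


budget = 3000.0  # hypothetical time budget in seconds

# Hypothetical timings: the Ryzen has twice the workers but each training takes twice as long.
core_i5 = trials_completed(budget, parallel_workers=4, seconds_per_trial=8.0)
ryzen = trials_completed(budget, parallel_workers=8, seconds_per_trial=16.0)

print(core_i5, ryzen)  # 1500.0 1500.0
```

Doubling the number of workers and doubling the time per training cancel out, so both machines finish the same number of iterations in the same time.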

I tried recompiling XGBoost with different optimization flags on the Ryzen machine:

  • -march=native, -march=znver2
  • -O3
  • -flto
  • -mavx2
  • -mfma

But none of this helps.

@trivialfis
Member

There are some ad-hoc internal constants scattered around the code base that can be tuned. If you are interested in this, I can try to gather them into one place.

@ibobak
Author

ibobak commented Oct 18, 2023

Yes, I'd gladly change those constants and try to recompile, and I will report the results here.

@trivialfis
Member

trivialfis commented Oct 19, 2023

Please take a look at #9694 (tuning.h). Considering that AMD is known for putting large caches in its CPUs, you might want to increase some of the parameters.
