Python 3.12 xgboost.core.XGBoostError: Invalid Parameter format for nthread expect int but value='-1' when DMatrix used with import googlecloudprofiler. #10224

Open
rtb-zla-karma opened this issue Apr 25, 2024 · 6 comments

@rtb-zla-karma

Hi

I hit a very peculiar error after updating the versions of Python and the libraries in the project I'm working on.

Minimal example to reproduce the case is this:

# file.py
import googlecloudprofiler
from xgboost import DMatrix

DMatrix([[]])
print("works")

# requirements.txt
xgboost==2.0.3
google-cloud-profiler==4.1.0
#
numpy==1.26.4
scipy==1.13.0
google-api-python-client==2.125.0
google-auth==2.29.0
google-auth-httplib2==0.2.0
protobuf==4.25.3
requests==2.31.0
#
cachetools==5.3.3
certifi==2024.2.2
charset-normalizer==3.3.2
google-api-core==2.18.0
httplib2==0.22.0
idna==3.6
pyasn1==0.6.0
pyasn1_modules==0.4.0
pyparsing==3.1.2
rsa==4.9
uritemplate==4.1.1
urllib3==2.2.1

Python 3.12.2

Install with

pip install -r requirements.txt --no-deps

Run with

python file.py

Results in

Traceback (most recent call last):
  File "/project/path/file.py", line 4, in <module>
    DMatrix([[]])
  File "/venv/path/lib/python3.12/site-packages/xgboost/core.py", line 730, in inner_f
    return func(**kwargs)
           ^^^^^^^^^^^^^^
  File "/venv/path/lib/python3.12/site-packages/xgboost/core.py", line 857, in __init__
    handle, feature_names, feature_types = dispatch_data_backend(
                                           ^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/path/lib/python3.12/site-packages/xgboost/data.py", line 1081, in dispatch_data_backend
    return _from_list(data, missing, threads, feature_names, feature_types)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/path/lib/python3.12/site-packages/xgboost/data.py", line 1011, in _from_list
    return _from_numpy_array(array, missing, n_threads, feature_names, feature_types)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/venv/path/lib/python3.12/site-packages/xgboost/data.py", line 207, in _from_numpy_array
    _check_call(
  File "/venv/path/lib/python3.12/site-packages/xgboost/core.py", line 282, in _check_call
    raise XGBoostError(py_str(_LIB.XGBGetLastError()))
xgboost.core.XGBoostError: Invalid Parameter format for nthread expect int but value='-1'

To "solve" the problem, remove import googlecloudprofiler from file.py. I really have no idea why merely importing the lib causes this; it would make more sense if it happened only after googlecloudprofiler.start is called.

I'm also going to check whether downgrading either of the main requirements, xgboost and google-cloud-profiler, fixes the issue, and post the results. Not today though; tired.
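One way to confirm that the import alone makes the difference is to run the same DMatrix call in two fresh interpreters, one with and one without the profiler import. The following is only a sketch: run_snippet and compare_imports are hypothetical helper names, not part of xgboost or google-cloud-profiler, and it assumes both packages are installed in the interpreter that runs it.

```python
import subprocess
import sys

# The failing call from file.py, kept identical in both trials.
BODY = 'from xgboost import DMatrix\nDMatrix([[]])\nprint("works")'


def run_snippet(preamble: str, body: str) -> subprocess.CompletedProcess:
    """Run preamble + body in a fresh interpreter so imports cannot leak between trials."""
    code = preamble + "\n" + body
    return subprocess.run(
        [sys.executable, "-c", code], capture_output=True, text=True
    )


def compare_imports() -> dict:
    """Return {label: (returncode, last stderr line)} for both trials.

    On POSIX a negative return code means the child was killed by a signal
    (for example -11 for a segfault).
    """
    trials = [
        ("without profiler", ""),
        ("with profiler", "import googlecloudprofiler"),
    ]
    results = {}
    for label, preamble in trials:
        proc = run_snippet(preamble, BODY)
        lines = proc.stderr.strip().splitlines()
        results[label] = (proc.returncode, lines[-1] if lines else "")
    return results
```

Calling compare_imports() and printing the result shows the exit status and final error line of each trial side by side, which also makes it easy to repeat the check across package versions.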

@trivialfis
Member

It's probably the combination of Python 3.12 and the Google profiler. I looked into it a little: loading _profiler.cpython-312-x86_64-linux-gnu.so from the Google profiler extension causes the error. For me, it's actually a segfault during the construction of the DMatrix, due to an invalid write inside a C++ standard library function:

    #0 0x7e270a2bb8f4 in std::codecvt<wchar_t, char, __mbstate_t>::do_unshift(__mbstate_t&, char*, char*, char*&) const (/lib/x86_64-linux-gnu/libstdc++.so.6+0xbb8f4)

My guess is that the profiler extension is compiled with an incompatible compiler, which could be resolved by installing it from source. I don't think I can go deeper than this at the moment.
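To check which C++ runtime and extension modules actually end up in the process, one option on Linux is to read /proc/self/maps after the imports. This is a rough diagnostic sketch, not something taken from xgboost or the profiler: loaded_shared_objects and current_process_libs are hypothetical helpers, and a runtime that is statically linked into an extension would not appear as a separate file.

```python
from pathlib import Path


def loaded_shared_objects(maps_text: str, needle: str = "") -> list[str]:
    """Extract unique shared-object paths from /proc/<pid>/maps text.

    Each mapped-file line has six whitespace-separated fields:
    address, perms, offset, dev, inode, pathname.
    """
    paths = set()
    for line in maps_text.splitlines():
        parts = line.split(maxsplit=5)
        # Anonymous mappings have no pathname; [stack] and friends lack ".so".
        if len(parts) == 6 and ".so" in parts[5] and needle in parts[5]:
            paths.add(parts[5])
    return sorted(paths)


def current_process_libs(needle: str = "") -> list[str]:
    """Linux only: list shared objects mapped into the current process."""
    return loaded_shared_objects(Path("/proc/self/maps").read_text(), needle)
```

For example, calling current_process_libs("libstdc++") after importing both packages, and current_process_libs("_profiler") to confirm the extension is loaded, shows which files are mapped and from which paths.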

@hcho3
Collaborator

hcho3 commented Apr 25, 2024

It would be interesting to see if the error persists after installing the identical set of packages in a Conda environment. (Conda packages are all built with the same C++ compiler.)

@trivialfis
Member

The profiler is not on conda-forge yet; otherwise I would do a quick test.

@rtb-zla-karma
Author

Hi again.

I've tested the older lib versions that we used with Python 3.10. I ran:

pip install xgboost==1.7.4 google-cloud-profiler==4.1.0

I couldn't downgrade to google-cloud-profiler==4.0.0 because it failed to build for Python 3.12.

Anyway, the code surprisingly worked with xgboost==1.7.4. Output from pip freeze:

cachetools==5.3.3
certifi==2024.2.2
charset-normalizer==3.3.2
google-api-core==2.18.0
google-api-python-client==2.127.0
google-auth==2.29.0
google-auth-httplib2==0.2.0
google-cloud-profiler==4.1.0
googleapis-common-protos==1.63.0
httplib2==0.22.0
idna==3.7
numpy==1.26.4
proto-plus==1.23.0
protobuf==4.25.3
pyasn1==0.6.0
pyasn1_modules==0.4.0
pyparsing==3.1.2
requests==2.31.0
rsa==4.9
scipy==1.13.0
uritemplate==4.1.1
urllib3==2.2.1
xgboost==1.7.4

I've also tested the in-between versions of xgboost: the highest version the code works with is 1.7.6, and it fails starting with 2.0.0.

@trivialfis
Member

> I couldn't downgrade to google-cloud-profiler==4.0.0 because it failed to build for Python 3.12.

I suggest opening an issue for google cloud profiler to fix this first. ;-)

@rtb-zla-karma
Author

rtb-zla-karma commented Apr 26, 2024

It's still bad, TBH. After downgrading, I get another error later in the project code, when the initialized DMatrix is used for predictions with a model. I won't paste the code, just the exception from a minimal example:

Traceback (most recent call last):
  File "/project/path/file2.py", line 19, in <module>
    main()
  File "/project/path/file2.py", line 16, in main
    model.predict(matrix)[0].item()
    ^^^^^^^^^^^^^^^^^^^^^
  File "/venv/path/lib/python3.12/site-packages/xgboost/core.py", line 2163, in predict
    _check_call(
  File "/venv/path/lib/python3.12/site-packages/xgboost/core.py", line 279, in _check_call
    raise XGBoostError(py_str(_LIB.XGBGetLastError()))
xgboost.core.XGBoostError: Invalid Parameter format for disable_default_eval_metric expect boolean but value=''

Again, the solution is to remove import googlecloudprofiler. It seems that I have to remove this lib altogether.
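Until the root cause is fixed, a project could at least make the clash visible at startup instead of failing later inside DMatrix or predict. The sketch below assumes that the mere presence of googlecloudprofiler in sys.modules is the trigger, as the examples above suggest; loaded_conflicts and warn_if_conflicting are hypothetical helper names.

```python
import sys
import warnings

# Modules observed in this thread to break xgboost when imported (assumption).
KNOWN_CONFLICTS = ("googlecloudprofiler",)


def loaded_conflicts(names=KNOWN_CONFLICTS, modules=None) -> list[str]:
    """Return the names from `names` that are already imported."""
    modules = sys.modules if modules is None else modules
    return [name for name in names if name in modules]


def warn_if_conflicting(names=KNOWN_CONFLICTS, modules=None) -> list[str]:
    """Emit a RuntimeWarning if any known-conflicting module is loaded."""
    found = loaded_conflicts(names, modules)
    if found:
        warnings.warn(
            "xgboost may fail because these modules are imported: "
            + ", ".join(found),
            RuntimeWarning,
            stacklevel=2,
        )
    return found
```

Calling warn_if_conflicting() right before the first DMatrix is built turns a confusing parameter-parsing error into an explicit hint about the import.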
