Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixing incorrect model parse when XGB buf starts with 'binff' or 'binfn' #2162

Merged
merged 8 commits into from Jun 15, 2022

Conversation

TheZL
Copy link
Contributor

@TheZL TheZL commented Sep 2, 2021

In XGBoost 1.1, .save_raw() output added a prefix 'binf' to the model buffer.

#1220 addressed this by using lstrip to remove 'binf'. However, this has unintended consequence of stripping 'binff', 'binfn', 'binfb', 'binfi' as well, which occassionally occur as valid starts to a buffer.

The fix here is to check for exactly 'binf', and move the buffer forward by 4 if the prefix is exactly 'binf'.

@TheZL TheZL changed the title remove lstrip(b'binf') from XGBTreeModelLoader Fixing incorrect model parse when XGB buf starts with 'binff' or 'binfn' Sep 7, 2021
@TheZL
Copy link
Contributor Author

TheZL commented Sep 7, 2021

Hi @slundberg, I believe this PR fixes (#1864), and addresses the root cause which stems from unintended issues from #1220. Please update me when you can, thank you!

Copy link
Collaborator

@lrjball lrjball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well spotted @TheZL ! This bug has crept in because .lstrip looks for any characters in the provided string and keeps stripping until a character not provided, whereas what we want here is more the behaviour of .removeprefix, although that was only added in 3.9 so can't be used here as shap supports other python versions.

Maybe this could be written slightly more succinctly as something like

self.buf = xgb_model.save_raw()
if self.buf.startswith(b'binf'):
    self.buf = self.buf[4:]

Have you been able to find an example of a model where the raw dump starts with 'binff' of something? If there is an easy example then it would be good to add that as a test case, but if not then this should be merged either way as this is clearly better than the current bug.

@slundberg , thoughts?

@TheZL
Copy link
Contributor Author

TheZL commented Dec 30, 2021

Hi @Irjball, thanks for the reply! I agree with your opinion on this issue.
In our data example, we found that the utf-8 error occurred when the base score in XGBoost model was set to some specific values, e.g., base_score = 29.709193548387095. For testing the code, we could run a script like this:

from xgboost import XGBRegressor
import numpy as np
import shap
# Create input data
X = np.array([[1,2,3,4,5],[3,3,3,2,4]])
y = np.array([1,0])
# Fit XGBoost model with base score equals to 29.709193548387095
model = XGBRegressor(base_score=29.709193548387095)
model.fit(X,y,eval_metric="rmse")
explainer = shap.TreeExplainer(model)
shap_result = explainer.shap_values(X)
# Examine the buff
buf = model.get_booster().save_raw()[0:10]
print(buf)

The committed code on this branch works well with the example. I also tried your suggested code, but I got an error for it:
AttributeError: 'bytearray' object has no attribute 'startwith'
Maybe we need to convert bytearray to string first?
Thanks again for your thoughtful feedback!

@lrjball
Copy link
Collaborator

lrjball commented Dec 30, 2021

Looks like that is just a typo, it should be startswith, not startwith.

Good one on the example though, I've tried a few others and e.g. base_score=1.3 seems to be another example. If you are comfortable with pytest, it would be worth adding a test function to tests/explainers/test_tree.py to check that the provided example model can be loaded properly when using XGBTreeModelLoader, and that the params are as expected. That way if any changes are made to the code in future, we can make sure this issue doesn't happen again! Alternatively I can add the test function in, let me know what you prefer.

@TheZL
Copy link
Contributor Author

TheZL commented Dec 31, 2021

@lrjball Yeah, you are right. The code works well with "startswith". I have updated the code accordingly on this branch.
If you don't mind, could you please add the test function in? I understand that it is important to add the test example for future changes, but I am not quite familiar with pytest yet.
Thank you for your time and suggestions!

@TheZL
Copy link
Contributor Author

TheZL commented Feb 7, 2022

Hi @lrjball, Just want to follow up with the progress on this issue. Will the bug be fixed in the next released version of shap? We are using the shap package for an application study. Currently we manually adjust the input data when the error occurs. If this bug could be fixed in the next release, that will be very helpful. Thanks!

@lrjball
Copy link
Collaborator

lrjball commented Feb 28, 2022

Hi @TheZL, thanks for your patience. I have added a test for your update and raised a PR to your branch (TheZL#1). If you could merge that PR then this should be ready to go in.

…ction

Added test for buffer strip update
@TheZL
Copy link
Contributor Author

TheZL commented Mar 1, 2022

@lrjball Thanks for the help! The change has been merged.

@josis-silver
Copy link

Just wanted to add: I have the same problem with several datasets and models, it even seems to happen more frequently recently and for us changing the input data manually is not an option. If there is any chance that this fix could be merged soon, that would be awesome, thanks!

@slundberg
Copy link
Collaborator

Hey! Sorry to be out of the loop here for a while (was out after paternity leave for a while). The changes and test look great thanks @lrjball and @TheZL !

I am running a new CI pipeline to make sure things work as expected, then we can merge and get this is 0.41

@daviskirk
Copy link

I just checked the failing test on 3.8 because I thought I might be able to help push this over the finish line.
But as far as I can tell almost all of them are request failures that might just be because of random connection issues... perhaps they would just work on a rerun?

@codecov
Copy link

codecov bot commented Jun 15, 2022

Codecov Report

Merging #2162 (2cfa489) into master (854426a) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master    #2162   +/-   ##
=======================================
  Coverage   51.51%   51.52%           
=======================================
  Files          90       90           
  Lines       13116    13118    +2     
=======================================
+ Hits         6757     6759    +2     
  Misses       6359     6359           
Impacted Files Coverage Δ
shap/explainers/_tree.py 69.32% <100.00%> (+0.05%) ⬆️

📣 Codecov can now indicate which changes are the most critical in Pull Requests. Learn more

@slundberg slundberg merged commit 4921c50 into shap:master Jun 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants