ENH: increase transparency of background dataset sub-sampling #3461
Comments
Thanks for the report and your effort to investigate this. Your description is absolutely accurate, and the reason for this is the default in the tabular masker. Here is an issue where this problem was already discussed, including a workaround: #3174. We probably should throw at least a warning if the background data gets subsampled.
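For reference, the workaround discussed in #3174 amounts to constructing the masker explicitly so that no subsampling occurs. A minimal sketch (the `background_data` and `model` names are illustrative):

```python
import shap

# shap.maskers.Independent subsamples the background to max_samples rows
# (100 by default); raising max_samples to the full background size
# disables the subsampling entirely.
masker = shap.maskers.Independent(background_data, max_samples=len(background_data))
explainer = shap.Explainer(model.predict, masker)
```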
I agree with your analysis; this seems to be a consequence of sampling. I'll remove the bug label.
I'm not sure if I agree. To me, warnings are generally used to indicate undesirable situations in which the user should probably update their code to fix the warning. In this case, I think for the majority of users the subsampling is expected and desirable behaviour. Many parts of shap are sampling-based and only offer approximate results. Would a print statement be an option instead?
I would much prefer logging over print statements, as prints are much harder to configure and disable. I think adding a print would risk annoying a large majority of shap users. I've renamed the title accordingly to reflect the plan.
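One benefit of logging is that users can opt in to or silence such messages without touching library code. Assuming the message were emitted through Python's standard logging module under the `shap` logger namespace (an assumption about the eventual implementation, not the current behaviour), configuration would look like:

```python
import logging

logging.basicConfig()  # make sure a handler exists

# Opt in to informational messages such as a subsampling notice:
logging.getLogger("shap").setLevel(logging.INFO)

# ...or silence everything below warnings again:
logging.getLogger("shap").setLevel(logging.WARNING)
```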
I am also confused about the background dataset and would like to ask a follow-up question, if I may. Suppose I use shap.TreeExplainer to explain predictions from my LightGBM model for a classification task. I am interested in model_output="probability", so according to the documentation, I need to set feature_perturbation="interventional" and specify a background dataset. Given that I have training, validation, and test data, which of them should I pick the background dataset from? The documentation says that "Anywhere from 100 to 1000 random background samples are good sizes to use"; how should I pick those samples? And should I fix the random samples so that the background dataset stays the same regardless of which dataset (train, validation, test) I am explaining?
This is not strictly on topic, so if you have follow-up questions to my answer, please open a discussion or search for one of the topics where this is already discussed. First, I do not believe there is a definitive answer to your question; there is no real backtesting one can do for SHAP values. One simply has to take various considerations into account.
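As a practical starting point, one common pattern is to draw a fixed, seeded subsample from the training data and reuse it everywhere, so the background does not change between runs. A minimal sketch (the variable names are illustrative; `shap.sample` is the utility shap provides for drawing a random subsample):

```python
import shap

# Draw a fixed background sample from the training data; the seed makes it
# reproducible, so the same background is used whether you later explain
# predictions on train, validation, or test data.
background = shap.sample(X_train, 200, random_state=0)

explainer = shap.TreeExplainer(
    model,
    data=background,
    feature_perturbation="interventional",
    model_output="probability",
)
shap_values = explainer.shap_values(X_test)
```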
Issue Description
Given $x$, the sample that we wish to explain, we can compute the Shapley values of that sample against a single background sample $x^b$. By providing the Explainer class with background data, it should compute the Shapley values of $x$ against each sample in the background data and then take the average, which is an approximation to the interventional SHAP values.
This averaging procedure means that if I, for example, split my background data into two halves A and B, then I should be able to call the explainer with each half to obtain the averaged SHAP values a and b. Taking (a + b)/2 should then equal the result of calling SHAP with the entire background dataset.
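Stated formally, writing $\phi(x; D)$ for the interventional SHAP values of $x$ computed against background data $D$ (notation introduced here for clarity):

$$\phi(x; D) = \frac{1}{|D|} \sum_{x^b \in D} \phi\left(x; \{x^b\}\right)$$

so for an even split $D = A \cup B$ with $|A| = |B|$,

$$\phi(x; D) = \frac{\phi(x; A) + \phi(x; B)}{2}.$$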
From my experimentation, it seems that once the background dataset exceeds 100 samples, the results become inconsistent, i.e. (a + b)/2 no longer equals the interventional approximation computed on the entire background dataset. The identity does hold for background datasets under 100 samples.
Minimal Reproducible Example
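The original reproducer is not preserved in this thread. The following is a hypothetical reconstruction of the experiment described above (the model, data, and loop sizes are all illustrative), assuming the tabular masker's default `max_samples=100` is what breaks the identity:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
import shap

X, y = make_regression(n_samples=400, n_features=5, random_state=0)
model = LinearRegression().fit(X, y)
x = X[:1]  # the single sample to explain


def shap_values(background):
    # Passing a raw array wraps it in a tabular (Independent) masker,
    # which by default subsamples backgrounds larger than 100 rows.
    explainer = shap.Explainer(model.predict, background)
    return explainer(x).values


for n in (80, 300):  # below and above the 100-row default
    bg = X[:n]
    a = shap_values(bg[: n // 2])
    b = shap_values(bg[n // 2 :])
    full = shap_values(bg)
    # Expected: True for both sizes; reported: False once n > 100.
    print(n, np.allclose((a + b) / 2, full))
```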
Traceback
No response
Expected Behavior
In the for loop, (a + b)/2 should equal the SHAP values computed on the full background dataset for every background size, not only for sizes under 100 samples.
Bug report checklist
Installed Versions
0.44.0