
partial_dependence() computes conditional partial dependence #27441

Open
markusloecher opened this issue Sep 22, 2023 · 9 comments · May be fixed by #28595 or #28771

Comments

@markusloecher

markusloecher commented Sep 22, 2023

Describe the bug

For correlated predictors (a very common situation), the sklearn.inspection.partial_dependence() function gives different answers for method="recursion" and method="brute"; see my post for detailed examples.

I do not believe this is intentional, and it should be fixed. Alternatively, the documentation should state clearly that (i) the two methods are not equivalent for tree-based algorithms, and (ii) method="recursion" actually computes the conditional expectation $E[f(x_S, X_C) \mid X_S = x_s]$ instead of the (desired) interventional expectation $E[f(x_S, X_C) \mid \mathbf{do}(X_S = x_s)]$.
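To make the distinction concrete, here is a minimal sketch (my own illustration, not sklearn's implementation) of the two estimands for a fitted model, a feature matrix X, and a single discrete feature of interest s:

import numpy as np

def interventional_pd(model, X, s, grid):
    # E[f(x_S, X_C)] under do(X_S = x_s): force column s to the grid
    # value for *every* row, keep the other columns as observed, average.
    out = []
    for v in grid:
        Xv = X.copy()
        Xv[:, s] = v
        out.append(model.predict(Xv).mean())
    return np.array(out)

def conditional_pd(model, X, s, grid):
    # E[f(X) | X_S = x_s]: average predictions only over the rows where
    # the feature already equals the grid value, so correlated features
    # shift along with it (exact matching assumes a discrete feature).
    return np.array([model.predict(X[X[:, s] == v]).mean() for v in grid])

With correlated features these two quantities differ, which is what the reproduction below demonstrates.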

Steps/Code to Reproduce

from sklearn.tree import DecisionTreeRegressor
import numpy as np
import pandas as pd
from sklearn.inspection import partial_dependence

# Generate the data
# first X: two correlated binary features
N = 400; p0 = 0.5; p11 = 0.8; M = 2
X = np.zeros((N, M))  # a matrix (N * M)
N1 = int(p0 * N)
X[0:N1, 0] = 1
X[0:int(p11 * N1), 1] = 1                            # P(X1=1 | X0=1) = p11
X[N1:(N1 + int((1 - p11) * (N - N1)) + 1), 1] = 1    # P(X1=1 | X0=0) ~ 1 - p11

# then y: a deterministic function of (X0, X1)
y = np.zeros(N)
y[(X[:, 0] == 0) & (X[:, 1] == 0)] = 0.3
y[(X[:, 0] == 0) & (X[:, 1] == 1)] = 0.7
y[(X[:, 0] == 1) & (X[:, 1] == 0)] = 0.9
y[(X[:, 0] == 1) & (X[:, 1] == 1)] = 0.1

df = pd.DataFrame(X, columns=["X0", "X1"])
df["y"] = y

model = DecisionTreeRegressor(max_depth=2, max_features=2)
model.fit(df[["X0", "X1"]], y)

features = ["X1"]
for f in features:
    pdp_interventional = partial_dependence(model, df[["X0", "X1"]], f, method="brute")
    pdp_conditional = partial_dependence(model, df[["X0", "X1"]], f, method="recursion")

    print(f, "brute (interventional):", pdp_interventional['average'])
    print(f, "recursion (conditional):", pdp_conditional['average'])

pdp_interventional['average'] == pdp_conditional['average']

Expected Results

We would like the two methods to yield the same PDP values, so the last line should yield

array([[True, True]])

Actual Results

Instead they are different. The print statements yield
X1 brute (interventional): [[0.6 0.4]]
X1 recursion (conditional): [[0.42 0.22]]

and the last line yields
array([[False, False]])
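
The reported numbers can be verified by hand, assuming the fitted tree splits on X1 at the root and on X0 inside each branch (the joint training counts follow from N=400, p0=0.5, p11=0.8):

# joint counts: (X0=1, X1=1)=160, (X0=1, X1=0)=40,
#               (X0=0, X1=1)=41,  (X0=0, X1=0)=159
# brute (interventional): X0 keeps its marginal 50/50 distribution
0.5 * 0.3 + 0.5 * 0.9             # X1=0 -> 0.6
0.5 * 0.7 + 0.5 * 0.1             # X1=1 -> 0.4
# recursion (conditional): X0 is weighted by its distribution *given* X1,
# i.e. by the training samples flowing down each branch of the tree
(159 * 0.3 + 40 * 0.9) / 199      # X1=0 -> 0.4206...
(41 * 0.7 + 160 * 0.1) / 201      # X1=1 -> 0.2224...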

Versions

System:
    python: 3.9.0 (v3.9.0:9cf6752276, Oct  5 2020, 11:29:23)  [Clang 6.0 (clang-600.0.57)]
executable: /Users/loecherm/.pyenv/versions/aug_hs_env/bin/python
   machine: macOS-10.16-x86_64-i386-64bit

Python dependencies:
      sklearn: 1.2.0
          pip: 23.0.1
   setuptools: 49.2.1
        numpy: 1.23.5
        scipy: 1.10.0
       Cython: None
       pandas: 1.5.3
   matplotlib: 3.5.3
       joblib: 1.2.0
threadpoolctl: 3.1.0

Built with OpenMP: True
@markusloecher markusloecher added the Bug and Needs Triage labels Sep 22, 2023
@glemaitre
Member

Thanks @markusloecher for the detailed notes. Reading the tests, it seems that this behaviour is expected, but it should not be this way.

It makes me think of the discussion in SHAP regarding the feature perturbation that can be set to "interventional" or "tree_path_dependent" (https://shap.readthedocs.io/en/latest/generated/shap.TreeExplainer.html#shap.TreeExplainer).

It boils down to the same issue. I think it would make sense to always use the interventional approach and fix the current behaviour. In the future, we could think about the implications of also offering the conditional approach, both in terms of API and, even more, in terms of how such results should be interpreted.

@markusloecher It looks like you have a fix in mind in the notes that you wrote. Do you feel like contributing a bug fix?

@glemaitre glemaitre removed the Needs Triage label Sep 22, 2023
@markusloecher
Author

Thanks for the super quick reply! Yes, indeed these issues are quite similar to the "interventional" vs. "tree_path_dependent" SHAP discussions!
One difference here is that the "recursion" method is MUCH faster than the "brute" method, hence its appeal.
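
For intuition, here is a rough sketch (my own, not sklearn's Cython implementation) of the weighted tree traversal behind method="recursion". It makes one pass over the tree per grid value, independent of the number of rows, which is where the speed comes from:

def tree_pd_recursion(tree, feature, value, node=0):
    # Partial dependence of a fitted DecisionTreeRegressor via weighted
    # tree traversal (illustration only).
    t = tree.tree_
    if t.children_left[node] == -1:  # leaf: return its prediction
        return t.value[node][0][0]
    if t.feature[node] == feature:   # split on the PD feature: follow one side
        child = (t.children_left[node] if value <= t.threshold[node]
                 else t.children_right[node])
        return tree_pd_recursion(tree, feature, value, child)
    # split on a complementary feature: average both children, weighted by
    # training-sample counts -- this is exactly where the tree-path-dependent
    # (conditional) behaviour enters.
    left, right = t.children_left[node], t.children_right[node]
    w = t.weighted_n_node_samples[left] / t.weighted_n_node_samples[node]
    return (w * tree_pd_recursion(tree, feature, value, left)
            + (1 - w) * tree_pd_recursion(tree, feature, value, right))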

While I did sketch out a conceptual fix, I am unfortunately nowhere near being able to implement it in code, especially not at sklearn's rigorous quality level.
So I will probably pass on that nice opportunity for now.

Thanks,
ML

@markusloecher
Author

Just following up one more time on this.
It seems that the default is still "recursion" for tree-based methods?
Might the note

The ‘brute’ and ‘recursion’ methods will likely disagree regarding the value of the partial dependence, because they will treat these unlikely samples differently.

be new? I am not certain (yet) that the differences are caused by highly unlikely samples.

@lorentzenchr
Member

While I did sketch out a conceptual fix,

If you have a fix for the recursive tree method, even if very rough, we'd be interested! The fix is the hard part; the polishing is the easy part.

@markusloecher
Author

Will post one very soon!

@lorentzenchr
Member

@mayer79 Do you have any insights or recommendations here?

@mayer79
Contributor

mayer79 commented Jan 29, 2024

@lorentzenchr: the two methods are different, and they estimate different things (just like TreeSHAP and permutation SHAP). I think it would be good to mention this in the docs, but of course it is not a bug.

@markusloecher
Author

Well, of course both methods have their place. I would still consider the current setup a bug, since the assumption is that for trees the recursion method yields the same results as brute force; exemplified e.g. by this internal sklearn code:
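
In spirit, that check asserts something like the following (paraphrased, hypothetical code, not the actual sklearn snippet):

# hypothetical paraphrase of the internal equivalence assumption:
pd_brute = partial_dependence(model, df[["X0", "X1"]], ["X1"], method="brute")
pd_recursion = partial_dependence(model, df[["X0", "X1"]], ["X1"], method="recursion")
np.testing.assert_allclose(pd_brute["average"], pd_recursion["average"])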

@lorentzenchr
Member

I think it would be good to mention this in the docs

We should definitely document this. PR very welcome.
