There seems to be a discrepancy between the XGBoost documentation on Feature Interaction Constraints and its actual behavior in practice. The documentation suggests that using constraints like [[0, 1], [1, 3, 4]] would allow a feature (like 0) to interact with features 3 or 4 via an intermediary feature (like 1). However, empirical tests suggest this interaction does not occur as documented.
Steps to Reproduce
1. Create a dataset with three standardized normal variables.
2. Define the target variable y as the product of these variables.
3. Train an XGBoost model with the feature interaction constraint [[0, 1], [1, 2]].
4. Observe the resulting trees and model performance.
Expected Behavior: Based on the documentation, one would expect the model to learn interactions involving all three variables (0, 1, 2) in at least some trees.
Actual Behavior: The model does not seem to learn the expected product relationship, and none of the trees feature all three variables simultaneously.
If the observed behavior is correct, could you please update the documentation to reflect the actual constraint mechanism?
As a data scientist, I find that an interaction constraint limiting interactions only to a feature's immediate neighbors in the tree, as the documentation describes, is not truly useful. A more helpful semantics would prohibit features that are not allowed to interact from appearing together anywhere in the same tree, as seems to happen in practice, or at least within any branch originating from the root node.
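The rule the trees appear to follow in practice can be stated compactly: the set of features used in a whole tree must fit inside a single constraint group. A minimal sketch of that check (the helper name `tree_respects_groups` is hypothetical, not an XGBoost API):

```python
def tree_respects_groups(tree_features, groups):
    """True if every feature used in the tree falls within one constraint group."""
    used = set(tree_features)
    return any(used <= set(g) for g in groups)

groups = [[0, 1], [1, 2]]
print(tree_respects_groups([0, 1], groups))     # True: fits group [0, 1]
print(tree_respects_groups([0, 1, 2], groups))  # False: no single group holds all three
```

Under the documented semantics the second call would be allowed (0 interacting with 2 via 1), which is exactly the discrepancy this issue describes.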