Potential Documentation Inaccuracy Regarding Feature Interaction Constraints #10169

Open
cbongiorno opened this issue Apr 5, 2024 · 0 comments


The XGBoost documentation on Feature Interaction Constraints seems to disagree with the library's actual behavior. The documentation suggests that a constraint such as [[0, 1], [1, 3, 4]] allows feature 0 to interact with features 3 or 4 through an intermediary feature, here feature 1. However, empirical tests suggest that this interaction does not occur as documented.
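A minimal sketch of that reading of the documentation, in which constraint groups that share a feature chain together. The helper `reachable_features` is hypothetical and written only for illustration; it is not part of XGBoost and does not claim to mirror its implementation.

```python
def reachable_features(constraints, start):
    """Features reachable from `start` by chaining groups that share a feature."""
    reached = {start}
    changed = True
    while changed:
        changed = False
        for group in constraints:
            group = set(group)
            # A group extends the reachable set once it shares a feature with it.
            if reached & group and not group <= reached:
                reached |= group
                changed = True
    return reached


# Under this transitive reading, feature 0 reaches 3 and 4 through feature 1.
print(sorted(reachable_features([[0, 1], [1, 3, 4]], start=0)))  # [0, 1, 3, 4]
```

If constraints worked this way, features 0, 3 and 4 could end up in the same tree, which is the behavior the tests below did not show.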

Steps to Reproduce

  1. Create a dataset with three independent standard normal variables.
  2. Define the target variable y as the product of these three variables.
  3. Train an XGBoost model with the feature interaction constraint [[0, 1], [1, 2]].
  4. Inspect the resulting trees and the model's performance.

Expected Behavior: Based on the documentation, one would expect the model to learn interactions involving all three variables (0, 1, 2) in at least some trees.

Actual Behavior: The model does not seem to learn the expected product relationship, and no single tree splits on all three variables.

If the observed behavior is correct, could you please update the documentation to reflect the actual constraint mechanism?

In my opinion as a data scientist, an interaction constraint that only limits interactions between immediate neighbors in a decision tree, as the documentation describes, is not very useful. A more helpful rule would forbid features that are not allowed to interact from appearing together anywhere in the same tree, which is what seems to occur in practice, or at least within a single branch originating from the root node.
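The two stricter rules proposed here can be stated as a small check. This is only a sketch of the proposal: `fits_one_group` and `tree_allowed` are hypothetical helpers, not XGBoost functions, and a tree is represented simply as the list of split features along each root-to-leaf path.

```python
def fits_one_group(features, constraints):
    """True if every feature in `features` belongs to a single constraint group."""
    used = set(features)
    return any(used <= set(group) for group in constraints)


def tree_allowed(paths, constraints, per_branch=False):
    """Check one tree, given the split features on each root-to-leaf path.

    per_branch=False: all features in the whole tree must share one group.
    per_branch=True:  only the features along each branch must share one group.
    """
    if per_branch:
        return all(fits_one_group(path, constraints) for path in paths)
    whole_tree = set().union(*paths)
    return fits_one_group(whole_tree, constraints)


constraints = [[0, 1], [1, 2]]
# Root splits on feature 1; one child splits on 0, the other on 2.
paths = [[1, 0], [1, 2]]
print(tree_allowed(paths, constraints))                   # False
print(tree_allowed(paths, constraints, per_branch=True))  # True
```

The example tree shows where the two variants differ: features 0 and 2 never share a branch, so the per-branch rule accepts the tree while the whole-tree rule rejects it.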
