Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inaccurate Attribute Listing with dir(obj) for Classes Using available_if Conditional Method Decorator #28558

Open
kmcgrady opened this issue Mar 1, 2024 · 3 comments · May be fixed by #28749
Labels

Comments

@kmcgrady
Copy link

kmcgrady commented Mar 1, 2024

Describe the bug

When utilizing the available_if decorator from SciKit Learn to conditionally expose methods based on specific object state or conditions, we observe that the dir(obj) function may return inaccurate results. Specifically, dir(obj) continues to list methods that should be conditionally hidden based on the available_if decorator's logic. This discrepancy arises because the __dir__ method on the affected classes does not dynamically account for this conditional availability. As a result, users and consuming code may be misled about the actual methods available for use on instances of the class at runtime, potentially leading to unexpected AttributeErrors when accessing supposedly available methods.

Steps/Code to Reproduce

I will test with the SVC, but it can apply to other classes.

from sklearn.svm import SVC
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)

model = SVC(probability=False)
model.fit(X[:100], y[:100])

# Check if 'predict_proba' is listed by dir()
print("'predict_proba' in dir(model):", "predict_proba" in dir(model))

# Attempt to call 'predict_proba'
try:
    prob_predictions = model.predict_proba(X[:2])
    print("Predict_proba called successfully.")
except AttributeError as e:
    print("Attempting to call 'predict_proba' raised an AttributeError:", e)

Expected Results

It should print out the following:

'predict_proba' in dir(model): False
Attempting to call 'predict_proba' raised an AttributeError: predict_proba is not available when probability=False

Methods decorated with available_if and whose conditions raise an AttributeError should not appear in the list returned by dir(obj).

Actual Results

It should print out the following:

'predict_proba' in dir(model): True
Attempting to call 'predict_proba' raised an AttributeError: predict_proba is not available when probability=False

Methods decorated with available_if and whose conditions raise an AttributeError appears in the list returned by dir(obj).

Traceback when you show the error itself (comes from the check in available_if.

Traceback (most recent call last):
  File "/Users/kmcgrady/Projects/test-scripts/xg_example.py", line 48, in <module>
    prob_predictions = model.predict_proba(X[:2])
  File "/Users/kmcgrady/.local/share/virtualenvs/test-scripts-fLg6VU9n/lib/python3.10/site-packages/sklearn/utils/_available_if.py", line 31, in __get__
    if not self.check(obj):
  File "/Users/kmcgrady/.local/share/virtualenvs/test-scripts-fLg6VU9n/lib/python3.10/site-packages/sklearn/svm/_base.py", line 827, in _check_proba
    raise AttributeError(
AttributeError: predict_proba is not available when probability=False

Versions

System:
    python: 3.10.10 (main, May 13 2023, 16:09:51) [Clang 14.0.3 (clang-1403.0.22.14.1)]
executable: /Users/kmcgrady/.local/share/virtualenvs/test-scripts-fLg6VU9n/bin/python
   machine: macOS-14.3.1-arm64-arm-64bit

Python dependencies:
      sklearn: 1.3.2
          pip: 23.3.1
   setuptools: 68.2.2
        numpy: 1.26.4
        scipy: 1.11.4
       Cython: None
       pandas: 2.2.1
   matplotlib: 3.8.2
       joblib: 1.3.2
threadpoolctl: 3.2.0

Built with OpenMP: True

threadpoolctl info:
       user_api: openmp
   internal_api: openmp
    num_threads: 10
         prefix: libomp
       filepath: /Users/kmcgrady/.local/share/virtualenvs/test-scripts-fLg6VU9n/lib/python3.10/site-packages/sklearn/.dylibs/libomp.dylib
        version: None

       user_api: blas
   internal_api: openblas
    num_threads: 10
         prefix: libopenblas
       filepath: /Users/kmcgrady/.local/share/virtualenvs/test-scripts-fLg6VU9n/lib/python3.10/site-packages/numpy/.dylibs/libopenblas64_.0.dylib
        version: 0.3.23.dev
threading_layer: pthreads
   architecture: armv8

       user_api: blas
   internal_api: openblas
    num_threads: 10
         prefix: libopenblas
       filepath: /Users/kmcgrady/.local/share/virtualenvs/test-scripts-fLg6VU9n/lib/python3.10/site-packages/scipy/.dylibs/libopenblas.0.dylib
        version: 0.3.21.dev
threading_layer: pthreads
   architecture: armv8
@kmcgrady kmcgrady added Bug Needs Triage Issue requires triage labels Mar 1, 2024
@kmcgrady
Copy link
Author

kmcgrady commented Mar 1, 2024

A simple solution should be straightforward where we override the __dir__ method on all classes that have conditional attributes.

def __dir__(self):
    original_dir = super().__dir__()
    return [attr for attr in original_dir if hasattr(self, attr)]

We may want to find ways to be smarter if anyone just uses avaliable_if in the class somehow, which is a bit more difficult.

kmcgrady added a commit to streamlit/streamlit that referenced this issue Mar 1, 2024
## Describe your changes

`st.help` leverages `dir(object)` to determine the members in the object
to expose to the docstring. If an object makes a member conditional
based on internal state, `st.help` can throw an `AttributeError`.
Ideally, in these cases, the developer of the object would implement the
`__dir__` method to handle conditional members, but Streamlit can be
more resilient in handling this situation.

The example that brought this up was in sklearn, and I created
scikit-learn/scikit-learn#28558 with them to
better handle this scenario.

## Testing Plan

- Unit Tests should cover the change with the member available and
unavailable

---

**Contribution License Agreement**

By submitting this pull request you agree that all contributions to this
project are made under the Apache 2.0 license.
@MiguelParece
Copy link

im working on this.

@betatim
Copy link
Member

betatim commented Mar 8, 2024

I think it would be useful to fix this. Though I'd not be surprised if it got hairy real quick, doing magic like available_if tends to lead you down paths of "I never wanted to know this much about X".

Some cases that we should consider and make sure we don't make any of them worse (ideally we'd fix all of them but maybe we can't):

  • dir(est_instance)
  • help(est_instance)
  • dir(EstimatorClass) - should list all methods
  • tab-completion in something like IPython est_instance.<tab>
  • the generated HTML API docs should continue to list the method (I think)

The proposed __dir__ fixes all of them as far as I can tell, except help(est). This still lists the "hidden" method.

edit: though I am not sure if it is really a bug for help to keep listing the method. It internally uses pydoc which, I think, lists all methods defined on the class

@betatim betatim removed the Needs Triage Issue requires triage label Mar 8, 2024
MiguelParece added a commit to MiguelParece/scikit-learn that referenced this issue Apr 1, 2024
Inaccurate Attribute Listing with dir(obj) in classes with conditional methods, I overwrote the python method _dir_ in all the classes that contain conditional methods
MiguelParece added a commit to MiguelParece/scikit-learn that referenced this issue Apr 1, 2024
Inaccurate Attribute Listing with dir(obj) in classes with conditional methods, I overwrote the python method _dir_ in all the classes that contain conditional methods
MiguelParece added a commit to MiguelParece/scikit-learn that referenced this issue Apr 1, 2024
Inaccurate Attribute Listing with dir(obj) in classes with conditional methods, I overwrote the python method _dir_ in all the classes that contain conditional methods
MiguelParece added a commit to MiguelParece/scikit-learn that referenced this issue Apr 1, 2024
Inaccurate Attribute Listing with dir(obj) in classes with conditional methods, I overwrote the python method _dir_ in all the classes that contain conditional methods
MiguelParece added a commit to MiguelParece/scikit-learn that referenced this issue Apr 1, 2024
Inaccurate Attribute Listing with dir(obj) in classes with conditional methods, I overwrote the python method _dir_ in all the classes that contain conditional methods
MiguelParece added a commit to MiguelParece/scikit-learn that referenced this issue Apr 1, 2024
Inaccurate Attribute Listing with dir(obj) in classes with conditional methods, I overwrote the python method _dir_ in all the classes that contain conditional methods
MiguelParece added a commit to MiguelParece/scikit-learn that referenced this issue Apr 1, 2024
Inaccurate Attribute Listing with dir(obj) in classes with conditional methods, I overwrote the python method _dir_ in all the classes that contain conditional methods
MiguelParece added a commit to MiguelParece/scikit-learn that referenced this issue Apr 1, 2024
Inaccurate Attribute Listing with dir(obj) in classes with conditional methods, I overwrote the python method _dir_ in all the classes that contain conditional methods
MiguelParece added a commit to MiguelParece/scikit-learn that referenced this issue Apr 2, 2024
Inaccurate Attribute Listing with dir(obj) in classes with conditional methods, I overwrote the python method _dir_ in all the classes that contain conditional methods
@betatim betatim linked a pull request Apr 3, 2024 that will close this issue
zyxue pushed a commit to zyxue/streamlit that referenced this issue Apr 16, 2024
## Describe your changes

`st.help` leverages `dir(object)` to determine the members in the object
to expose to the docstring. If an object makes a member conditional
based on internal state, `st.help` can throw an `AttributeError`.
Ideally, in these cases, the developer of the object would implement the
`__dir__` method to handle conditional members, but Streamlit can be
more resilient in handling this situation.

The example that brought this up was in sklearn, and I created
scikit-learn/scikit-learn#28558 with them to
better handle this scenario.

## Testing Plan

- Unit Tests should cover the change with the member available and
unavailable

---

**Contribution License Agreement**

By submitting this pull request you agree that all contributions to this
project are made under the Apache 2.0 license.
MiguelParece added a commit to MiguelParece/scikit-learn that referenced this issue May 6, 2024
Inaccurate Attribute Listing with dir(obj) in classes with conditional methods, I overwrote the python method _dir_ in all the classes that contain conditional methods
MiguelParece added a commit to MiguelParece/scikit-learn that referenced this issue May 6, 2024
MiguelParece added a commit to MiguelParece/scikit-learn that referenced this issue May 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
3 participants