Predictable and safe object inspection #13833

jasongrout · 2022-11-17T20:32:47Z

In #13759, we discussed how Colab's inspection routine guards against calling an arbitrary object's __repr__() method, since in practice that can take a lot of time and/or memory, depending on how the object was implemented.

This PR experiments with including an updated/modified version of the Colab implementation (licensed Apache 2.0), and making it easy to configure the interactive shell's inspector implementation.

Use the new inspector with

ipython --InteractiveShell.inspector_class='IPython.core.oinspect_safe.SafeInspector'

CC @craigcitro

…ariant of the inspector Fixes ipython#13759 (or at least is the start of a fix for ipython#13759) This new file is from https://github.com/googlecolab/colabtools/blob/feb5dde3839251fd6fd8bf23e081d0a8b9a8187b/google/colab/_inspector.py

jasongrout · 2022-11-17T20:40:19Z

I see a difference in the %pinfo output between the two inspectors:

class BadRepr:
    def __repr__(self):
        import time
        time.sleep(10)
        return 'a very long string'*100

Default inspector:

In [2]: %pinfo BadRepr
Init signature: BadRepr()
Docstring:      <no docstring>
Type:           type
Subclasses:

SafeInspector:

In [2]: %pinfo BadRepr
Type:        type
String form: <class '__main__.BadRepr'>
Docstring:   <no docstring>

jasongrout · 2022-11-17T20:57:53Z

Colab special-cases tensorflow objects:

        if (
            isinstance(shape, tuple)
            or hasattr(shape, "__module__")
            and isinstance(shape.__module__, str)
            and "tensorflow." in shape.__module__
        ):
            return "{} with shape {}".format(type_name, shape)

Should the IPython implementation also special-case tensorflow? If not, can we make it easy for a subclass to add its own special cases at this point?
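
One way a subclass hook for such special cases could look is sketched below. This is only an illustration: `ShapeSummarizer`, `TensorflowShapeSummarizer`, `is_trusted_shape`, and `summarize` are hypothetical names, not part of IPython or Colab. The sketch also spells out the grouping of the quoted condition, which evaluates as `A or (B and C and D)`.

```python
class ShapeSummarizer:
    """Hypothetical base class: only plain tuple shapes are trusted."""

    def is_trusted_shape(self, shape):
        # Hook for subclasses: return True if formatting `shape` is considered safe.
        return isinstance(shape, tuple)

    def summarize(self, type_name, shape):
        if self.is_trusted_shape(shape):
            return "{} with shape {}".format(type_name, shape)
        return type_name


class TensorflowShapeSummarizer(ShapeSummarizer):
    """Hypothetical subclass that adds the tensorflow special case."""

    def is_trusted_shape(self, shape):
        module = getattr(shape, "__module__", None)
        return super().is_trusted_shape(shape) or (
            isinstance(module, str) and "tensorflow." in module
        )
```

With a hook like this, the base inspector would stay free of library-specific checks, and Colab (or anyone else) could opt in to more special cases by subclassing.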

…ream code it is modifying for easier maintenance

oinspect._render_signature is basically a copy of inspect.Signature.__str__ with a few tweaks. However, comparing the two functions is hard because the code was essentially rewritten for _render_signature. This commit makes _render_signature nearly identical to inspect.Signature.__str__ so the two can be compared easily in the future, with clear notes about what was changed. There should be no behavior difference before and after this change.

…tdef

With this change, the code in the two functions is largely identical, except that the signature's default and annotation values are replaced with versions that have safe repr functions. SafeInspector._getdef also adopts the line-breaking behavior of Inspector._getdef.
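
As a rough illustration of replacing default values with stand-ins that have a safe repr, here is a minimal sketch. It is not the code from this PR: `_SafeDefault` and `safe_signature_text` are hypothetical names, only defaults are handled (annotations would need similar treatment), and it assumes `inspect.signature` itself succeeds on the callable.

```python
import inspect

# Builtin types whose repr is assumed to be harmless here.
_PLAIN_TYPES = (type(None), bool, int, float, str, bytes)


class _SafeDefault:
    """Stand-in whose repr never runs the wrapped object's own __repr__."""

    def __init__(self, value):
        # object.__repr__ only reports the type name and the object's id.
        self._text = object.__repr__(value)

    def __repr__(self):
        return self._text


def _wrap(param):
    """Return `param` with a risky default swapped for a _SafeDefault."""
    default = param.default
    if default is inspect.Parameter.empty:
        return param
    if any(type(default) is t for t in _PLAIN_TYPES):
        return param
    return param.replace(default=_SafeDefault(default))


def safe_signature_text(func):
    """Render a signature without calling repr() on arbitrary defaults."""
    sig = inspect.signature(func)
    params = [_wrap(p) for p in sig.parameters.values()]
    return str(sig.replace(parameters=params))
```

Plain defaults such as `1` or `None` render as usual, while any other object shows up as `<module.Class object at 0x...>`.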

…ting getstr and getlen

By now, the oinspect info() implementation and the safe inspect info() implementation are basically aligned, with the upstream implementation calling out to getstr and getlen when it is formatting an object for display.

jasongrout · 2022-11-19T20:58:42Z

I tried to unify the safe inspect info() and oinspect info() implementations, and they now mostly align. There have been many updates to oinspect info(), which I tried to reason about in the context of safe inspection. There are still a few TODO notes on items I'm not sure about, but by and large this brings safe inspection in line with what oinspect info() currently produces.

It seems there are only a few key places where we needed to change the upstream implementation, so I put in stubs for functions the safe implementation can override, then deleted the safe_inspect info() code. This greatly simplifies things.

I spent a lot of time reading the Python inspect module code to convince myself that it was safe to rely on standard library code for more things. I tried to clearly indicate the reasons behind many of the changes in the commits in this PR.
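
The override-stub idea can be pictured with a toy sketch like the one below. The class names, the `info()` fields, and the exact behavior of `getstr` and `getlen` here are assumptions for illustration, not the actual IPython code.

```python
class SketchInspector:
    """Toy stand-in for an inspector; not IPython's actual class."""

    def getstr(self, obj):
        # Default behavior: trust the object's own string conversion.
        return str(obj)

    def getlen(self, obj):
        # Default behavior: trust the object's own __len__, if it has one.
        try:
            return len(obj)
        except TypeError:
            return None

    def info(self, obj):
        # The shared implementation only talks to the object through the hooks.
        return {
            "type_name": type(obj).__name__,
            "string_form": self.getstr(obj),
            "length": self.getlen(obj),
        }


class SafeSketchInspector(SketchInspector):
    """Overrides the hooks so no user-defined __str__ or __len__ runs."""

    _KNOWN_SIZED = (str, bytes, list, tuple, dict, set, frozenset)

    def getstr(self, obj):
        return object.__repr__(obj)

    def getlen(self, obj):
        if any(type(obj) is t for t in self._KNOWN_SIZED):
            return len(obj)
        return None
```

The point of the design is that `info()` is written once, and the safe variant changes only the two small hooks.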

jasongrout · 2022-11-21T22:50:13Z

I tried running all of our existing oinspect tests with the SafeInspector inspector, along with one other test:

class BadRepr:
    """
    A honeypot repr that should not be called
    """
    def __repr__(self):
        import time
        breakpoint()
        time.sleep(10)
        raise ValueError("DON'T CALL ME")
        return 'A VERY BAD REPR'

def test_info_badrepr():
    inspector.info(BadRepr())

and we have a problem: the Python inspect module calls the repr of an object in many different places for error messages.
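
To make the problem concrete, here is a small standalone sketch (not from the PR) that counts repr calls instead of sleeping. It shows two places where the standard library reaches an object's repr, as far as I know in current CPython: the error message for a non-callable object, and rendering a signature whose default value is the object. `CountingRepr` is a made-up helper.

```python
import inspect


class CountingRepr:
    """Records how often its __repr__ is invoked."""

    calls = 0

    def __repr__(self):
        type(self).calls += 1
        return "<CountingRepr>"


# 1. Asking for the signature of a non-callable object raises a TypeError
#    whose message is built from the object's repr.
try:
    inspect.signature(CountingRepr())
except TypeError:
    pass
calls_after_error = CountingRepr.calls


# 2. Rendering a signature as text calls repr() on each default value.
def func(x=CountingRepr()):
    return x


rendered = str(inspect.signature(func))
calls_after_render = CountingRepr.calls
```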

The original Colab code sort of gets around this by trying to do the following before calling inspect's signature method:

1. Get the source lines for the definition
2. Parse the source lines for the definition
3. Set the function body to nothing
4. Unparse the function again, which basically just yields the signature
5. If this doesn't work, call inspect.signature and hope for the best
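
The steps above can be sketched roughly as follows. This is my own approximation of the idea, not Colab's code: `signature_from_source` and `best_effort_signature` are hypothetical names, and `ast.unparse` requires Python 3.9+.

```python
import ast
import inspect
import textwrap


def signature_from_source(source):
    """Return the 'def name(...):' line of the first function in source, or None."""
    try:
        tree = ast.parse(textwrap.dedent(source))
    except SyntaxError:
        return None
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            node.body = [ast.Pass()]  # drop the function body
            node.decorator_list = []  # drop decorators
            # Unparsing now yields the def line followed by "pass".
            return ast.unparse(node).splitlines()[0]
    return None


def best_effort_signature(obj):
    """Try the source-based route first, then fall back to inspect.signature."""
    try:
        text = signature_from_source(inspect.getsource(obj))
    except (OSError, TypeError):
        text = None  # no source available (builtins, some interactive code)
    if text is not None:
        return text
    # Fallback: this may call repr() on defaults or in error messages.
    return str(inspect.signature(obj))
```

Because the source-based route only reads text, it never touches the default values themselves. The fallback gives no such guarantee.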

However, inspect.signature is significantly more advanced. For example, it looks for a __signature__ attribute and handles partial methods, classes with __new__ or __init__, an inheritance hierarchy, etc. To get equivalent functionality without the risk of calling an object's repr, I instead copied the parts of inspect.py that do the signature analysis and changed every place that calls an object's repr to call safe_repr instead. This does mean that we need to maintain a fork of core Python code, but the changes are minimal and likely easy to update.
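
For readers unfamiliar with the idea, a safe repr helper might look something like the sketch below. This is an assumption-laden illustration, not the safe_repr in this PR: it trusts a handful of builtin types, and for everything else it uses `object.__repr__` or `type.__repr__` so that no user-defined `__repr__` runs.

```python
_SCALAR_TYPES = (type(None), bool, int, float, complex)


def safe_repr(obj, max_len=80):
    """Describe obj without running a user-defined __repr__ (illustrative only)."""
    kind = type(obj)
    if kind is str or kind is bytes:
        # Slice first so a huge string is never rendered in full.
        text = repr(obj[:max_len])
    elif any(kind is t for t in _SCALAR_TYPES):
        try:
            text = repr(obj)
        except ValueError:
            # For example, very large ints in Pythons with a digit limit.
            text = object.__repr__(obj)
    elif issubclass(kind, type):
        # Classes: bypass any metaclass __repr__.
        text = type.__repr__(obj)
    else:
        # Everything else: "<module.Class object at 0x...>".
        text = object.__repr__(obj)
    if len(text) > max_len:
        text = text[: max_len - 3] + "..."
    return text


class BadRepr:
    """A honeypot repr that should not be called."""

    def __repr__(self):
        raise ValueError("DON'T CALL ME")
```

Calling `safe_repr(BadRepr())` returns the generic object description rather than raising, because the honeypot method is never invoked.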

So which approach should we use?

1. Make a copy of parts of inspect.py and modify it to not call repr() or str() on an object, its parameters, or annotations?

   - Risk: the copy of inspect.py needs to work in all supported Python versions (I copied from Python 3.11)
   - Risk: the copy of inspect.py needs to be updated periodically. Fortunately, the changes are quite minimal and easy to understand
   - Advantage: we get all the functionality of inspect.signature without any of the problems of calling an object's repr()

2. Take Colab's approach of making a best-effort guess at the callable's signature by parsing the source if we can find it, and if that doesn't work, just go ahead and call inspect.signature anyway and hope for the best?

   - Risk: we could still call an object's repr in many cases
   - Advantage: a lot less code to maintain and update

To experiment, I made the copy of parts of inspect.py and updated it to not call repr() or str() on the object in question or related things. It wasn't too bad, but it will be a bit of a pain to maintain going forward.

jasongrout · 2022-12-11T06:09:36Z

Update: after thinking about this for a bit, I think it's untenable to carry a copy of the Python inspect.py file. I'm looking into how to back that part out and make the safe inspect lighter weight.

Another option is to go upstream to Python and ask whether inspect could avoid printing the repr of the object in error messages. Then we could mostly just use upstream inspect.

jasongrout added 7 commits November 17, 2022 11:37

Format with darker to conform to ipython conventions

2606566

Modify class names to be more generic and add more context and proven…

c6d9c20

…ance info to the module

Use ast.unparse (Python 3.9+) instead of astor to generate Python cod…

254e458

…e from objects

Remove the obsolete formatter argument to info() (removed in IPython …

6444cc7

…8.x)

Populate the subclasses attribute, required in how current IPython us…

16dd44b

…es the info for pinfo, etc.

Make inspector class a configurable attribute in InteractiveShell.

a474947

jasongrout mentioned this pull request Nov 17, 2022

Make inspect calls more predictable and safe #13759

Open

jasongrout added 19 commits November 18, 2022 11:48

Simplify oinspect.getdoc, should be no change in behavior.

1890711

Better oinspect docstring

865f99a

Simplify getting defaults in oinspect info()

35d573f

Reorganize first part of oinspect info()

dff3b93

This also introduces new getstr and getlen methods that can be overridden to have customized string and len methods. For example, for a safe inspector, these may avoid calling the object methods in case it is not safe.

Add source_start_line and source_end_line fields to oinspect info inf…

2ff4b22

…ormation This was added in Colab's implementation, and could be used to highlight the lines of the definition of an object, for example.

Add the isclass field as True/False always (instead of True/None)

5ec46af

Simplify oinspect info: _getdef already catches any possible errors, …

2777633

…so no need to catch them again

Simplify oinspect info class docstring/definition code.

19e0540

There should be no change in functionality from this commit

Simplify oinspect info: rename variables to make them more explicit

4aa8a55

There should be no change in functionality from this commit

Simplify oinspect info getting class docstrings

b20fa3c

Use inspect.unwrap to implement oinspect _get_wrapped

9d28937

Simplify safeinspect

997ec36

- simplify many of the methods - align implementation of info() more closely with upstream oinspect info()

Update oinspect unwrap method to never raise an error

a713132

Fix indentation error

c9e83ca

Run darker on oinspect and oinspect_safe

ecd7f78

import undoc in safe oinspect

8426ecc

jasongrout added 3 commits November 19, 2022 14:00

fix find_source_lines typo

0cf2123

Sometimes find_source_lines may throw an exception, so guard against it

173b1d7

This is tested in test_oinspect.py:test_info_awkward, for example.

Fix typo in defaults

42a2913

jasongrout force-pushed the safeinspect branch from 0bad685 to 42a2913 November 20, 2022 03:40

jasongrout added 3 commits November 21, 2022 15:26

Copy parts of CPython's 3.11 inspect.py in order to modify the signat…

edf572c

…ure function to not call an object's repr method.

Modify our copy of parts of CPython's inspect.py from Python 3.11 to …

5bee696

…not call repr() or str() where possible

Update oinspect_safe to use the oinspect_safe_signature module methods

75ea7d6

jasongrout added 6 commits November 22, 2022 09:47

Parametrize inspector tests to run with both inspectors

67db7d9

Delete obsolete comment

f23afae

Refactor safe_repr and start using it in the safe_signature file

e2366c0

Fix the test to be passing as expected

Lint

a68509b

Move black config to pyproject.toml and update to exclude oinspect_sa…

a4106bd

…fe_signature.py We don't want to reformat oinspect_safe_signature.py so that it is easy to diff against upstream inspect.py

Fix test setup and doc error

09d1ad2

jasongrout mentioned this pull request Dec 9, 2022

Make inspector class a configurable attribute in InteractiveShell #13865

Merged

jasongrout marked this pull request as draft December 11, 2022 06:07
