
Replace greedy completer with guarded evaluation, improve dict completions #13852

Merged: 16 commits into ipython:main on Dec 21, 2022

Conversation

@krassowski (Member) commented Dec 3, 2022

Addresses #13735 (comment) and #13836.

Guarded evaluation

A new module with a guarded_eval() function is added, offering guarded evaluation of Python code. It uses the built-in ast parser (no dependencies), similar in spirit to existing solutions:

  • ast.literal_eval - too restricted, cannot handle most expressions that we need
  • asteval - is close but aimed at a different use case

The purpose of guarded_eval() is to guard against unwanted side effects only; no further guarantees are made.

guarded_eval does not evaluate actions which always have side effects:

  • class definitions (class sth: ...)
  • function definitions (def sth(): ...)
  • variable assignments (x = 1)
  • augmented assignments (x += 1)
  • deletions (del x)

It also does not evaluate operations which do not return values:

  • assertions (assert x)
  • pass (pass)
  • imports (import x)
  • control flow
    • conditionals (if x:) except for ternary IfExp (a if x else b)
    • loops (for and while)
    • exception handling

Since other parts of the codebase call bare eval, I would like to follow up by replacing these calls with the safer guarded_eval across the board.
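
For intuition only, here is a minimal sketch of the general idea behind such guards: parse the code with ast, refuse anything that is not a plain expression or that contains a call, and only then evaluate. The names below are illustrative and are not the IPython API; the actual implementation in IPython.core.guarded_eval additionally enforces per-policy rules for attribute and item access.

import ast


def toy_guarded_eval(code: str, namespace: dict):
    """Illustrative only: evaluate code unless it could have side effects."""
    # mode="eval" already rejects statements: assignments, del, imports,
    # class/function definitions, loops, try/except, assert, pass, ...
    tree = ast.parse(code, mode="eval")
    for node in ast.walk(tree):
        # refuse calls: an arbitrary call can have arbitrary side effects
        if isinstance(node, ast.Call):
            raise ValueError("refusing to evaluate a function call")
    return eval(compile(tree, "<guarded>", "eval"), {"__builtins__": {}}, namespace)


d = {'a': {'b': None}}
print(toy_guarded_eval("d['a']", {'d': d}))   # {'b': None}
toy_guarded_eval("del d", {'d': d})           # raises SyntaxError (not an expression)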

Better dict key completions

Enhancements to dictionary key completions:

  • completions of nested dictionaries will work by default (a quick programmatic check is sketched after this list)
    • the restriction, previously implemented with a regular expression followed by an unsafe eval(), was lifted
    • the use of eval() was replaced with the safe guarded_eval()
    • by default this only works for the built-in __getattr__/__getitem__; if a user-defined __getattr__ or __getitem__ is encountered on the evaluation path, the evaluation is aborted to avoid triggering side effects
    • the evaluation policy can be adjusted via configuration (e.g. forbidding evaluation or relaxing the rules)
    d = {'a': {'b': None}}
    d['a'][<tab>   # will suggest 'b'
    l = [d]
    l[0][<tab>     # will suggest 'a'
  • add support for pandas columns when using .loc indexer
    df.loc[:, <tab>    # will suggest column names
  • add support for numbers
     d = {999: 'emergency'}
     d[9<tab>    # will suggest `999`
  • support auto-closing of strings and brackets outside of the greedy completer
  • add support for proper auto-closing of tuple key items
    d = {('aa', 'bb'): None}
    d[<tab>   # will complete to `d['aa', `
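
As a rough sanity check of the nested-dictionary example above, completions can be triggered programmatically from inside an IPython session (a sketch; exact match formatting may differ):

from IPython import get_ipython

ip = get_ipython()
ip.user_ns['d'] = {'a': {'b': None}}
# complete() returns (text, matches); with the default 'limited' policy
# the nested access d['a'][ should now offer the inner key
text, matches = ip.complete("d['a'][")
print(matches)  # expected to include a completion for 'b'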

Changes in user config options

Completer.evaluation replaces Completer.greedy

Successive options enable progressively more eager evaluation, giving more accurate completion suggestions,
including for nested dictionaries, nested lists, or even results of function calls. Setting unsafe
or higher can lead to evaluation of arbitrary user code on TAB with potentially dangerous side effects.
Allowed values are:

  • forbidden: no evaluation at all
  • minimal: evaluation of literals and access to built-in namespaces; no item/attribute evaluation nor access to locals/globals
  • limited (default): access to all namespaces, evaluation of hard-coded methods (keys(), __getattr__, __getitem__, etc.) on allow-listed objects (e.g. dict, list, tuple, pandas.Series)
  • unsafe: evaluation of all methods and function calls, but not of syntax with side effects like del x
  • dangerous: completely arbitrary evaluation
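
For example, the policy can be set in a configuration file or changed at runtime (a sketch; the runtime assignment mirrors how the benchmark script below sets Completer.greedy):

# in ipython_config.py (where c = get_config() is available):
c.Completer.evaluation = 'minimal'    # one of: forbidden, minimal, limited, unsafe, dangerous

# or at runtime, on the completer of the current shell:
from IPython import get_ipython
get_ipython().Completer.evaluation = 'limited'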

Completer.auto_close_dict_keys is added

Previously, auto-closing of strings and brackets was only ever active in the greedy completer. The logic has now been expanded to handle tuple keys correctly; this setting allows enabling auto-closing regardless of the evaluation policy.
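
A quick runtime toggle, analogous to the evaluation setting above (sketch):

from IPython import get_ipython
# auto-close quotes and brackets for dict keys, independently of the evaluation policy
get_ipython().Completer.auto_close_dict_keys = True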

Completer.greedy is deprecated

When enabled, Completer.greedy activates the following settings for compatibility:

  • evaluation = 'unsafe'
  • auto_close_dict_keys = True

While all old tests for the greedy completer pass with these settings, the old greedy completer was essentially running in dangerous mode. I am not sure whether we should be adding that mode, but I wanted to keep it so users can switch to evaluation = 'dangerous', at least in the next version, in case we need to fix something in guarded_eval.
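
Based on this compatibility mapping, a configuration that used the deprecated flag can be migrated roughly like this (a sketch for ipython_config.py):

# before (deprecated):
# c.Completer.greedy = True

# after:
c.Completer.evaluation = 'unsafe'        # or 'dangerous' to match the old behaviour exactly
c.Completer.auto_close_dict_keys = True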

Remaining tasks

  • polish documentation
  • add evaluation guards against other dunder methods
  • add more tests
  • benchmark performance to ensure that this does not introduce performance regressions

NTH (nice to have):

  • maybe port the attribute matcher to remove greedy completer remnants (up for discussion, so as to avoid breaking downstream)
  • port dict key matcher to v2 and return "column" type for pandas rather than "dict key"; what is the right type name for numpy?

@krassowski (Member, Author) commented:

Averaging across all scenarios, this PR has a small overall performance impact of 6.41% (the grand average goes from 11.29 ms to 12.01 ms; before the refactor the average would have been 11.72 ms). I would argue that this is acceptable given the added features.

Breakdown by parameter (each scenario has 5 runs × the number of other parameter combinations; values are mean [1st - 3rd quartile]):

param value main this PR
greedy 0 56.36 [5.74 - 101.93] 59.40 [5.66 - 104.03]
greedy 1 56.55 [5.79 - 103.30] 60.75 [5.69 - 103.33]
n_tokens 5 51.42 [4.11 - 97.68] 56.89 [5.39 - 102.51]
n_tokens 10 51.75 [5.11 - 102.20] 53.63 [5.45 - 103.61]
n_tokens 100 56.89 [5.58 - 111.42] 58.68 [5.65 - 120.21]
n_tokens 1000 65.78 [9.15 - 127.22] 71.12 [17.60 - 120.69]
nested_n 0 59.06 [5.76 - 100.31] 61.38 [5.69 - 104.28]
nested_n 1 55.29 [5.77 - 102.40] 60.07 [5.66 - 103.40]
nested_n 2 54.68 [5.76 - 98.67] 59.23 [5.65 - 103.10]
nested_n 3 55.72 [5.74 - 101.00] 59.71 [5.73 - 104.19]
nested_n 4 56.17 [5.79 - 103.14] 59.26 [5.65 - 101.84]
nested_n 5 57.83 [5.73 - 103.63] 60.81 [5.72 - 105.13]
query (empty) 91.10 [86.45 - 94.05] 94.56 [87.68 - 96.54]
query [ 123.56 [103.76 - 138.67] 130.70 [106.44 - 154.95]
query ["1 5.09 [2.88 - 6.35] 8.99 [3.23 - 10.98]
query ["1111111"] 6.08 [5.76 - 6.31] 6.07 [5.65 - 6.08]
token_len 5 57.48 [5.77 - 102.59] 60.89 [5.66 - 103.15]
token_len 10 56.97 [5.79 - 102.67] 59.53 [5.69 - 103.82]
token_len 25 54.92 [5.73 - 101.39] 59.82 [5.68 - 104.36]

The case with very long tokens (keys) in the dictionary appears worth further investigation (though I do not think it should hold up this PR, as it is rather uncommon to have such long dictionary keys).

import timeit
import subprocess

import pandas as pd
from tqdm.auto import tqdm

import IPython
from IPython import get_ipython


def time_dict_completions(
    token_len=10,
    access_key_len=10,
    n_tokens=100,
    dict_name_len=10,
    query='[',
    nested_n=0,
    greedy=False
):
    ip = get_ipython()
    dict_name = "x" * dict_name_len

    leaf_dict = {
        f'{i}'.ljust(token_len, 'x'): None
        for i in range(n_tokens)
    }
    top_dict = {}
    current_dict = top_dict

    access_key = 'b' * access_key_len  # b as in branch
    for i in range(nested_n):
        current_dict[access_key] = {}
        # descend so that the requested level of nesting is actually created
        current_dict = current_dict[access_key]

    current_dict.update(leaf_dict)

    ip.user_ns[dict_name] = top_dict
    ip.Completer.greedy = greedy
    line = f'{dict_name}{query}'

    return timeit.timeit('ip.complete(line)', globals=locals(), number=5)


def test_grid():
    return [
        dict(
            n_tokens=n_tokens,
            token_len=token_len,
            nested_n=nested_n,
            query=query,
            greedy=greedy
        )
        for greedy in [True, False]
        for n_tokens in [5, 10, 100, 1000]
        for token_len in [5, 10, 25]
        for nested_n in [0, 1, 2, 3, 4, 5]
        for query in ['[', '', '["1', '["1111111"]']
    ]

if __name__ == '__main__':

    result = pd.DataFrame([
        {
            'time': time_dict_completions(**opts),
            **opts
        }
        for opts in tqdm(test_grid())
    ])
    commit = (
        subprocess
        .check_output(['git', 'rev-parse', '--short', 'HEAD'])
        .decode('ascii')
        .strip()
    )

    result.to_csv(f'dict_completions_times_{IPython.__version__}_{commit}.csv')
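
Note: the script relies on get_ipython(), so it has to be run inside an IPython session (e.g. via %run), once per branch being compared; presumably the per-commit CSV files written at the end were then combined to produce the table above.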

@krassowski krassowski marked this pull request as ready for review December 4, 2022 18:19
@krassowski (Member, Author) commented:

The coverage failure is unrelated; this PR should increase coverage, but for some time now the coverage reports have been incorrect (#13853), for which I sent a fix in #13854.

@Carreau (Member) commented Dec 21, 2022

Thanks, apologies for the delay in reviewing; let's try it.

@Carreau Carreau merged commit 67d20af into ipython:main Dec 21, 2022
@Carreau Carreau added this to the 8.8 milestone Jan 3, 2023
@aldanor commented Mar 14, 2023

@krassowski I wonder how this is supposed to be used (e.g. by library developers)?

While it's nice that numpy / pandas are hardcoded in, is it possible to modify the selective policy at runtime from within the kernel? (e.g. to allow eager getattr/getitem on some foo.bar.Baz type?)

@krassowski (Member, Author) commented:

@aldanor this is a good question. One could swap policies in EVALUATION_POLICIES, but this would not be convenient for library authors. I think it would be fine to expose a function allowing users to add custom allowed properties. Would you mind opening a new issue with a reference to this PR and your thoughts on how you would see it?

@aldanor commented Mar 14, 2023

I kind of got to the point of something like this:

from IPython.core import guarded_eval
types = [('a', 'b', 'X'), ('a', 'b', 'Y'), ...]
guarded_eval.EVALUATION_POLICIES['limited'].allowed_getattr_external.update(types)
guarded_eval.EVALUATION_POLICIES['limited'].allowed_getitem_external.update(types)
guarded_eval.SUPPORTED_EXTERNAL_GETITEM.update(types)

but this feels extremely fiddly and fragile, and I'm not sure it fully works 🙂


Also, I can't seem to make this work:

class X:
    @property
    def y(self) -> "Y": ...  # Y is defined below, hence the string annotation

class Y:
    def __getitem__(self, index): ...
    def _ipython_key_completions_(self): ...

Something like this triggers getitem completion just fine

>>> y = x.y
>>> y['

whereas this doesn't

>>> x.y['

so I started digging to see whether it is possible by modifying the guarded-eval policies and white-listing all of the above, but that doesn't seem to solve it either 🤔 (not sure it's directly related, but perhaps it is, so I figured I'd mention the use case)


> Would you mind opening a new issue with reference to this PR and your thoughts on how you would see it?

I guess I could, once I figure out how to solve my use case so I have an idea of how it works (it could be something like a guarded_eval.whitelist(user_type, **kwargs)).

@aldanor commented Mar 14, 2023

Update: the above seems to have been resolved by adding the type X directly to the limited policy's allowed_getattr set. Not sure why it doesn't work with allowed_getattr_external, though.

Would that be the preferred course of action then, patching allowed_getattr at runtime for the limited policy?
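
For reference, the runtime patch described in this thread would look roughly like the following. This is a sketch based only on the names mentioned above; allowed_getitem is an assumption by analogy with allowed_getitem_external, and none of this is a supported public API.

from IPython.core import guarded_eval

policy = guarded_eval.EVALUATION_POLICIES['limited']
# allow attribute access on instances of X (what reportedly worked above)
policy.allowed_getattr.add(X)
# assumed counterpart for item access / key completions on instances of Y
policy.allowed_getitem.add(Y)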
