Replace greedy completer with guarded evaluation, improve dict completions #13852
- completion of integer keys
- completion in pandas for loc indexer `.loc[:, <tab>`
improve number handling and static typing
Averaged across all scenarios, this PR has a small overall performance impact of 6.41% (the grand average rises from 11.29 ms to 12.01 ms; before refactor the average would have been 11.72 ms). I would argue that this is acceptable given the added features. Breakdown by parameter (each scenario has 5 runs x # other parameter combinations; mean [1st - 3rd quartile]):
The case with very long tokens (keys) in the dictionary appears worth further investigation (though I think that it should not hold up this PR, as such long dictionary keys are rather uncommon).

    import timeit
    import subprocess

    import pandas as pd
    from tqdm.auto import tqdm
    import IPython
    from IPython import get_ipython


    def time_dict_completions(
        token_len=10,
        access_key_len=10,
        n_tokens=100,
        dict_name_len=10,
        query='[',
        nested_n=0,
        greedy=False
    ):
        # Must run inside an IPython session; get_ipython() returns None otherwise.
        ip = get_ipython()
        dict_name = "x" * dict_name_len
        # Keys are the numbers 0..n_tokens-1, right-padded with 'x' to token_len.
        leaf_dict = {
            f'{i}'.ljust(token_len, 'x'): None
            for i in range(n_tokens)
        }
        top_dict = {}
        current_dict = top_dict
        access_key = 'b' * access_key_len  # b as in branch
        # current_dict is never rebound, so each pass resets the same key on
        # top_dict and the leaf keys below are added to top_dict itself.
        for i in range(nested_n):
            current_dict[access_key] = {}
        current_dict.update(leaf_dict)
        ip.user_ns[dict_name] = top_dict
        ip.Completer.greedy = greedy
        line = f'{dict_name}{query}'
        # Total time for 5 calls; locals() exposes `ip` and `line` to the statement.
        return timeit.timeit('ip.complete(line)', globals=locals(), number=5)


    def test_grid():
        # Full cartesian product of the benchmark parameters.
        return [
            dict(
                n_tokens=n_tokens,
                token_len=token_len,
                nested_n=nested_n,
                query=query,
                greedy=greedy
            )
            for greedy in [True, False]
            for n_tokens in [5, 10, 100, 1000]
            for token_len in [5, 10, 25]
            for nested_n in [0, 1, 2, 3, 4, 5]
            for query in ['[', '', '["1', '["1111111"]']
        ]


    if __name__ == '__main__':
        result = pd.DataFrame([
            {
                'time': time_dict_completions(**opts),
                **opts
            }
            for opts in tqdm(test_grid())
        ])
        # Tag the output file with the IPython version and the current commit.
        commit = (
            subprocess
            .check_output(['git', 'rev-parse', '--short', 'HEAD'])
            .decode('ascii')
            .strip()
        )
        result.to_csv(f'dict_completions_times_{IPython.__version__}_{commit}.csv')
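The relative slowdown and the "mean [1st - 3rd quartile]" breakdown mentioned above can be recomputed from recorded timings. The following is a minimal sketch using only the standard library; the helper names `relative_change` and `summarize` and the sample timings are illustrative assumptions, not part of the PR.

```python
import statistics


def relative_change(before_ms, after_ms):
    """Return the change from before_ms to after_ms as a percentage of before_ms."""
    return (after_ms - before_ms) / before_ms * 100


def summarize(times_ms):
    """Return (mean, first quartile, third quartile) of a list of timings."""
    # method="inclusive" interpolates linearly between neighbouring
    # observations, treating the smallest and largest value as the extremes.
    q1, _median, q3 = statistics.quantiles(times_ms, n=4, method="inclusive")
    return statistics.mean(times_ms), q1, q3


# Hypothetical timings (ms) for one scenario, for illustration only.
sample = [10.0, 11.0, 12.0, 13.0, 14.0]
mean, q1, q3 = summarize(sample)
print(f"{mean:.2f} ms [{q1:.2f} - {q3:.2f}]")

# Rounded grand averages quoted in the comment above.
print(f"{relative_change(11.29, 12.01):.1f}%")
```

With the rounded averages this gives roughly 6.4%, in line with the 6.41% reported; the small difference is presumably due to rounding of the quoted averages.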
Thanks, apologies for the delay in reviewing, let's try it.
@krassowski Wonder how this is supposed to be used (e.g. by library developers)? While it's nice that numpy / pandas are hardcoded in, is it possible to modify the selective policy at runtime from within the kernel? (e.g. to allow eager getattr/getitem on some
@aldanor this is a good question. One could swap policies in
I kind of got the point of something like this

    from IPython.core import guarded_eval
    types = [('a', 'b', 'X'), ('a', 'b', 'Y'), ...]
    guarded_eval.EVALUATION_POLICIES['limited'].allowed_getattr_external.update(types)
    guarded_eval.EVALUATION_POLICIES['limited'].allowed_getitem_external.update(types)
    guarded_eval.SUPPORTED_EXTERNAL_GETITEM.update(types)

but this feels extremely fiddly and fragile and I'm not sure it fully works 🙂 Also, I can't seem to make this work:

    class X:
        @property
        def y(self) -> "Y": ...
    class Y:
        def __getitem__(self, index): ...
        def _ipython_key_completions_(self): ...

Something like this triggers getitem completion just fine

    >>> y = x.y
    >>> y['

whereas this doesn't

    >>> x.y['

and so I started digging to see if it's possible by modifying guarded eval policies and allow-listing all of the above, and it doesn't seem to solve it either 🤔 (not sure it's directly related but perhaps it is, so I figured I'd mention the use case)
I guess I could, once I figure how to solve my use case so I have an idea of how it works (could be something like a
Update: the above seems to have been resolved by adding the type Would that be a preferred course of action then, patching
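For reference, the use case discussed in the comments above can be reduced to the following self-contained sketch. `_ipython_key_completions_` is the hook IPython consults for key completions; the class bodies and key names here are made up for illustration. Whether `x.y['` completes inside IPython depends on the evaluation policy in effect, as discussed in this thread, so the sketch only exercises the hook directly.

```python
class Y:
    """Container exposing its keys to IPython's key completion."""

    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, index):
        return self._data[index]

    def _ipython_key_completions_(self):
        # IPython calls this hook to obtain candidate keys for `obj[<TAB>`.
        return list(self._data)


class X:
    @property
    def y(self) -> Y:
        # A property runs user code on attribute access, which is what a
        # guarded evaluator has to decide whether to allow.
        return Y({"alpha": 1, "beta": 2})


x = X()
y = x.y
print(y._ipython_key_completions_())
```

Completing `y['` only needs a name lookup, while completing `x.y['` first requires evaluating the property `X.y`, which is why the two cases can behave differently under a restrictive policy.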
Addresses #13735 (comment) and #13836.
Guarded evaluation
A new module with a `guarded_eval()` function is added, offering guarded evaluation of Python code. It uses the built-in `ast` parser (no dependencies), similarly to existing solutions:
- `ast.literal_eval` - too restricted, cannot handle most expressions that we need
- `asteval` - is close but aimed at a different use case

The purpose of `guarded_eval()` is to guard against unwanted side-effects only; no further guarantees are made.

`guarded_eval` does not evaluate actions which always have side effects:
- class definitions (`class sth: ...`)
- function definitions (`def sth: ...`)
- variable assignments (`x = 1`)
- augmented assignments (`x += 1`)
- deletions (`del x`)

It also does not evaluate operations which do not return values:
- assertions (`assert x`)
- pass (`pass`)
- imports (`import x`)
- conditionals (`if x:`) except for ternary IfExp (`a if x else b`)
- loops (`for` and `while`)

Since other parts of the codebase call bare `eval`, I would like to follow up by replacing these calls with the safer `guarded_eval` across the board.

Better dict key completions
Enhancements to dictionary key completions:
- [...] `eval()` was lifted
- `eval()` was replaced with safe `guarded_eval()`
- [...] `__getattr__`; if user-defined `__getattr__` or `__getitem__` gets encountered on the evaluation path, the evaluation will be aborted to avoid triggering side effects.
- completion in pandas for the `loc` indexer

Changes in user config options
`Completer.evaluation` replaces `Completer.greedy`
Successive options allow enabling more eager evaluation for more accurate completion suggestions, including for nested dictionaries, nested lists, or even results of function calls. Setting `unsafe` or higher can lead to evaluation of arbitrary user code on TAB with potentially dangerous side effects.
Allowed values are:
- `forbidden`: no evaluation at all
- `minimal`: evaluation of literals and access to built-in namespaces; no item/attribute evaluation nor access to locals/globals
- `limited` (default): access to all namespaces, evaluation of hard-coded methods (`keys()`, `__getattr__`, `__getitem__`, etc.) on allow-listed objects (e.g. `dict`, `list`, `tuple`, `pandas.Series`)
- `unsafe`: evaluation of all methods and function calls but not of syntax with side-effects like `del x`
- `dangerous`: completely arbitrary evaluation

`Completer.auto_close_dict_keys` is added
Previously, closing of strings and brackets was only ever active in the greedy completer. Now the logic has been expanded to handle keys of tuples correctly; this setting allows enabling auto-closing regardless of the evaluation policy.
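The evaluation levels listed above can be pictured with a toy evaluator built on the standard `ast` module. This is not IPython's `guarded_eval` and makes no claim about its internals: the names `toy_guarded_eval`, `GuardRejected` and `SAFE_GETITEM_TYPES` are hypothetical, only three of the five levels are modelled, `minimal` is simplified to constants only, and Python 3.9+ is assumed (for the shape of `ast.Subscript.slice`).

```python
import ast

# Types whose __getitem__ is treated as side-effect free in this sketch.
SAFE_GETITEM_TYPES = (dict, list, tuple, str)


class GuardRejected(Exception):
    """Raised when the toy evaluator refuses to evaluate an expression."""


def toy_guarded_eval(code, namespace, policy="limited"):
    """Evaluate a small subset of expressions under a named policy."""
    if policy == "forbidden":
        raise GuardRejected("evaluation is disabled")
    # mode="eval" only accepts expressions, so statements such as
    # `del x` or `x = 1` fail to parse and are rejected.
    try:
        tree = ast.parse(code, mode="eval")
    except SyntaxError as error:
        raise GuardRejected("not an expression") from error
    return _eval_node(tree.body, namespace, policy)


def _eval_node(node, namespace, policy):
    if isinstance(node, ast.Constant):
        return node.value
    if policy == "minimal":
        raise GuardRejected("only constants are allowed")
    if isinstance(node, ast.Name):
        if node.id not in namespace:
            raise GuardRejected(f"unknown name {node.id!r}")
        return namespace[node.id]
    if isinstance(node, ast.Subscript):
        container = _eval_node(node.value, namespace, policy)
        key = _eval_node(node.slice, namespace, policy)
        # Exact type check: a subclass may override __getitem__ with
        # arbitrary user code, so it is not trusted here.
        if type(container) not in SAFE_GETITEM_TYPES:
            raise GuardRejected(
                f"getitem on {type(container).__name__} is not allow-listed"
            )
        return container[key]
    raise GuardRejected(f"unsupported syntax: {type(node).__name__}")


ns = {"data": {"a": {"b": [1, 2, 3]}}}
print(toy_guarded_eval('data["a"]["b"]', ns))
```

The design choice worth noting is that the evaluator walks the syntax tree itself and never calls `eval()`, so anything it does not explicitly recognise is refused rather than executed.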
`Completer.greedy` is deprecated
When enabled, `Completer.greedy` activates the following settings for compatibility:
- `evaluation = 'unsafe'`
- `auto_close_dict_keys = True`
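The compatibility rule above amounts to a small mapping from the old flag to the new options. The sketch below is a plain-Python illustration only: `effective_completer_settings` is a hypothetical helper, not an IPython API, and both the `False` default for `auto_close_dict_keys` and the assumption that `greedy` takes precedence over the other two options are mine rather than stated in the description.

```python
def effective_completer_settings(greedy=False, evaluation="limited",
                                 auto_close_dict_keys=False):
    """Return the settings in effect after applying the `greedy` rule."""
    if greedy:
        # Assumption: the deprecated flag overrides both newer options.
        return {"evaluation": "unsafe", "auto_close_dict_keys": True}
    return {
        "evaluation": evaluation,
        "auto_close_dict_keys": auto_close_dict_keys,
    }


print(effective_completer_settings(greedy=True))
print(effective_completer_settings(evaluation="minimal"))
```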
While all old tests for the greedy completer pass with these settings, the old greedy completer was essentially running in `dangerous` mode. I am not sure if we should be adding it, but I wanted to keep it so users can change to `evaluation = 'dangerous'`, at least in the next version, in case we need to fix something with `guarded_eval`.

Remaining tasks
NTH: