Some syntax errors are not reported when linting from stdin #357

stevenkaras · 2018-08-12T11:38:11Z

Minimal reproduction (at least for Python 2.7.12 on Ubuntu 16.04 using pyflakes 2.0.0):

max(1 for i in range(10), key=lambda x: x+1)

will not report the syntax error "Generator expression must be parenthesized if not sole argument" if piped into stdin as follows:

echo 'max(1 for i in range(10), key=lambda x: x+1)' | pyflakes

instead failing with a decoding error. However, this is not handled well by many integrations which parse pyflakes output for a lineno and offset.

It looks like there was an assumption in 2009 (from b73a3d1) that if text is None, then it's necessarily a decoding issue. I've yet to debug exactly why this specific class of syntax error does not include the text when reading from stdin.

EDIT: just confirmed this affects python3 as well (using Python 3.7.0 on Alpine 3.8 using pyflakes 2.0.0)

The text was updated successfully, but these errors were encountered:

myint · 2019-05-24T16:07:02Z

Seems that if I put in any arbitrary valid filename into ast.parse(), it properly raises the syntax error. On the other hand, if the file does not exist, it does the current behavior.

pyflakes/pyflakes/api.py

Line 41 in 0b16364

tree = ast.parse(codeString, filename=filename)

stevenkaras · 2019-05-26T06:34:05Z

Interesting. To be more accurate, it does not correctly raise the syntax error, but rather reads the text from a file with the given filename if it exists, which need not have anything to do with the actual code you passed into ast.parse.

In any case, it seems to me that pyflakes should not assume that if text is None that it's an encoding error. Perhaps it would be enlightening to see under what conditions python's builtin compile() doesn't include the text field?

For context, I discovered this because SublimeLinter pipes its buffer into pyflakes on stdin, and silently drops all reported issues that don't have a FILENAME:LINE:POSITION format.

Hanaasagi · 2019-07-14T11:11:02Z

When raise SyntaxError, it will read the code that have invalid syntax from file. At least, if file is not exist, text field will be None.

https://github.com/python/cpython/blob/c6b31061997526b31961ec34328408ca421f51fc/Python/ast.c#L681-L685

    loc = PyErr_ProgramTextObject(c->c_filename, LINENO(n));
    if (!loc) {
        Py_INCREF(Py_None);
        loc = Py_None;
    }

https://github.com/python/cpython/blob/c6b31061997526b31961ec34328408ca421f51fc/Python/errors.c#L1203-L1215

PyObject *
PyErr_ProgramTextObject(PyObject *filename, int lineno)
{
    FILE *fp;
    if (filename == NULL || lineno <= 0)
        return NULL;
    fp = _Py_fopen_obj(filename, "r" PY_STDIOTEXTMODE);
    if (fp == NULL) {
        PyErr_Clear();
        return NULL;
    }
    return err_programtext(fp, lineno);
}

simple test here.

➜ echo 'for test' > \<stdin\>
➜ echo 'max(1 for i in range(10), key=lambda x: x+1)' | pyflakes
<stdin>:1:4: Generator expression must be parenthesized
for test
   ^

Maybe we should treat '<stdin>' specially.

myint · 2019-07-16T13:01:11Z

Yeah, seems like we might have to let the code know when stdin is being used. And I guess looking it up by the name <stdin> might not be good enough given that there might be an actual file named <stdin> that is being referenced.

asottile · 2022-07-11T17:12:38Z

in python3.9+ this does the right thing -- presumably due to a fix from cpython's side:

$ echo 'max(1 for i in range(10), key=lambda x: x+1)' | python3.9 -m pyflakes
<stdin>:1:5: Generator expression must be parenthesized
max(1 for i in range(10), key=lambda x: x+1)
    ^

stevenkaras · 2022-07-13T09:03:23Z

A quick test shows that python 3.6 through 3.8 are affected by this (tested with 3.6.15, 3.7.13, and 3.8.13 on Mac and asdf).
The good news is that msg, lineno, and offset are valid and contain relevant information.
I'll see about pushing a PR in the next day or two that at least renders as much information as possible rather than dropping an otherwise useful error message.

asottile · 2022-07-13T12:06:03Z

personally I don't think it's worth it given it works fine in newer python

In b73a3d1, there was an assumption that text is None only if there was an encoding error with the file. However this was the case for all pythons before 3.9 when reading code from stdin. This takes care to correctly report as much context as possible, so errors aren't silently dropped with the unhelpful "problem decoding source" message.

myint added bug help wanted labels May 24, 2019

asottile closed this as completed Jul 11, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Some syntax errors are not reported when linting from stdin #357

Some syntax errors are not reported when linting from stdin #357

stevenkaras commented Aug 12, 2018 •

edited

myint commented May 24, 2019

stevenkaras commented May 26, 2019

Hanaasagi commented Jul 14, 2019 •

edited

myint commented Jul 16, 2019 •

edited

asottile commented Jul 11, 2022

stevenkaras commented Jul 13, 2022

asottile commented Jul 13, 2022

Some syntax errors are not reported when linting from stdin #357

Some syntax errors are not reported when linting from stdin #357

Comments

stevenkaras commented Aug 12, 2018 • edited

myint commented May 24, 2019

stevenkaras commented May 26, 2019

Hanaasagi commented Jul 14, 2019 • edited

myint commented Jul 16, 2019 • edited

asottile commented Jul 11, 2022

stevenkaras commented Jul 13, 2022

asottile commented Jul 13, 2022

stevenkaras commented Aug 12, 2018 •

edited

Hanaasagi commented Jul 14, 2019 •

edited

myint commented Jul 16, 2019 •

edited