Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some syntax errors are not reported when linting from stdin #357

Closed
stevenkaras opened this issue Aug 12, 2018 · 7 comments
Closed

Some syntax errors are not reported when linting from stdin #357

stevenkaras opened this issue Aug 12, 2018 · 7 comments

Comments

@stevenkaras
Copy link
Contributor

stevenkaras commented Aug 12, 2018

Minimal reproduction (at least for Python 2.7.12 on Ubuntu 16.04 using pyflakes 2.0.0):

max(1 for i in range(10), key=lambda x: x+1)

will not report the syntax error "Generator expression must be parenthesized if not sole argument" if piped into stdin as follows:

echo 'max(1 for i in range(10), key=lambda x: x+1)' | pyflakes

instead failing with a decoding error. However, this is not handled well by many integrations which parse pyflakes output for a lineno and offset.

It looks like there was an assumption in 2009 (from b73a3d1) that if text is None, then it's necessarily a decoding issue. I've yet to debug exactly why this specific class of syntax error does not include the text when reading from stdin.

EDIT: just confirmed this affects python3 as well (using Python 3.7.0 on Alpine 3.8 using pyflakes 2.0.0)

@myint
Copy link
Member

myint commented May 24, 2019

Seems that if I put in any arbitrary valid filename into ast.parse(), it properly raises the syntax error. On the other hand, if the file does not exist, it does the current behavior.

tree = ast.parse(codeString, filename=filename)

@stevenkaras
Copy link
Contributor Author

Interesting. To be more accurate, it does not correctly raise the syntax error, but rather reads the text from a file with the given filename if it exists, which need not have anything to do with the actual code you passed into ast.parse.

In any case, it seems to me that pyflakes should not assume that if text is None that it's an encoding error. Perhaps it would be enlightening to see under what conditions python's builtin compile() doesn't include the text field?

For context, I discovered this because SublimeLinter pipes its buffer into pyflakes on stdin, and silently drops all reported issues that don't have a FILENAME:LINE:POSITION format.

@Hanaasagi
Copy link
Contributor

Hanaasagi commented Jul 14, 2019

When raise SyntaxError, it will read the code that have invalid syntax from file. At least, if file is not exist, text field will be None.

https://github.com/python/cpython/blob/c6b31061997526b31961ec34328408ca421f51fc/Python/ast.c#L681-L685

    loc = PyErr_ProgramTextObject(c->c_filename, LINENO(n));
    if (!loc) {
        Py_INCREF(Py_None);
        loc = Py_None;
    }

https://github.com/python/cpython/blob/c6b31061997526b31961ec34328408ca421f51fc/Python/errors.c#L1203-L1215

PyObject *
PyErr_ProgramTextObject(PyObject *filename, int lineno)
{
    FILE *fp;
    if (filename == NULL || lineno <= 0)
        return NULL;
    fp = _Py_fopen_obj(filename, "r" PY_STDIOTEXTMODE);
    if (fp == NULL) {
        PyErr_Clear();
        return NULL;
    }
    return err_programtext(fp, lineno);
}

simple test here.

echo 'for test' > \<stdin\>echo 'max(1 for i in range(10), key=lambda x: x+1)' | pyflakes
<stdin>:1:4: Generator expression must be parenthesized
for test
   ^

Maybe we should treat '<stdin>' specially.

@myint
Copy link
Member

myint commented Jul 16, 2019

Yeah, seems like we might have to let the code know when stdin is being used. And I guess looking it up by the name <stdin> might not be good enough given that there might be an actual file named <stdin> that is being referenced.

@asottile
Copy link
Member

in python3.9+ this does the right thing -- presumably due to a fix from cpython's side:

$ echo 'max(1 for i in range(10), key=lambda x: x+1)' | python3.9 -m pyflakes
<stdin>:1:5: Generator expression must be parenthesized
max(1 for i in range(10), key=lambda x: x+1)
    ^

@stevenkaras
Copy link
Contributor Author

A quick test shows that python 3.6 through 3.8 are affected by this (tested with 3.6.15, 3.7.13, and 3.8.13 on Mac and asdf).
The good news is that msg, lineno, and offset are valid and contain relevant information.
I'll see about pushing a PR in the next day or two that at least renders as much information as possible rather than dropping an otherwise useful error message.

@asottile
Copy link
Member

personally I don't think it's worth it given it works fine in newer python

stevenkaras added a commit to stevenkaras/pyflakes that referenced this issue Jul 13, 2022
In b73a3d1, there was an assumption that text is None only if there was an
encoding error with the file. However this was the case for all pythons before
3.9 when reading code from stdin.

This takes care to correctly report as much context as possible, so errors
aren't silently dropped with the unhelpful "problem decoding source" message.
stevenkaras added a commit to stevenkaras/pyflakes that referenced this issue Jul 14, 2022
In b73a3d1, there was an assumption that text is None only if there was an
encoding error with the file. However this was the case for all pythons before
3.9 when reading code from stdin.

This takes care to correctly report as much context as possible, so errors
aren't silently dropped with the unhelpful "problem decoding source" message.
stevenkaras added a commit to stevenkaras/pyflakes that referenced this issue Jul 16, 2022
In b73a3d1, there was an assumption that text is None only if there was an
encoding error with the file. However this was the case for all pythons before
3.9 when reading code from stdin.

This takes care to correctly report as much context as possible, so errors
aren't silently dropped with the unhelpful "problem decoding source" message.
asottile pushed a commit that referenced this issue Jul 19, 2022
In b73a3d1, there was an assumption that text is None only if there was an
encoding error with the file. However this was the case for all pythons before
3.9 when reading code from stdin.

This takes care to correctly report as much context as possible, so errors
aren't silently dropped with the unhelpful "problem decoding source" message.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants