bpo-40958: Avoid buffer overflow in the parser when indexing the current line #20842

Closed
wants to merge 2 commits into from

pablogsal commented Jun 12, 2020

Before this PR

venv ❯ LSAN_OPTIONS="suppressions=asan-suppression.txt,print_suppressions=0" ./python -m test test_eof
0:00:00 load avg: 0.85 Run tests sequentially
0:00:00 load avg: 0.85 [1/1] test_eof
test test_eof failed -- Traceback (most recent call last):
  File "/home/pablogsal/github/python/master/Lib/test/test_eof.py", line 54, in test_line_continuation_EOF_from_file_bpo2180
    self.assertIn(b'unexpected EOF while parsing', err)
AssertionError: b'unexpected EOF while parsing' not found in b'Parser/tokenizer.c:978:50: runtime error: pointer index expression with base 0x625000016900 overflowed to 0xbebebebebebee6be\n====
=============================================================\n==48535==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x606000027c51 at pc 0x55ba1fba57d4 bp 0x7fffc21fe430 sp 0x7fffc
21fe420\nREAD of size 1 at 0x606000027c51 thread T0\n    #0 0x55ba1fba57d3 in ascii_decode Objects/unicodeobject.c:4941\n    #1 0x55ba1fca4f4a in unicode_decode_utf8 Objects/unicodeobject.c:499
9\n    #2 0x55ba203fb1bd in byte_offset_to_character_offset Parser/pegen.c:148\n    #3 0x55ba203fb1bd in _PyPegen_raise_error_known_location Parser/pegen.c:412\n    #4 0x55ba203fbe4d in _PyPege
n_raise_error Parser/pegen.c:373\n    #5 0x55ba203ff981 in tokenizer_error Parser/pegen.c:321\n    #6 0x55ba203ff981 in _PyPegen_fill_token Parser/pegen.c:638\n    #7 0x55ba2040277f in _PyPegen
_expect_token Parser/pegen.c:753\n    #8 0x55ba2041317a in _tmp_15_rule Parser/parser.c:16184\n    #9 0x55ba204024f9 in _PyPegen_lookahead (/home/pablogsal/github/python/master/python+0x1c344f9
)\n    #10 0x55ba20478e2c in compound_stmt_rule Parser/parser.c:1860\n    #11 0x55ba204815c2 in statement_rule Parser/parser.c:1224\n    #12 0x55ba204815c2 in _loop1_11_rule Parser/parser.c:159
54\n    #13 0x55ba204815c2 in statements_rule Parser/parser.c:1183\n    #14 0x55ba204854b7 in file_rule Parser/parser.c:716\n    #15 0x55ba204854b7 in _PyPegen_parse Parser/parser.c:24401\n
#16 0x55ba20405768 in _PyPegen_run_parser Parser/pegen.c:1077\n    #17 0x55ba204063ef in _PyPegen_run_parser_from_file_pointer Parser/pegen.c:1137\n    #18 0x55ba1feb53c6 in PyRun_FileExFlags P
ython/pythonrun.c:1057\n    #19 0x55ba1feb572c in PyRun_SimpleFileExFlags Python/pythonrun.c:400\n    #20 0x55ba1f8cbdbb in pymain_run_file Modules/main.c:369\n    #21 0x55ba1f8cbdbb in pymain_
run_python Modules/main.c:553\n    #22 0x55ba1f8ce154 in Py_RunMain Modules/main.c:632\n    #23 0x55ba1f8ce154 in pymain_main Modules/main.c:662\n    #24 0x55ba1f8ce154 in Py_BytesMain Modules/
main.c:686\n    #25 0x7fa4b6ba7001 in __libc_start_main (/usr/lib/libc.so.6+0x27001)\n    #26 0x55ba1f8c848d in _start (/home/pablogsal/github/python/master/python+0x10fa48d)\n\n0x606000027c51
is located 0 bytes to the right of 49-byte region [0x606000027c20,0x606000027c51)\nallocated by thread T0 here:\n    #0 0x7fa4b78db459 in __interceptor_malloc /build/gcc/src/gcc/libsanitizer/as
an/asan_malloc_linux.cpp:145\n    #1 0x55ba1fbbaa1d in PyUnicode_New Objects/unicodeobject.c:1437\n    #2 0x55ba1fcee24b in _PyUnicode_Init Objects/unicodeobject.c:15535\n    #3 0x55ba1fe895c3
in pycore_init_types Python/pylifecycle.c:599\n    #4 0x55ba1fe895c3 in pycore_interp_init Python/pylifecycle.c:724\n    #5 0x55ba1fe93b4b in pyinit_config Python/pylifecycle.c:765\n    #6 0x55
ba1fe93b4b in pyinit_core Python/pylifecycle.c:926\n    #7 0x55ba1fe95b6c in Py_InitializeFromConfig Python/pylifecycle.c:1136\n    #8 0x55ba1f8c8752 in pymain_init Modules/main.c:66\n    #9 0x
55ba1f8ce10a in pymain_main Modules/main.c:653\n    #10 0x55ba1f8ce10a in Py_BytesMain Modules/main.c:686\n    #11 0x7fa4b6ba7001 in __libc_start_main (/usr/lib/libc.so.6+0x27001)\n\nSUMMARY: A
ddressSanitizer: heap-buffer-overflow Objects/unicodeobject.c:4941 in ascii_decode\nShadow bytes around the buggy address:\n  0x0c0c7fffcf30: 00 00 00 00 00 00 00 00 fa fa fa fa 00 00 00 00\n
0x0c0c7fffcf40: 00 00 00 07 fa fa fa fa 00 00 00 00 00 00 00 00\n  0x0c0c7fffcf50: fa fa fa fa 00 00 00 00 00 00 00 05 fa fa fa fa\n  0x0c0c7fffcf60: 00 00 00 00 00 00 00 00 fa fa fa fa 00 00 0
0 00\n  0x0c0c7fffcf70: 00 00 00 00 fa fa fa fa 00 00 00 00 00 00 00 01\n=>0x0c0c7fffcf80: fa fa fa fa 00 00 00 00 00 00[01]fa fa fa fa fa\n  0x0c0c7fffcf90: 00 00 00 00 00 00 00 00 fa fa fa fa
 00 00 00 00\n  0x0c0c7fffcfa0: 00 00 05 fa fa fa fa fa 00 00 00 00 00 00 00 fa\n  0x0c0c7fffcfb0: fa fa fa fa 00 00 00 00 00 00 00 00 fa fa fa fa\n  0x0c0c7fffcfc0: fd fd fd fd fd fd fd fd fa
fa fa fa fd fd fd fd\n  0x0c0c7fffcfd0: fd fd fd fd fa fa fa fa 00 00 00 00 00 00 00 fa\nShadow byte legend (one shadow byte represents 8 application bytes):\n  Addressable:           00\n  Par
tially addressable: 01 02 03 04 05 06 07 \n  Heap left redzone:       fa\n  Freed heap region:       fd\n  Stack left redzone:      f1\n  Stack mid redzone:       f2\n  Stack right redzone:
 f3\n  Stack after return:      f5\n  Stack use after scope:   f8\n  Global redzone:          f9\n  Global init order:       f6\n  Poisoned by user:        f7\n  Container overflow:      fc\n
Array cookie:            ac\n  Intra object redzone:    bb\n  ASan internal:           fe\n  Left alloca redzone:     ca\n  Right alloca redzone:    cb\n  Shadow gap:              cc\n==48535==
ABORTING\n'

test_eof failed

== Tests result: FAILURE ==

1 test failed:
    test_eof

Total duration: 355 ms
Tests result: FAILURE

~/github/python/master heads/master*
venv ❯ LSAN_OPTIONS="suppressions=asan-suppression.txt,print_suppressions=0" ./python -c '"Ṕýţĥòñ" +'
  File "<string>", line 1
    "Ṕýţĥòñ" +
              ^
SyntaxError: invalid syntax
venv ❯ LSAN_OPTIONS="suppressions=asan-suppression.txt,print_suppressions=0" ./python -c 'yield from'
  File "<string>", line 1
    yield from
              ^
SyntaxError: invalid syntax

After this PR

venv ❯ LSAN_OPTIONS="suppressions=asan-suppression.txt,print_suppressions=0" ./python -m test test_eof
0:00:00 load avg: 1.13 Run tests sequentially
0:00:00 load avg: 1.13 [1/1] test_eof

== Tests result: SUCCESS ==

1 test OK.

Total duration: 493 ms
Tests result: SUCCESS
~/github/python/master bpo-40958*
venv ❯ LSAN_OPTIONS="suppressions=asan-suppression.txt,print_suppressions=0" ./python -c '"Ṕýţĥòñ" +'
  File "<string>", line 1
    "Ṕýţĥòñ" +
              ^
SyntaxError: invalid syntax

~/github/python/master bpo-40958*
venv ❯ LSAN_OPTIONS="suppressions=asan-suppression.txt,print_suppressions=0" ./python -c 'yield from'
  File "<string>", line 1
    yield from
              ^
SyntaxError: invalid syntax
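
For context, the sanitizer trace above passes through `byte_offset_to_character_offset` in `Parser/pegen.c`, which turns a byte column into a character column so the caret lands under the right character. A rough Python analogue follows, assuming the line is UTF-8 and the column is a byte offset. The helper name and the clamping are illustrative assumptions, not the code in this PR:

```python
def byte_to_char_offset(line: bytes, byte_offset: int) -> int:
    """Hypothetical helper: convert a byte column into a character column."""
    # Assumed guard: never index past the end of the line. Python slicing
    # would not over-read anyway, but the equivalent C code could.
    byte_offset = max(0, min(byte_offset, len(line)))
    # Undecodable bytes become U+FFFD replacement characters instead of raising.
    return len(line[:byte_offset].decode("utf-8", errors="replace"))


line = '"Ṕýţĥòñ" +'.encode("utf-8")
print(len(line))                              # length of the line in bytes
print(byte_to_char_offset(line, len(line)))   # length of the line in characters
print(byte_to_char_offset(line, 10_000))      # out-of-range offsets are clamped
```

For a line containing non-ASCII text the two lengths differ, which is why the byte column cannot be used directly as the caret position.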

@pablogsal pablogsal self-assigned this Jun 12, 2020

@tiran tiran left a comment

A test is failing:

 ======================================================================
FAIL: testSyntaxErrorOffset (test.test_exceptions.ExceptionTests) (source=b'Python = "\xcf\xb3\xf2\xee\xed" +', lineno=1, offset=18)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "D:\a\cpython\cpython\lib\test\test_exceptions.py", line 188, in check
    self.assertEqual(cm.exception.offset, offset)
AssertionError: 13 != 18
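
The source in this failing case is 18 bytes long, so the expected offset of 18 coincides with the end of the line measured in bytes. A quick check, assuming (this is not stated in the thread) that the non-ASCII bytes are meant to be read in a single-byte encoding such as cp1251:

```python
source = b'Python = "\xcf\xb3\xf2\xee\xed" +'

# Length of the line in bytes; the test expects offset == 18.
print(len(source))

# Assumption: cp1251 is used here only for illustration.
# It maps every byte to exactly one character.
print(len(source.decode("cp1251")))

# Decoded as UTF-8 with replacement, the character count no longer
# matches the byte count, because b"\xcf\xb3" forms one character.
print(len(source.decode("utf-8", errors="replace")))
```

This does not explain where the reported 13 comes from; it only shows that byte-based and character-based columns diverge for this input.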

@bedevere-bot

When you're done making the requested changes, leave the comment: I have made the requested changes; please review again.

@pablogsal

Closing this, as it is missing something important and, unfortunately, I don't have time to investigate today :(

@pablogsal pablogsal closed this Jun 12, 2020
@pablogsal

@lysnikolaou Feel free to investigate if you have some time

@lysnikolaou

@lysnikolaou Feel free to investigate if you have some time

I'll do so as soon as I get home. Sorry for not being able to help out now!
