35975: jsoup:HtmlFuzzer: Uncaught exception in org.jsoup.parser.HtmlTreeBuilder.process #1577

Closed
jhy opened this issue Jul 11, 2021 · 4 comments
Labels: fixed, fuzz (An issue found by the OSS Fuzz project)

Comments

@jhy
Owner

jhy commented Jul 11, 2021

https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=35975
Detailed Report: https://oss-fuzz.com/testcase?key=4820007715471360

Project: jsoup
Fuzzing Engine: libFuzzer
Fuzz Target: HtmlFuzzer
Job Type: libfuzzer_asan_jsoup
Platform Id: linux

Crash Type: Uncaught exception
Crash Address:
Crash State:
org.jsoup.parser.HtmlTreeBuilder.process
org.jsoup.parser.HtmlTreeBuilderState$9.process
java.base/java.lang.String.compareTo

Sanitizer: address (ASAN)

Recommended Security Severity: Low

Crash Revision: https://oss-fuzz.com/revisions?job=libfuzzer_asan_jsoup&revision=202107050606

Reproducer Testcase: https://oss-fuzz.com/download?testcase_id=4820007715471360

Issue filed automatically.

@jhy jhy added the fuzz An issue found by the OSS Fuzz project label Jul 11, 2021
jhy added a commit that referenced this issue Jul 11, 2021
Doesn't repro for me - fixed by another commit?
@jhy
Owner Author

jhy commented Jul 11, 2021

I haven't been able to repro this. I will wait for the next fuzzer run against the updated fixes and then review.

@jhy jhy added the no-repro label Jul 11, 2021
@jhy
Owner Author

jhy commented Jul 12, 2021

The fuzzer shows this as still reproducing, but I can't repro it with the attached sample. Thinking that it might be a character encoding issue, I tried every charset available on my platform, but still could not repro. I'm not sure what I'm missing.

Trying Big5
Trying Big5-HKSCS
Trying CESU-8
Trying EUC-JP
Trying EUC-KR
Trying GB18030
Trying GB2312
Trying GBK
Trying IBM-Thai
Trying IBM00858
Trying IBM01140
Trying IBM01141
Trying IBM01142
Trying IBM01143
Trying IBM01144
Trying IBM01145
Trying IBM01146
Trying IBM01147
Trying IBM01148
Trying IBM01149
Trying IBM037
Trying IBM1026
Trying IBM1047
Trying IBM273
Trying IBM277
Trying IBM278
Trying IBM280
Trying IBM284
Trying IBM285
Trying IBM290
Trying IBM297
Trying IBM420
Trying IBM424
Trying IBM437
Trying IBM500
Trying IBM775
Trying IBM850
Trying IBM852
Trying IBM855
Trying IBM857
Trying IBM860
Trying IBM861
Trying IBM862
Trying IBM863
Trying IBM864
Trying IBM865
Trying IBM866
Trying IBM868
Trying IBM869
Trying IBM870
Trying IBM871
Trying IBM918
Trying ISO-2022-CN
Trying ISO-2022-JP
Trying ISO-2022-JP-2
Trying ISO-2022-KR
Trying ISO-8859-1
Trying ISO-8859-13
Trying ISO-8859-15
Trying ISO-8859-16
Trying ISO-8859-2
Trying ISO-8859-3
Trying ISO-8859-4
Trying ISO-8859-5
Trying ISO-8859-6
Trying ISO-8859-7
Trying ISO-8859-8
Trying ISO-8859-9
Trying JIS_X0201
Trying JIS_X0212-1990
Trying KOI8-R
Trying KOI8-U
Trying Shift_JIS
Trying TIS-620
Trying US-ASCII
Trying UTF-16
Trying UTF-16BE
Trying UTF-16LE
Trying UTF-32
Trying UTF-32BE
Trying UTF-32LE
Trying UTF-8
Trying windows-1250
Trying windows-1251
Trying windows-1252
Trying windows-1253
Trying windows-1254
Trying windows-1255
Trying windows-1256
Trying windows-1257
Trying windows-1258
Trying windows-31j
Trying x-Big5-HKSCS-2001
Trying x-Big5-Solaris
Trying x-euc-jp-linux
Trying x-EUC-TW
Trying x-eucJP-Open
Trying x-IBM1006
Trying x-IBM1025
Trying x-IBM1046
Trying x-IBM1097
Trying x-IBM1098
Trying x-IBM1112
Trying x-IBM1122
Trying x-IBM1123
Trying x-IBM1124
Trying x-IBM1129
Trying x-IBM1166
Trying x-IBM1364
Trying x-IBM1381
Trying x-IBM1383
Trying x-IBM29626C
Trying x-IBM300
Trying x-IBM33722
Trying x-IBM737
Trying x-IBM833
Trying x-IBM834
Trying x-IBM856
Trying x-IBM874
Trying x-IBM875
Trying x-IBM921
Trying x-IBM922
Trying x-IBM930
Trying x-IBM933
Trying x-IBM935
Trying x-IBM937
Trying x-IBM939
Trying x-IBM942
Trying x-IBM942C
Trying x-IBM943
Trying x-IBM943C
Trying x-IBM948
Trying x-IBM949
Trying x-IBM949C
Trying x-IBM950
Trying x-IBM964
Trying x-IBM970
Trying x-ISCII91
Trying x-ISO-2022-CN-CNS
Trying x-ISO-2022-CN-GB
Trying x-iso-8859-11
Trying x-JIS0208
Trying x-JISAutoDetect
Trying x-Johab
Trying x-MacArabic
Trying x-MacCentralEurope
Trying x-MacCroatian
Trying x-MacCyrillic
Trying x-MacDingbat
Trying x-MacGreek
Trying x-MacHebrew
Trying x-MacIceland
Trying x-MacRoman
Trying x-MacRomania
Trying x-MacSymbol
Trying x-MacThai
Trying x-MacTurkish
Trying x-MacUkraine
Trying x-MS932_0213
Trying x-MS950-HKSCS
Trying x-MS950-HKSCS-XP
Trying x-mswin-936
Trying x-PCK
Trying x-SJIS_0213
Trying x-UTF-16LE-BOM
Trying X-UTF-32BE-BOM
Trying X-UTF-32LE-BOM
Trying x-windows-50220
Trying x-windows-50221
Trying x-windows-874
Trying x-windows-949
Trying x-windows-950
Trying x-windows-iso2022jp
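
A charset sweep like the one logged above can be sketched roughly as follows. This is not the code that produced the log: the class name `CharsetSweep`, the `sweep` helper, and the `parser` callback are hypothetical names for illustration, and the parser is passed in as a callback so the sketch only needs the JDK. With jsoup on the classpath, the callback could be something like `html -> Jsoup.parse(html)`.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class CharsetSweep {
    // Decodes the same bytes with every charset the JVM offers and hands each
    // decoded string to the parser. Returns the names of charsets whose
    // decoded input made the parser throw.
    static List<String> sweep(byte[] data, Consumer<String> parser) {
        List<String> failures = new ArrayList<>();
        for (Charset charset : Charset.availableCharsets().values()) {
            System.out.println("Trying " + charset.name());
            String decoded;
            try {
                decoded = new String(data, charset);
            } catch (RuntimeException e) {
                continue; // this charset could not decode; skip it
            }
            try {
                parser.accept(decoded);
            } catch (Throwable t) {
                // Throwable rather than Exception, so that errors such as
                // StackOverflowError are recorded too.
                failures.add(charset.name());
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        // Placeholder input and a no-op parser; substitute the real testcase
        // bytes and the real parser call.
        byte[] data = "<table><td>".getBytes(StandardCharsets.UTF_8);
        System.out.println("Failed with: " + sweep(data, html -> { }));
    }
}
```

An empty result, as in this thread, suggests the crash does not depend on how the bytes are decoded.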

@jhy
Owner Author

jhy commented Jul 12, 2021

OK, got it! It reproduces once I drop my stack size way down (256K hits it). I guess the fuzzer is optimizing the sample data for its specific stack size :)
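
The comment doesn't say how the stack size was lowered. One common option is the JVM flag `-Xss256k`. Another, sketched below as an assumption, is to run the suspect code on a thread created with an explicit stack size. `SmallStackRunner`, `runWithStack`, and `recurse` are hypothetical names, and the unbounded recursion is only a stand-in for the parse call. Note that the JVM treats the requested stack size as a hint and may round or ignore it on some platforms.

```java
import java.util.concurrent.atomic.AtomicReference;

class SmallStackRunner {
    // Runs the task on a new thread that requests the given stack size and
    // returns whatever the task threw, or null if it finished normally.
    static Throwable runWithStack(Runnable task, long stackBytes) {
        AtomicReference<Throwable> failure = new AtomicReference<>();
        Thread worker = new Thread(null, () -> {
            try {
                task.run();
            } catch (Throwable t) {
                failure.set(t); // includes StackOverflowError
            }
        }, "small-stack", stackBytes);
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return failure.get();
    }

    // Stand-in for a deeply recursive parse: recurses until the stack runs out.
    static int recurse(int depth) {
        return recurse(depth + 1) + 1;
    }

    public static void main(String[] args) {
        Throwable result = runWithStack(() -> recurse(0), 256 * 1024);
        System.out.println("Task threw: " + result);
    }
}
```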

@jhy jhy self-assigned this Jul 12, 2021
@jhy jhy added this to the 1.14.2 milestone Jul 12, 2021
@jhy jhy added bug Confirmed bug that we should fix and removed no-repro labels Jul 12, 2021
@jhy jhy closed this as completed in 6b04287 Jul 12, 2021
@jhy
Owner Author

jhy commented Jul 12, 2021

Fixed. I think this is a gap in the spec? AFAICT I am processing according to the rules in https://html.spec.whatwg.org/multipage/parsing.html#parsing-main-intable, but there exist states where resetInsertionMode will leave the parser in the InTable state, so reprocessing the current token will recurse and eventually overflow the stack. Fixed by testing whether the mode changed and, if not, inserting directly.

Also fixed up a couple other processing states.
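
The guard described above can be illustrated with a toy state machine. This is a simplified sketch, not jsoup's actual code: `ReprocessGuard`, its two-value `Mode` enum, and the `resetChangesMode` switch are hypothetical and exist only to model a reset that may or may not change the mode.

```java
class ReprocessGuard {
    enum Mode { IN_TABLE, IN_BODY }

    Mode mode = Mode.IN_TABLE;
    // Hypothetical switch: when false, the reset leaves the mode unchanged,
    // which models the problematic states described in the comment.
    final boolean resetChangesMode;
    final StringBuilder inserted = new StringBuilder();

    ReprocessGuard(boolean resetChangesMode) {
        this.resetChangesMode = resetChangesMode;
    }

    void resetInsertionMode() {
        if (resetChangesMode) {
            mode = Mode.IN_BODY;
        }
    }

    void insert(String token) {
        inserted.append(token);
    }

    boolean process(String token) {
        if (mode == Mode.IN_TABLE) {
            Mode before = mode;
            resetInsertionMode();
            if (mode != before) {
                // A different state will handle the token, so reprocessing
                // cannot loop back into this branch.
                return process(token);
            }
            // Mode unchanged: reprocessing would recurse without end, so
            // insert the token directly instead.
            insert(token);
            return true;
        }
        insert(token);
        return true;
    }

    public static void main(String[] args) {
        ReprocessGuard stuck = new ReprocessGuard(false);
        stuck.process("td");
        System.out.println(stuck.mode + " " + stuck.inserted);
    }
}
```

Without the `mode != before` check, the `resetChangesMode == false` case would call `process` on itself until the stack overflowed, which is the failure mode reported by the fuzzer.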

@jhy jhy added fixed and removed bug Confirmed bug that we should fix labels Jul 12, 2021
fmeum added a commit to CodeIntelligenceTesting/jazzer that referenced this issue Jul 22, 2021
fmeum added a commit to CodeIntelligenceTesting/jazzer that referenced this issue Jul 22, 2021
fmeum added a commit to CodeIntelligenceTesting/jazzer that referenced this issue Jul 23, 2021