Bump jsoup from 1.13.1 to 1.14.1 #487

Closed

dependabot[bot]
Contributor

@dependabot dependabot bot commented on behalf of github Aug 1, 2021

Bumps jsoup from 1.13.1 to 1.14.1.

Release notes

Sourced from jsoup's releases.

jsoup 1.14.1

jsoup 1.14.1 is out now, with simple request session management, increased parse robustness, and a ton of other improvements, speed-ups, and bug fixes.

See the full announcement for all the details on what's changed.

Changelog

Sourced from jsoup's changelog.

jsoup changelog

*** Release 1.14.2 [PENDING]

  • Improvement: support Pattern.quote \Q and \E escapes in the selector regex matchers. jhy/jsoup#1536

  • Bugfix: the *|el wildcard namespace selector now also matches elements with no namespace. jhy/jsoup#1565

  • Bugfix: corrected a potential case of the parser input stream not being closed immediately on a read exception.

  • Bugfix: when making a HTTP POST, if the request write fails, make sure the connection is immediately cleaned up.

  • Bugfix: in the XML parser, XML processing instructions without attributes would be serialized as if they did. jhy/jsoup#770

  • Bugfix: updated the HtmlTreeParser resetInsertionMode to the current spec for supported elements jhy/jsoup#1491

  • Bugfix [Fuzz]: fixed a slow parse when a tag or an attribute name has thousands of null characters in it. jhy/jsoup#1580

  • Bugfix [Fuzz]: the adoption agency algorithm can have an incorrect bookmark position jhy/jsoup#1576

  • Bugfix [Fuzz]: malformed HTML could result in null elements on stack jhy/jsoup#1579

  • Bugfix [Fuzz]: malformed deeply nested table elements could create a stack overflow. jhy/jsoup#1577

  • Bugfix [Fuzz]: Speed optimized malformed HTML creating elements with thousands of elements - limit the attribute count per element when parsing to 512 (in real-world HTML, P99 is ~ 8). jhy/jsoup#1578

  • Bugfix [Fuzz]: Speed improvement for the foster formatting elements algo, by limiting how far up a crafted stack to scan. jhy/jsoup#1593

  • Bugfix [Fuzz]: Speed improvement when parsing crafted HTML when transferring form attributes. jhy/jsoup#1595

  • Bugfix [Fuzz]: Speed improvement when the stack was thousands of items deep, and non-matching close tags sent. jhy/jsoup#1596

*** Release 1.14.1 [2021-Jul-10]

  • Change: updated the minimum supported Java version from Java 7 to Java 8.

  • Change: updated the minimum Android API level from 8 to 10.

... (truncated)


Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [jsoup](https://github.com/jhy/jsoup) from 1.13.1 to 1.14.1.
- [Release notes](https://github.com/jhy/jsoup/releases)
- [Changelog](https://github.com/jhy/jsoup/blob/master/CHANGES)
- [Commits](jhy/jsoup@jsoup-1.13.1...jsoup-1.14.1)

---
updated-dependencies:
- dependency-name: org.jsoup:jsoup
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
@dependabot dependabot bot added the dependencies Pull requests that update a dependency file label Aug 1, 2021
@hazendaz
Member

hazendaz commented Aug 1, 2021

Needs some work. It looks like there are lots of bug fixes in this release, so this will need to be addressed locally after reviewing what is going on. I'll take a look at it.

@hazendaz
Member

hazendaz commented Aug 1, 2021

It's not formatting properly now. It's messing up section spacing throughout the sample file. Items such as and some lists are not indented, and most items now have an extra space at the end after >. As they appear to already have another release in flight fixing many other things, maybe it's just worth waiting. There are already issues with the current version: I've seen it completely delete sections at larger scale (one of which is listed as fixed in this set). It sort of feels like the parser needs extra help. Using Eclipse to format the sections it fails on does fix the formatting :(

@hazendaz hazendaz closed this Aug 1, 2021
@dependabot @github
Contributor Author

dependabot bot commented on behalf of github Aug 1, 2021

OK, I won't notify you again about this release, but will get in touch when a new version is available. If you'd rather skip all updates until the next major or minor version, let me know by commenting @dependabot ignore this major version or @dependabot ignore this minor version. You can also ignore all major, minor, or patch releases for a dependency by adding an ignore condition with the desired update_types to your config file.

If you change your mind, just re-open this PR and I'll resolve any conflicts on it.

@dependabot dependabot bot deleted the dependabot/maven/org.jsoup-jsoup-1.14.1 branch August 1, 2021 23:24
@hazendaz
Member

hazendaz commented Oct 15, 2021

@ctubbsii I have diagnosed this and, in my opinion, it's a bug in jsoup, insofar as it breaks pretty formatting. They made a deep internal change to limit whitespace to 30 characters. See my diagnostics here: jhy/jsoup#1653. I'm hoping they will be willing to expose that specific value so we can disable it for our case, but also allow something a little more flexible than 30 characters. Once I tested it that way with our latest master, it worked out really well. There is still one minor bug, but it's just a one-line issue. That one is acceptable to me temporarily, but the other definitely is not. My hope is that this will be something they accept, and I'll write it for them if needed, as we need to get off their vulnerable version.

While doing this, I discovered a number of things wrong in our current code base around jsoup. We still had a bit of that left over from when we moved XML to the Eclipse way of doing things (xml-formatter). We also had a number of extra property files in test resources that were not used at all for a few different configurations, so I have cleaned all of that up as well. If you could, hold off on the release for Eclipse 2021-09, since it's not super pressing and I think we can get this done here. Worst case, I'll just release my fork that fixes the issue we have and support it separately (not ideal, but it's the only other option if we end up in that situation).

One other thing: at the scale I have this running at work, potentially 500+ repos, we are finding that embedded JavaScript is in some cases getting deleted during HTML formatting. Clearly that is jsoup. So even after this there may be more to do, but at least I think I'm on the right path. Thanks.
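
For readers following along, here is a minimal sketch of why a hard cap on indentation padding would flatten deeply nested markup in pretty-printed output. It is not jsoup's actual code: the class `PaddingCapSketch`, the method `padding`, and the parameter `maxPaddingWidth` are hypothetical names chosen for illustration, and the indent amount of 4 is an assumption. Only the 30-character figure comes from the comment above.

```java
import java.util.Arrays;

// Hypothetical sketch, NOT jsoup's implementation: shows the effect of
// capping pretty-print indentation at a fixed number of characters.
final class PaddingCapSketch {

    /**
     * Returns the indentation string for an element at the given nesting depth.
     * A negative maxPaddingWidth disables the cap (the behaviour the formatter
     * would want to be able to request).
     */
    static String padding(int depth, int indentAmount, int maxPaddingWidth) {
        if (depth < 0 || indentAmount < 0) {
            throw new IllegalArgumentException("depth and indentAmount must be >= 0");
        }
        int width = depth * indentAmount;
        if (maxPaddingWidth >= 0) {
            // Every level past the cap receives the same padding, so nested
            // elements stop lining up under their parents.
            width = Math.min(width, maxPaddingWidth);
        }
        char[] spaces = new char[width];
        Arrays.fill(spaces, ' ');
        return new String(spaces);
    }

    public static void main(String[] args) {
        // Assumed indent amount of 4; the cap of 30 is the value mentioned above.
        for (int depth : new int[] {2, 7, 8, 12}) {
            System.out.printf("depth %2d: capped=%2d uncapped=%2d%n",
                    depth,
                    padding(depth, 4, 30).length(),
                    padding(depth, 4, -1).length());
        }
    }
}
```

With these assumed values, depths 8 and 12 both end up with 30 characters of padding when the cap is on, which matches the kind of lost indentation described earlier in this thread. Making the cap a configurable value with a "disabled" setting is the shape of the change being requested in jhy/jsoup#1653.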
