Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(cruby): patch libxml2 to address GNOME/libxml2#200 #2182

Merged
merged 3 commits into from Feb 6, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGELOG.md
Expand Up @@ -9,6 +9,7 @@ Nokogiri follows [Semantic Versioning](https://semver.org/), please see the [REA
### Fixed

* [CRuby] `NodeSet` may now safely contain `Node` objects from multiple documents. Previously the GC lifecycle of the parent `Document` objects could lead to contained nodes being GCed while still in scope. [[#1952](https://github.com/sparklemotion/nokogiri/issues/1952)]
* [CRuby] Patch libxml2 to avoid "huge input lookup" errors on large CDATA elements. (See upstream [GNOME/libxml2#200](https://gitlab.gnome.org/GNOME/libxml2/-/issues/200) and [GNOME/libxml2!100](https://gitlab.gnome.org/GNOME/libxml2/-/merge_requests/100).) [[#2132](https://github.com/sparklemotion/nokogiri/issues/2132)].
* [CRuby] `{XML,HTML}::Document.parse` now invokes `#initialize` exactly once. Previously `#initialize` was invoked twice on each object.
* [JRuby] `{XML,HTML}::Document.parse` now invokes `#initialize` exactly once. Previously `#initialize` was not called, which was a problem for subclassing such as done by `Loofah`.

Expand Down
@@ -0,0 +1,70 @@
From ca565c1edef9a455453fa8564270cc9c5813e1b9 Mon Sep 17 00:00:00 2001
From: Mike Dalessio <mike.dalessio@gmail.com>
Date: Sun, 31 Jan 2021 09:53:56 -0500
Subject: [PATCH] parser.c: shrink the input buffer when appropriate

Fixes GNOME/libxml2#200

Also see discussions at:
- GNOME/libxml2#192
- https://gitlab.gnome.org/nwellnhof/libxml2/-/commit/99bda1e
- https://github.com/sparklemotion/nokogiri/issues/2132
---
parser.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/parser.c b/parser.c
index a7bdc7f..efde672 100644
--- a/parser.c
+++ b/parser.c
@@ -4204,6 +4204,7 @@ xmlParseSystemLiteral(xmlParserCtxtPtr ctxt) {
}
count++;
if (count > 50) {
+ SHRINK;
GROW;
count = 0;
if (ctxt->instate == XML_PARSER_EOF) {
@@ -4291,6 +4292,7 @@ xmlParsePubidLiteral(xmlParserCtxtPtr ctxt) {
buf[len++] = cur;
count++;
if (count > 50) {
+ SHRINK;
GROW;
count = 0;
if (ctxt->instate == XML_PARSER_EOF) {
@@ -4571,6 +4573,7 @@ xmlParseCharDataComplex(xmlParserCtxtPtr ctxt, int cdata) {
}
count++;
if (count > 50) {
+ SHRINK;
GROW;
count = 0;
if (ctxt->instate == XML_PARSER_EOF)
@@ -4776,6 +4779,7 @@ xmlParseCommentComplex(xmlParserCtxtPtr ctxt, xmlChar *buf,

count++;
if (count > 50) {
+ SHRINK;
GROW;
count = 0;
if (ctxt->instate == XML_PARSER_EOF) {
@@ -5186,6 +5190,7 @@ xmlParsePI(xmlParserCtxtPtr ctxt) {
}
count++;
if (count > 50) {
+ SHRINK;
GROW;
if (ctxt->instate == XML_PARSER_EOF) {
xmlFree(buf);
@@ -9783,6 +9788,7 @@ xmlParseCDSect(xmlParserCtxtPtr ctxt) {
sl = l;
count++;
if (count > 50) {
+ SHRINK;
GROW;
if (ctxt->instate == XML_PARSER_EOF) {
xmlFree(buf);
--
2.25.1