Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[bug] Document / Nodes aren't compaction friendly #2578

Closed
tenderlove opened this issue Jun 14, 2022 · 6 comments · Fixed by #2579
Closed

[bug] Document / Nodes aren't compaction friendly #2578

tenderlove opened this issue Jun 14, 2022 · 6 comments · Fixed by #2579
Labels
topic/memory Segfaults, memory leaks, valgrind testing, etc.

Comments

@tenderlove
Copy link
Member

Please describe the bug

xmlNodePtr instances keep a reference to their Ruby wrapper inside the _private member. You can see the _private member being set to the wrapping Ruby instance here.

After it's been wrapped, the instance is optionally stored in an array that is kept on the associated document. You can see the array push here.

Ruby arrays are compaction friendly. When the compactor runs, elements inside the array are allowed to move because the array knows how to update its own references. However, Nokogiri's XML nodes are not compaction friendly. They don't know to update their references. Normally this isn't a problem because objects are required to call rb_gc_mark on their references, and rb_gc_mark will pin the references. The xmlNodePtr instance references the Ruby wrapper, but since the Ruby wrapper is allowed to move, the _private pointer can go bad when compaction runs.

Help us reproduce what you're seeing

require "nokogiri"

big_doc = "<root>#{("a".."zz").map { |x| "<#{x}>#{x}</#{x}>" }.join}</root>"
doc = Nokogiri.XML(big_doc)
doc.root.children.each do |child|
  child
end

GC.verify_compaction_references(double_heap: true, toward: :empty)

doc.root.children.each do |child|
  p child
end

This script will SEGV because the node objects move, and then we try to reference them.

Expected behavior

It shouldn't segv.

Possible Fixes

I'm not 100% sure how we should go about fixing this. This patch fixes the above code:

diff --git a/ext/nokogiri/xml_document.c b/ext/nokogiri/xml_document.c
index 134931de..31ddc24d 100644
--- a/ext/nokogiri/xml_document.c
+++ b/ext/nokogiri/xml_document.c
@@ -57,6 +57,9 @@ mark(xmlDocPtr doc)
   if (tuple) {
     rb_gc_mark(tuple->doc);
     rb_gc_mark(tuple->node_cache);
+    for (long i = 0; i < RARRAY_LEN(tuple->node_cache); i++) {
+        rb_gc_mark(rb_ary_entry(tuple->node_cache, i));
+    }
   }
 }
 

But I don't like this patch. It means we iterate the array too many times.

I think a better solution is probably to teach Node instances about compaction so that they can update themselves, but that might be a more involved patch.

@tenderlove tenderlove added the state/needs-triage Inbox for non-installation-related bug reports or help requests label Jun 14, 2022
@flavorjones flavorjones added topic/memory Segfaults, memory leaks, valgrind testing, etc. and removed state/needs-triage Inbox for non-installation-related bug reports or help requests labels Jun 15, 2022
@flavorjones
Copy link
Member

Ack. I've reproduced. Let's talk tomorrow.

@tenderlove
Copy link
Member Author

Better patch:

diff --git a/ext/nokogiri/xml_node.c b/ext/nokogiri/xml_node.c
index 14f1a871..aa2c66cb 100644
--- a/ext/nokogiri/xml_node.c
+++ b/ext/nokogiri/xml_node.c
@@ -24,6 +24,20 @@ static void
 _xml_node_mark(xmlNodePtr node)
 {
   xmlDocPtr doc = node->doc;
+
+  // Mark this node's wrapper object.  This is very silly because the mark
+  // callback is only firing because this node has been marked.  Marking
+  // ourselves won't cause an infinite loop (the GC handles that), but the
+  // reason we're calling `rb_gc_mark` is to pin the wrapper object.
+  // These nodes are normally marked via an Array stored on the document,
+  // and since Ruby Arrays allow their contents to move, then the wrapper
+  // object was allowed to move too.  That means the reference in `_private`
+  // could go bad (since we don't update references yet)
+  // Calling `rb_gc_mark` here will pin the wrapper object.  Though really
+  // we should update this object to support compaction and update references
+  // accordingly.
+  rb_gc_mark(node->_private);
+
   if (doc->type == XML_DOCUMENT_NODE || doc->type == XML_HTML_DOCUMENT_NODE) {
     if (DOC_RUBY_OBJECT_TEST(doc)) {
       rb_gc_mark(DOC_RUBY_OBJECT(doc));

Still not ideal, but better than the patch above. I think ideally we should add compaction support. The first step toward that would be switching from DataWrapStruct to the typed version so we can provide a compaction callback.

@bf4
Copy link

bf4 commented Jul 19, 2022

I've been able to reliably track the 1.13.6 -> 1.13.7 update to a segfault whilst using creek 2.5.3. (We're on CRuby 3.0.3 and Rails 7.0.3. Reproduced on linux and osx). I haven't been able to reproduce it reliably on every run, unfortunately, and the trace is a little different every time. Would you like me to make an issue? Otherwise, I'm just sticking this comment here in case it helps a future googler.

@flavorjones
Copy link
Member

@bf4 Yes, please open an issue. I also commented on the upstream issue.

@flavorjones
Copy link
Member

@bf4 a script that will reproduce this, even if intermittently, would be extremely valuable. I know nothing about Creek.

@whatisgaven

This comment was marked as spam.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic/memory Segfaults, memory leaks, valgrind testing, etc.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants