segfault in node_set.rb #1952
Comments
Hi, thank you for reporting this. I'll attempt to reproduce and diagnose as soon as I can (hopefully today).
OK, I've reproduced using your script and XML. Currently trying to reproduce that while running under Valgrind to get more precise information about the operation causing the segfault.
OK - I captured the segfault in valgrind and have a nice stack walkback. Will dig in.
Here's the top of the valgrind output, cleaned up a bit for clarity:
I need to investigate, but essentially what this is telling me is that we still have an active pointer to memory that was already freed.
Sorry for the lack of updates - this is high priority, but I haven't been able to spend much time on it in the past week. (The difficulty in quickly/reliably reproducing it is an additional challenge.)
I've encountered this same exact issue with nokogiri 1.10.9 and ruby 2.6.3 |
@samsonnguyen I'm sorry you're experiencing this issue! Are you able to easily reproduce this? It would greatly help if I had a repro that I could use to get faster feedback loops.
@ahorek Thanks for the PR with the repro, I'll dig in.
I have come across a segmentation fault which I think is related to this issue. It happens when I merge two nodesets like this:

```ruby
require "nokogiri"

XML_PAYLOAD = <<-EOF
<?xml version="1.0" encoding="UTF-8"?>
<container>
</container>
EOF

class ParseAndMerge
  def merge_items
    nodeset1 = Nokogiri::XML(XML_PAYLOAD).xpath("//container")
    nodeset2 = Nokogiri::XML(XML_PAYLOAD).xpath("//container")
    nodeset1 + nodeset2
  end
end

GC.stress = true

100.times do
  p = ParseAndMerge.new
  merged_nodeset = p.merge_items
  puts merged_nodeset.to_s
end
```

Happens with 1.10.9, 1.10.10, 1.11.0.rc3. nokogiri -v:
It does not happen when I change the above code to prevent Ruby from garbage collecting nodeset2:

```ruby
require "nokogiri"

XML_PAYLOAD = <<-EOF
<?xml version="1.0" encoding="UTF-8"?>
<container>
</container>
EOF

$nodeset_list = []

class ParseAndMerge
  def merge_items
    nodeset1 = Nokogiri::XML(XML_PAYLOAD).xpath("//container")
    nodeset2 = Nokogiri::XML(XML_PAYLOAD).xpath("//container")
    $nodeset_list << [nodeset2] # Fix: prevent Ruby from garbage collecting nodeset2
    nodeset1 + nodeset2
  end
end

GC.stress = true

100.times do
  p = ParseAndMerge.new
  merged_nodeset = p.merge_items
  puts merged_nodeset.to_s
end
```
@stayhero Thanks for reporting your issue. I don't have enough information yet to determine if you're seeing the same problem or not. Hoping to be able to dedicate some time to this memory issue soon.
@stayhero Sorry for the delay in giving you a meaningful response. After digging into the example you've given, I'm confident that this is the same bug. Thanks so much for providing this second example which has helped illuminate what's going on.
I'm going to try to explain how GC works with libxml2's data structures, what's going wrong in these examples to cause a segfault, and some ideas for resolving the problems.

How GC works for libxml2 data structures

Garbage collection ("GC") of libxml2 data structures is driven by the Ruby Document object: the underlying libxml2 tree is only freed when the Ruby Document wrapping it is collected. A NodeSet keeps a reference to the Document it was created from, and relies on that single reference to keep its contents alive. Notably, a NodeSet does not explicitly mark the individual nodes it contains; the assumption has been that all of its nodes belong to the same document as the NodeSet itself.

What's Going Wrong

In these examples, we have two separate Documents, each queried to produce its own NodeSet. I mentioned above that a NodeSet only keeps a reference to one Document. When the two NodeSets are merged, the result references only the first Document, yet it still contains pointers to nodes owned by the second Document. When GC kicks in, nothing belonging to the second Document is referenced from Ruby any longer, so that document and its nodes are freed. Later, we try to print out the contents of the merged NodeSet, dereference pointers into freed memory, and segfault.

Thoughts on Resolution

Two solutions come to mind:

1. When a node from a different document is added to a NodeSet, duplicate it into the NodeSet's own document, so that keeping that one document alive remains sufficient.
2. Add a GC mark callback to the NodeSet that explicitly marks each of the nodes it contains.

The disadvantages of option (2) are primarily around performance: every node in every NodeSet would have to be visited on every GC pass, slowing down the mark phase, which leads me to conclude that option (1) is the better choice at this time.

And a Crazy Idea

However, there is a third option which I'd like to explore: reimplementing NodeSet as a subclass of Array holding Ruby Node objects, which would greatly simplify the implementation, but also bring with it the disadvantages of option (2) above.

Next Steps

I'm going to implement (1), but also create a separate issue to drive exploration of (3).
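To make the ownership problem concrete, here is a rough C sketch. It uses real libxml2 types (xmlNodeSet, xmlNodePtr) but a hypothetical wrapper struct and mark function; it is not Nokogiri's actual source, just an illustration of why marking only one Document is not enough once a NodeSet can hold nodes from several documents.

```c
#include <libxml/tree.h>
#include <libxml/xpath.h>
#include <ruby.h>

/* Hypothetical wrapper: a Ruby NodeSet object holding a libxml2 xmlNodeSet
 * (an array of raw xmlNodePtr) plus the one Ruby Document it was created from. */
typedef struct {
  xmlNodeSetPtr c_node_set; /* raw node pointers, possibly into several xmlDocs */
  VALUE document;           /* the single Ruby Document this set "belongs" to */
} node_set_wrapper;

/* Sketch of the pre-fix marking strategy described above: only the NodeSet's
 * own Document is marked. Any node in c_node_set->nodeTab that belongs to a
 * *different* xmlDoc is left unprotected, so that document can be collected
 * and freed while the NodeSet still points into it. */
static void node_set_mark_only_own_document(void *ptr)
{
  node_set_wrapper *wrapper = (node_set_wrapper *)ptr;
  rb_gc_mark(wrapper->document);
}
```

The segfault then comes from reading `c_node_set->nodeTab[i]` after the other document's tree has been freed.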
I've opened an issue for exploration of the crazy idea at #2184. I'll work on a less risky fix for this bug this week.
Blurgh, option (1) above breaks the semantics of the NodeSet in that the duplicate we'd create would be unparented from the original document -- which is not going to make any sense for users who end up using NodeSets to collect the results of multiple queries across documents. I think we need to implement option (2) and accept a slightly slower mark phase. On the plus side, this is exactly what NodeSet-as-subclass-of-Array would do if we decide to eventually reimplement the class.
Specifically, this means we've added a mark callback for the NodeSet which marks each of the contained objects. Previously we skipped explicitly marking the contained objects due to the assumption that all the nodes would be from the same document as the NodeSet itself. Fixes #1952.
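For illustration, a minimal sketch of what such a mark callback could look like follows. This is not Nokogiri's actual implementation; it assumes the common libxml2-extension convention of stashing the wrapping Ruby VALUE in each node's `_private` slot, and a real extension would call it from the NodeSet's registered GC mark function.

```c
#include <libxml/xpath.h>
#include <ruby.h>

/* Illustrative mark helper: walk the underlying xmlNodeSet and mark the Ruby
 * object wrapping each contained node, so that nodes (and, transitively, the
 * documents they reference) stay alive even when they come from different
 * documents. Assumption: the wrapping VALUE is stored in node->_private;
 * nodes that were never exposed to Ruby (NULL _private) would need different
 * handling, e.g. marking their document's wrapper instead. */
static void node_set_mark_contents(xmlNodeSetPtr c_node_set)
{
  if (c_node_set == NULL) { return; }
  for (int i = 0; i < c_node_set->nodeNr; i++) {
    xmlNodePtr node = c_node_set->nodeTab[i];
    if (node != NULL && node->_private != NULL) {
      rb_gc_mark((VALUE)node->_private);
    }
  }
}
```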
PR created at #2186.
Since fixing #1952 I've wanted to revisit the valgrind suppressions to see what's left. These suppressions represent what I saw in docker images on my dev machine:
- on Ruby 2.7 and 3.0 startup (iseq_peephole_optimize)
- enumerators seem to confuse Valgrind (mark_locations_array/gc_mark_stacked_objects)
ci: skip NodeSet enumerator test in valgrind. Since fixing #1952 I've wanted to revisit the valgrind suppressions to see what's left. I'm leaving only the suppressions I see for Ruby 2.7 and 3.0 startup (iseq_peephole_optimize), and I'm skipping the NodeSet enumerator test since I know it confuses valgrind.
I'm probably overlooking a few things, but here's what I think is a minimal approach.
There's no need for a node cache or to create these Ruby references. When marking a node it should be enough to mark the Ruby objects for the node's
This seems like the best solution to me.
@nwellnhof Totally agree; right now the codebase reflects our memory management approach from ~2008, when we didn't have a good grasp of libxml2's behaviors. I'd like to implement something like what you're describing, and like what the libxml-ruby gem introduced in xml4r/libxml-ruby@c41f577. Also see my notes in #2822. It's just a matter of finding the time to do the work.
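The suggestion above is truncated in this thread, so the exact scheme isn't fully recoverable here. Purely as an illustrative sketch of this family of approaches (no separate node cache; GC marking simply follows the libxml2 tree and marks whatever Ruby wrappers are attached), a document mark function might look roughly like the following. The traversal and the `_private` convention are assumptions for illustration, not the actual code from libxml-ruby or Nokogiri.

```c
#include <libxml/tree.h>
#include <ruby.h>

/* Recursively mark the Ruby wrapper (if any) of every node in a subtree.
 * Assumption: a node that has been exposed to Ruby stores its VALUE in
 * _private; nodes never touched from Ruby have _private == NULL. */
static void mark_wrapped_nodes(xmlNodePtr node)
{
  for (; node != NULL; node = node->next) {
    if (node->_private != NULL) {
      rb_gc_mark((VALUE)node->_private);
    }
    mark_wrapped_nodes(node->children);
  }
}

/* Document mark callback: keeping the document alive also keeps alive every
 * Ruby node object attached anywhere in its tree, without a node cache. */
static void document_mark(void *ptr)
{
  xmlDocPtr doc = (xmlDocPtr)ptr;
  if (doc != NULL) {
    mark_wrapped_nodes(doc->children);
  }
}
```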
Describe the bug
To Reproduce
xml input
https://github.com/ahorek/nokogiri/blob/bug/bug/xml.xml
Expected behavior
no segfaults
Environment
latest nokogiri-1.10.5
ruby 2.7.0dev (2019-11-28T14:49:28Z trunk b5fbefbf2c) [x86_64-linux]
tested on multiple Ruby versions (2.5 to 2.7), on Linux and Windows
the script doesn't fail on JRuby