Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

jruby test failures #88

Open
flavorjones opened this issue May 5, 2015 · 18 comments
Open

jruby test failures #88

flavorjones opened this issue May 5, 2015 · 18 comments

Comments

@flavorjones
Copy link
Owner

So many test failures. What is going on?

@jkraemer
Copy link

Which basically blocks the use of Rails 4.2 with JRuby, at least if you are concerned about working html sanitization.

Most if not all of the problems seem to be Nokogiri-Java problems where it behaves different from C-Nokogiri. I started working on some Nokogiri test cases that reproduce what loofah does, see sparklemotion/nokogiri#1318, sparklemotion/nokogiri#1319 and sparklemotion/nokogiri#1320. At least the last one looks more like a Xalan-J bug...

@marutosi
Copy link

@flavorjones
Copy link
Owner Author

@headius
Copy link

headius commented Nov 12, 2017

Ping! Trying to get JRuby deps cleaned up again now that we support Rails 5.

@flavorjones
Copy link
Owner Author

@headius We (nokogiri core) need help on the JRuby implementation. I've pinged you and other members of the JRuby community a few times since November 2014 on the subject, so hopefully that's not surprising news.

Taking a look at the most recent failing pipeline:

https://ci.nokogiri.org/teams/nokogiri-core/pipelines/loofah/jobs/jruby-9.1/builds/12

most of these failures are what I'd refer to as "soft" failures, meaning that the markup is sanitized properly, but there are minor formatting differences that can be attributed to small differences in the parsing engines.

There are a few, however, that are NPEs, and for those I'd welcome PRs to Nokogiri to fix them.

Additionally, there are about 30 open issues for Nokogiri that are labeled jruby, and I'd welcome help with those as well.

@dometto
Copy link

dometto commented Oct 10, 2019

I see there are still quite some Jruby test failures. @headius @flavorjones: is Loofah currently safe to use on JRuby?

(Context: we are looking into replacing Sanitize with Loofah in gollum due to lack of JRuby support in newer versions of Sanitize.)

@dometto
Copy link

dometto commented Oct 19, 2019

Ping @flavorjones

@flavorjones
Copy link
Owner Author

@dometto If you're intent on using Loofah on JRuby, can I ask for your help investigating some of the errors being raised by Nokogiri's JRuby implementation?

@flavorjones
Copy link
Owner Author

Specifically, some errors are due to the differences in behavior between libxml2 and xerces/nekohtml. But others are indicative of issues with the JRuby implementation of Nokogiri. Differences in parser behavior are likely "safe" for some definition of safe. Errors, though, are potential Denial-of-Service vectors, and so I'm hesitant to describe Loofah as "safe" when run on JRuby.

For what it's worth, we have far fewer errors now than in 2017 thanks to the efforts of @kares, @jvshahid, and others on the Nokogiri JRuby implementation.

Here are the nine errors I can clearly identify as part of this category:

test_0001_only_scrub_subtree(Document::Node::#scrub!):
NoMethodError: undefined method `scrub!' for #<Nokogiri::XML::Element:0x49ede9c7>
    /home/flavorjones/code/oss/loofah/test/integration/test_scrubbers.rb:179:in `block in test_0001_only_scrub_subtree'

test_0001_only_scrub_subtrees(Document::NodeSet::#scrub!):
NoMethodError: undefined method `scrub!' for #<Nokogiri::XML::NodeSet:0x741f8dbe>
    /home/flavorjones/code/oss/loofah/test/integration/test_scrubbers.rb:204:in `block in test_0001_only_scrub_subtrees'

test_0007_scrubs_document_nodes(HTML):
NoMethodError: undefined method `scrub!' for #<Nokogiri::XML::Element:0x2af69643>
    /home/flavorjones/code/oss/loofah/test/unit/test_api.rb:43:in `block in test_0007_scrubs_document_nodes'

test_0009_scrubs_document_nodesets(HTML):
NoMethodError: undefined method `scrub!' for #<Nokogiri::XML::NodeSet:0x210308d5>
    /home/flavorjones/code/oss/loofah/test/unit/test_api.rb:56:in `block in test_0009_scrubs_document_nodesets'

test_should_allow_xml:lang_attribute(Html5TestSanitizer):
Java::JavaLang::NullPointerException: 
    nokogiri.XmlNamespace.createDefaultNamespace(XmlNamespace.java:167)
    nokogiri.XmlAttr.setNamespaceIfNecessary(XmlAttr.java:106)
    nokogiri.XmlNode.setNode(XmlNode.java:583)
    nokogiri.internals.NokogiriHelpers.constructNode(NokogiriHelpers.java:147)
    nokogiri.internals.NokogiriHelpers.getCachedNodeOrCreate(NokogiriHelpers.java:126)
    nokogiri.XmlNode.attribute_nodes(XmlNode.java:690)
    nokogiri.XmlNode$INVOKER$i$0$0$attribute_nodes.call(XmlNode$INVOKER$i$0$0$attribute_nodes.gen)
    org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:325)
...

test_should_allow_xml:base_attribute(Html5TestSanitizer):
Java::JavaLang::NullPointerException: 
    nokogiri.XmlNamespace.createDefaultNamespace(XmlNamespace.java:167)
    nokogiri.XmlAttr.setNamespaceIfNecessary(XmlAttr.java:106)
    nokogiri.XmlNode.setNode(XmlNode.java:583)
    nokogiri.internals.NokogiriHelpers.constructNode(NokogiriHelpers.java:147)
    nokogiri.internals.NokogiriHelpers.getCachedNodeOrCreate(NokogiriHelpers.java:126)
    nokogiri.XmlNode.attribute_nodes(XmlNode.java:690)
    nokogiri.XmlNode$INVOKER$i$0$0$attribute_nodes.call(XmlNode$INVOKER$i$0$0$attribute_nodes.gen)
    org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:325)
...

test_0059_testdata_sanitizer_xml_base(Html5TestSanitizer):
Java::JavaLang::NullPointerException: 
    nokogiri.XmlNamespace.createDefaultNamespace(XmlNamespace.java:167)
    nokogiri.XmlAttr.setNamespaceIfNecessary(XmlAttr.java:106)
    nokogiri.XmlNode.setNode(XmlNode.java:583)
    nokogiri.internals.NokogiriHelpers.constructNode(NokogiriHelpers.java:147)
    nokogiri.internals.NokogiriHelpers.getCachedNodeOrCreate(NokogiriHelpers.java:126)
    nokogiri.XmlNode.attribute_nodes(XmlNode.java:690)
    nokogiri.XmlNode$INVOKER$i$0$0$attribute_nodes.call(XmlNode$INVOKER$i$0$0$attribute_nodes.gen)
    org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:325)
...

test_should_allow_xml:space_attribute(Html5TestSanitizer):
Java::JavaLang::NullPointerException: 
    nokogiri.XmlNamespace.createDefaultNamespace(XmlNamespace.java:167)
    nokogiri.XmlAttr.setNamespaceIfNecessary(XmlAttr.java:106)
    nokogiri.XmlNode.setNode(XmlNode.java:583)
    nokogiri.internals.NokogiriHelpers.constructNode(NokogiriHelpers.java:147)
    nokogiri.internals.NokogiriHelpers.getCachedNodeOrCreate(NokogiriHelpers.java:126)
    nokogiri.XmlNode.attribute_nodes(XmlNode.java:690)
    nokogiri.XmlNode$INVOKER$i$0$0$attribute_nodes.call(XmlNode$INVOKER$i$0$0$attribute_nodes.gen)
    org.jruby.runtime.callsite.CachingCallSite.cacheAndCall(CachingCallSite.java:325)
...

test_0007_scrubs_document_nodes(XML):
NoMethodError: undefined method `scrub!' for #<Nokogiri::XML::Element:0x1b1e1f02>
    /home/flavorjones/code/oss/loofah/test/unit/test_api.rb:106:in `block in test_0007_scrubs_document_nodes'

@kares
Copy link

kares commented Oct 30, 2019

haven't tested but the NPEs should be fixed with #88 (pretty much a 'bug' in Nokogiri ext code)

UPDATE: confirmed - the Nokogiri PR #88 resolves the NPEs (4 less test failures in loofah's suite)

@dometto
Copy link

dometto commented Dec 15, 2019

UPDATE: confirmed - the Nokogiri PR #88 resolves the NPEs (4 less test failures in loofah's suite)

@kares: that links to this issue, not to a Nokogiri PR. Has the relevant PR been merged yet?

Sorry for the slow reply, but I would be happy to look into the remaining failures (NoMethodError: undefined method scrub!'`). Or is someone else tackling that already in the mean time?

@kares
Copy link

kares commented Dec 15, 2019

yy - merged but not released yet.
however you could build the nokogiri gem from master, haven't looked the scrub issue but I would check how it behaves against recent changes.

@headius
Copy link

headius commented Jan 13, 2020

Can we get a Nokogiri release soon?

@flavorjones
Copy link
Owner Author

flavorjones commented Jan 14, 2020 via email

@headius
Copy link

headius commented Jan 14, 2020

@flavorjones Oh I see. I will have a look at the milestone and see if I can help.

@headius
Copy link

headius commented Jan 14, 2020

@flavorjones I made some progress on using jar-dependencies (sparklemotion/nokogiri#1253, prerequisite for sparklemotion/nokogiri#1395) but need some assistance from @mkristian. I provided two possible patches for sparklemotion/nokogiri#1836.

I can certainly try to help with other things not marked as JRuby issues, but my capabilities will be more limited. 😀

flavorjones added a commit to sparklemotion/nokogiri that referenced this issue Jan 15, 2021
This was noticed while debugging some Loofah behavior that relied on
overriding `#initialize` to decorate nodes.

Related to flavorjones/loofah#88
flavorjones added a commit to sparklemotion/nokogiri that referenced this issue Jan 17, 2021
This was noticed while debugging some Loofah behavior that relied on
overriding `#initialize` to decorate nodes.

Related to flavorjones/loofah#88
flavorjones added a commit to sparklemotion/nokogiri that referenced this issue Jan 17, 2021
…rators

fix: Document#initialize should be called exactly once

---

**What problem is this PR intended to solve?**

Originally this PR was started to address errors in Loofah's test suite on JRuby (see flavorjones/loofah#88) related to Nokogiri object decorators not being applied correctly. The root cause of these errors was that `{XML,HTML}Document#initialize` was not being called in the JRuby implementation. Surprising! And breaks the subclassing behavior that Loofah relies on.

As I erected tests in Nokogiri's suite to make this failure obvious, I uncovered the fact that in CRuby, `Document#initialize` was actually being called twice from the `.parse` method. Even more surprising! But doesn't obviously break anything.

This PR addresses both of these issues, with the result that `Document#initialize` is called exactly once on all platforms.


**Have you included adequate test coverage?**

Yes! Thorough testing is introduced around subclassing `XML::Document`, `HTML::Document`, `XML::DocumentFragment`, and `HTML::DocumentFragment` constructor calls `.new` and `.parse`.


**Does this change affect the behavior of either the C or the Java implementations?**

Yes, but as noted above the changed behavior is now correct and consistent across the platforms.
@flavorjones
Copy link
Owner Author

Quite a few more tests will pass once the next Nokogiri release drops with sparklemotion/nokogiri#2174. Failures at that point should mostly be actual differences in parser behavior.

@flavorjones
Copy link
Owner Author

Note that the branch at #239 allows us to extend the test suite to add the JRuby output as valid test data; all that would need to be done is to go through the failing tests and determine if the output is sanitized and if so, add it to the test suite.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants