Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dom4j 2.1.1 has regression and is stripping entities #51

Open
vmassol opened this issue Oct 9, 2018 · 9 comments
Open

Dom4j 2.1.1 has regression and is stripping entities #51

vmassol opened this issue Oct 9, 2018 · 9 comments
Assignees
Labels
Milestone

Comments

@vmassol
Copy link

vmassol commented Oct 9, 2018

Hi there,

On the XWiki project we upgraded from dom4J 2.1.0 to 2.1.1 and found some regressions: dom4j is now stripping our entities from the XML.

FTR here's the jira issue on the XWiki side: https://jira.xwiki.org/browse/XWIKI-15694

Initially we thought the problem was coming from our upgrade of CSS4J and after debugging with CSS4J's author, we found that the issue is actually with dom4J 2.1.1, see https://groups.google.com/forum/#!topic/css4j/HlnlB2c8G4I

To reproduce, see the tests at https://groups.google.com/d/msg/css4j/HlnlB2c8G4I/FFGDUyVDAwAJ

Thanks

@carlosame
Copy link
Contributor

To summarize what's at my post in the css4j forum:

With dom4j 2.1.1:

  1. Removes entities that are in "ENTITIES Latin 1" but not entities in e.g. "ENTITIES Special". If you test with > or < it works.
  2. EntityResolver.resolveEntity is NOT being called.

dom4j 2.1.0 (I assume prior versions have similar behaviour):

  1. Unable to reproduce the issue.
  2. EntityResolver.resolveEntity is being called.

I added two tests and 2.1.1 is failing one of them. The easiest way to run them is to do a Maven build from git, there are instructions to do that at https://carte.sourceforge.io/css4j/.

@FilipJirsak FilipJirsak self-assigned this Oct 9, 2018
@FilipJirsak FilipJirsak added the bug label Oct 9, 2018
@FilipJirsak FilipJirsak added this to the 2.1.2 milestone Oct 9, 2018
@vmassol
Copy link
Author

vmassol commented Oct 9, 2018

So it seems dom4j has changed the default behavior, see #28

Basically they did this:

reader.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
reader.setFeature("http://xml.org/sax/features/external-general-entities", false);
reader.setFeature("http://xml.org/sax/features/external-parameter-entities", false);

So I think we now either need to override the features in XWiki or it should be done in CSS4J.

I'll test this more over the coming days.

@FilipJirsak is that correct?

@FilipJirsak
Copy link
Contributor

@vmassol Default behavior changed only for DocumentHelper.parseText(). I think this issue is related more likely with #44, but the only difference I see in the debugger is that feature http://apache.org/xml/features/nonvalidating/load-external-dtd is set to true with dom4j-2.1.0 but to falsewith dom4j-2.1.1. But when I set it to true with dom4j-2.1.1 EntityResolver.resolveEntity is not called either.

@vmassol
Copy link
Author

vmassol commented Oct 10, 2018

ok thanks @FilipJirsak for the info. I'll let you debug it then :) Let me know if you need my help for anything and thanks very much for looking into this.

@Enygma2002
Copy link

Enygma2002 commented May 7, 2019

Hi. Any progress on this? Looks like quite a big problem...

@carlosame
Copy link
Contributor

I was trying to look at this issue more closely and came up to commit 53f923a, where at lines 111 and 112 of the SAXHelper file you can read this:

        SAXHelper.setParserFeature(reader, "http://xml.org/sax/properties/external-general-entities", false);
        SAXHelper.setParserFeature(reader, "http://xml.org/sax/properties/external-parameter-entities", false);

Those lines have little effect, as the names of the features aren't those: they are http://xml.org/sax/features/external-general-entities and http://xml.org/sax/features/external-parameter-entities, note the switch /properties/features/ (see https://xerces.apache.org/xerces-j/features.html ).

So the patch for #28 is unlikely to be the cause of the problem, as it does nothing.

@carlosame
Copy link
Contributor

The problem is due to #44, specifically line 115 in SAXHelper:

// external DTD
SAXHelper.setParserFeature(reader,"http://apache.org/xml/features/nonvalidating/load-external-dtd", false);

IMHO that line should not be there, as load-external-dtd: false is somewhat counter-intuitive as a default.

@vmassol
Copy link
Author

vmassol commented May 20, 2019

Thanks a lot @carlosame for digging into this. Let's hope that the Dom4J devs can fix this so that we can upgrade the Dom4J version used in XWiki ;) (we'll try the workaround suggested by @carlosame in the meantime but we'd rather get rid of it ASAP). Thanks :)

@vmassol
Copy link
Author

vmassol commented Dec 5, 2019

@FilipJirsak Hi. Any idea how to progress on this? Thx

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Development

No branches or pull requests

4 participants