parse XSLT::Stylesheet using the libxslt-preferred options #1940

Closed
menafkasap opened this issue Nov 7, 2019 · 2 comments

@menafkasap

The content in CDATA isn't affected by disable-output-escaping

The two files below produce different output under xsltproc and Nokogiri.

FILES

test.xsl

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text" omit-xml-declaration="yes" />
    <xsl:template match="/">
        <xsl:text disable-output-escaping="yes"><![CDATA[<>]]>
</xsl:text>
    </xsl:template>
</xsl:stylesheet>

test.xml

<t></t>

test.rb

#!/usr/bin/env ruby

require 'nokogiri'

xslt  = Nokogiri::XSLT(File.read(ARGV[0]))
xml = Nokogiri::XML(File.read(ARGV[1]))

puts xslt.transform(xml)

OUTPUTS

xsltproc test.xsl test.xml

<>

./test.rb test.xsl test.xml

<?xml version="1.0"?>
&lt;&gt;
@flavorjones
Member

@menafkasap Thanks for reporting this issue, and apologies for my slow reply.

I've tracked this down to the parse options used for the XSL file. xsltproc uses the C constant XSLT_PARSE_OPTIONS (which is NOENT | DTDLOAD | DTDATTR | NOCDATA), while Nokogiri uses XML::ParseOptions::DEFAULT_XML (which is RECOVER | NONET). Specifically, the behavioral difference you're seeing comes from the presence or absence of NOCDATA.

I think what we should be doing here is using these same options for the XSL files within Nokogiri (e.g., in the XSLT.parse method).

@flavorjones added this to the v1.11.0 milestone on Feb 26, 2020
@flavorjones changed the title from "Escaping CDATA in output" to "parse XSLT::Stylesheet using the libxslt-preferred options" on Feb 26, 2020
@flavorjones
Member

PR submitted at #2221. Will land in v1.12.x.

flavorjones added a commit that referenced this issue Apr 21, 2021
…t-using-preferred-options

parse XSLT stylesheets using libxslt-preferred options

---

**What problem is this PR intended to solve?**

See #1940, where some XSL transformations produced slightly different results from `xsltproc`.

**Have you included adequate test coverage?**

Yes!

**Does this change affect the behavior of either the C or the Java implementations?**

This behavior appears to be libxslt-specific, so the test runs only on CRuby. The stylesheet document is nonetheless parsed with the new options on both implementations.