How to ensure well-formed HTML5? #1243

Closed
tkrotoff opened this issue Feb 16, 2015 · 1 comment


@tkrotoff

HTML5 allows the integration of SVG and MathML and this seems to be unsupported by Nokogiri:

require 'nokogiri'

html = <<-EOXML
<!DOCTYPE html>
<html>
  <head>
    <title>Test</title>
  </head>
  <body>
    <a title="RSS" href="/feed.xml">
      <svg class="icon icon-rss" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 8 8">
        <path class="background" fill="#f28c36"
              d="M 1.5,0 C 0.669,0 0,0.669 0,1.5 l 0,5 C 0,7.331 0.669,8 1.5,8 l 5,0 C 7.331,8 8,7.331 8,6.5 l 0,-5 C 8,0.669 7.331,0 6.5,0 l -5,0 z M 1,1 A 6,6 0 0 1 7,7 L 6,7 A 5,5 0 0 0 1,2 L 1,1 z M 1,3 A 4,4 0 0 1 5,7 L 4,7 A 3,3 0 0 0 1,4 L 1,3 z M 2,5 C 2.5522847,5 3,5.4477153 3,6 3,6.5522847 2.5522847,7 2,7 1.4477153,7 1,6.5522847 1,6 1,5.4477153 1.4477153,5 2,5 z"></path>
        <path class="foreground" fill="white"
              d="M 1,1 1,2 A 5,5 0 0 1 6,7 L 7,7 A 6,6 0 0 0 1,1 z M 1,3 1,4 A 3,3 0 0 1 4,7 L 5,7 A 4,4 0 0 0 1,3 z M 2,5 C 1.4477153,5 1,5.4477153 1,6 1,6.5522847 1.4477153,7 2,7 2.5522847,7 3,6.5522847 3,6 3,5.4477153 2.5522847,5 2,5 z"></path>
      </svg>
    </a>
  </body>
</html>
EOXML

doc = Nokogiri::HTML html

puts doc.errors

Output:

Tag svg invalid
Tag path invalid
Tag path invalid

(This HTML5 code is valid according to http://validator.w3.org/check)
(I'm using Nokogiri through HTML::Proofer for a Jekyll website)

@flavorjones
Member

Nokogiri's underlying HTML parsers (libxml2 for CRuby, nekoHTML for JRuby) are HTML4 parsers, so for some time there hasn't been much we could do to help with HTML5 other than recommend the Nokogumbo gem, which extends Nokogiri's API and provides an HTML5 parser.

I'm happy to let you know that #2204 is driving the merger of Nokogumbo and its HTML5 parser into Nokogiri, so Nokogiri v1.12 will support HTML5 once it is released. Please follow that issue for status updates.
