Parsing HTML5 with <time> element generates error #800

Closed
gkellogg opened this issue Nov 29, 2012 · 1 comment


@gkellogg

Running Nokogiri 1.5.5 on Ruby 1.9.3p327, try the following:

Nokogiri::HTML.parse(%{<time datetime="2011-11-28"></time>}).errors

It generates the following:

[#<Nokogiri::XML::SyntaxError: Tag time invalid>] 

This worked in a previous version. The time element is valid HTML5, although at one time it was removed from the spec. Any variation using <time> fails.

@flavorjones
Member

Nokogiri's underlying HTML parsers (libxml2 for CRuby, nekoHTML for JRuby) are HTML4 parsers, and so for some time there hasn't been much we can do to help with HTML5 other than to recommend that people use the Nokogumbo gem, which extends Nokogiri's API and provides an HTML5 parser.

I'm happy to let you know that #2204 is driving the merger of Nokogumbo and its HTML5 parser, and so Nokogiri v1.12 will support HTML5 once it is released. Please follow that issue for status updates.
