This repository has been archived by the owner on Aug 26, 2023. It is now read-only.

Changelog

All notable changes to Nokogumbo will be documented in this file.

The format is based on Keep a Changelog and this project adheres to Semantic Versioning.

[Unreleased]

Added

  • Experimental support for errors (it was supported in 1.5.0 but undocumented).
  • Added proper HTML5 serialization.
  • Added option :max_errors to control the maximum number of errors reported by #errors.
  • Added option :max_tree_depth to control the maximum parse tree depth.
  • Line number support via Nokogiri::XML::Node#line as long as Nokogumbo has been compiled with libxml2 support.

Changed

  • Integrated Gumbo parser into Nokogumbo. A system version will not be used.
  • The undocumented (but publicly mentioned) :max_parse_errors option has been renamed to :max_errors; :max_parse_errors is deprecated and will be removed.
  • The various #parse and #fragment (and Nokogiri.HTML5) methods return Nokogiri::HTML5::Document and Nokogiri::HTML5::DocumentFragment classes rather than Nokogiri::HTML::Document and Nokogiri::HTML::DocumentFragment.
  • Changed the top-level API to more closely match Nokogiri's while maintaining backwards compatibility. The new APIs are:
    • Nokogiri::HTML5(html, url = nil, encoding = nil, **options, &block)
    • Nokogiri::HTML5.parse(html, url = nil, encoding = nil, **options, &block)
    • Nokogiri::HTML5::Document.parse(html, url = nil, encoding = nil, **options, &block)
    • Nokogiri::HTML5.fragment(html, encoding = nil, **options)
    • Nokogiri::HTML5::DocumentFragment.parse(html, encoding = nil, **options)
    • Nokogiri::HTML5::DocumentFragment.new(document, html = nil, ctx = nil)
    • Nokogiri::HTML5::Document#fragment(html = nil)
    • Nokogiri::XML::Node#fragment(html = nil)
    In all cases, html can be a string or an IO object (something that responds to #read). The url parameter is used only for error reporting, as in Nokogiri. The encoding parameter only signals what encoding html should have on input; the output Document or DocumentFragment will be in UTF-8. Currently, the only supported option is :max_errors, which controls the maximum number of errors reported by #errors.
  • Minimum supported version of Ruby changed to 2.1.
  • Minimum supported version of Nokogiri changed to 1.8.0.
  • Nokogiri::HTML5::DocumentFragment#errors returns errors for the document fragment itself, not the underlying document.
  • The five XML namespaces described in the HTML spec (MathML, SVG, XLink, XML, and XMLNS) are now supported. Thus <svg> will create an svg element in the SVG namespace and <math> will create a math element in the MathML namespace. An attribute xml:lang=en, for example, will create a lang attribute in the XML namespace, but only in foreign elements (i.e., those in the SVG or MathML namespaces). On HTML elements, this creates an attribute with the name xml:lang. This changes the #xpath and related APIs.

Deprecated

  • :max_parse_errors; use :max_errors
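For callers that still pass the deprecated key, one possible migration shim is sketched below. The helper migrate_parse_options is hypothetical and not part of Nokogumbo; it only rewrites an options hash before that hash is handed to the parser.

```ruby
# Hypothetical helper, not part of Nokogumbo: maps the deprecated
# :max_parse_errors key onto :max_errors without mutating the input hash.
def migrate_parse_options(options)
  migrated = options.dup
  if migrated.key?(:max_parse_errors)
    warn ":max_parse_errors is deprecated; use :max_errors"
    value = migrated.delete(:max_parse_errors)
    # An explicitly supplied :max_errors takes precedence over the old key.
    migrated[:max_errors] = value unless migrated.key?(:max_errors)
  end
  migrated
end

opts = migrate_parse_options({ max_parse_errors: 5 })
puts opts.inspect
```

The resulting hash can then be splatted into a parse call, for example Nokogiri::HTML5(html, **opts).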

Removed

Fixed

  • Fixed documents failing to serialize (via to_html) if they contain certain meta elements that set the charset.
  • Documents are now properly marked as UTF-8 after parsing.
  • Fixed Nokogiri::HTML5.fragment reporting an error due to a missing <!DOCTYPE html>.
  • Fixed crash when input contains U+0000 NULL bytes and error reporting is enabled.

Security

  • The most recent, released version of Gumbo has a potential security issue that could result in a cross-site scripting vulnerability. This has been fixed by integrating Gumbo into Nokogumbo.