Merge pull request #2087 from ashmaroli/reduce-allocations-from-hack
Reduce allocations from hack in document_fragment
flavorjones committed Feb 6, 2021
2 parents d5c053a + be2be40 commit 01d4eaf
Showing 2 changed files with 16 additions and 15 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -15,6 +15,7 @@ Nokogiri follows [Semantic Versioning](https://semver.org/), please see the [REA

 ### Improved

+* Reduce the number of object allocations needed when parsing an HTML::DocumentFragment. [[#2087](https://github.com/sparklemotion/nokogiri/issues/2087)] (Thanks, [@ashmaroli](https://github.com/ashmaroli)!)
 * [JRuby] Update the algorithm used to calculate `Node#line` to be wrong less-often. The underlying parser, Xerces, does not track line numbers, and so we've always used a hacky solution for this method. [[#1223](https://github.com/sparklemotion/nokogiri/issues/1223)]


30 changes: 15 additions & 15 deletions lib/nokogiri/html/document_fragment.rb
@@ -4,26 +4,26 @@ module HTML
     class DocumentFragment < Nokogiri::XML::DocumentFragment
       ####
       # Create a Nokogiri::XML::DocumentFragment from +tags+, using +encoding+
-      def self.parse tags, encoding = nil
+      def self.parse(tags, encoding = nil)
         doc = HTML::Document.new

         encoding ||= if tags.respond_to?(:encoding)
-          encoding = tags.encoding
-          if encoding == ::Encoding::ASCII_8BIT
-            'UTF-8'
-          else
-            encoding.name
-          end
-        else
-          'UTF-8'
-        end
+                       encoding = tags.encoding
+                       if encoding == ::Encoding::ASCII_8BIT
+                         'UTF-8'
+                       else
+                         encoding.name
+                       end
+                     else
+                       'UTF-8'
+                     end

         doc.encoding = encoding

         new(doc, tags)
       end

-      def initialize document, tags = nil, ctx = nil
+      def initialize(document, tags = nil, ctx = nil)
         return self unless tags

         if ctx
@@ -33,13 +33,13 @@ def initialize document, tags = nil, ctx = nil
           self.errors = document.errors - preexisting_errors
         else
           # This is a horrible hack, but I don't care
-          if tags.strip =~ /^<body/i
-            path = "/html/body"
+          path = if /^\s*?<body/i.match?(tags)
+                   "/html/body"
                  else
-            path = "/html/body/node()"
+                   "/html/body/node()"
                  end

-          temp_doc = HTML::Document.parse "<html><body>#{tags}", nil, document.encoding
+          temp_doc = HTML::Document.parse("<html><body>#{tags}", nil, document.encoding)
           temp_doc.xpath(path).each { |child| child.parent = self }
           self.errors = temp_doc.errors
         end
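
Note on the `<body` check: the sketch below is one way to compare the two forms of the test outside Nokogiri. The helper names (`starts_with_body_old?`, `starts_with_body_new?`, `allocations`) and the sample inputs are hypothetical and exist only for illustration. It assumes Ruby 2.4 or later, where `Regexp#match?` is available. The two forms agree on the samples shown, but they are not guaranteed to agree on every multi-line input, because `^` matches at the start of any line in Ruby.

```ruby
# Illustrative stand-ins for the two versions of the check. These helper
# names are hypothetical and are not part of Nokogiri.
def starts_with_body_old?(tags)
  # String#strip builds a new String; =~ also creates a MatchData on a match.
  !(tags.strip =~ /^<body/i).nil?
end

def starts_with_body_new?(tags)
  # Regexp#match? returns true or false without populating $~.
  /^\s*?<body/i.match?(tags)
end

# Counts the objects allocated while the block runs.
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# Sample inputs chosen for illustration only.
SAMPLES = ["<body><p>hi</p></body>", "  <BODY class='x'>", "<div>hi</div>", ""].freeze

SAMPLES.each do |tags|
  old_count = allocations { starts_with_body_old?(tags) }
  new_count = allocations { starts_with_body_new?(tags) }
  puts "#{tags.inspect}: old=#{old_count} new=#{new_count}"
end
```

The exact counts depend on the Ruby version and build, so the script prints them instead of assuming particular numbers.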