Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

attribute.value = attribute.value changes output #1800

Closed
bananarne opened this issue Sep 20, 2018 · 8 comments
Closed

attribute.value = attribute.value changes output #1800

bananarne opened this issue Sep 20, 2018 · 8 comments
Assignees
Milestone

Comments

@bananarne
Copy link

bananarne commented Sep 20, 2018

What problems are you experiencing?
doing attribute.value = attribute.value makes it "boolean"/"minimized"

What's the output from nokogiri -v?

Nokogiri (1.8.4)

---
warnings: []
nokogiri: 1.8.4
ruby:
  version: 2.4.4
  platform: x86_64-linux
  description: ruby 2.4.4p296 (2018-03-28 revision 63013) [x86_64-linux]
  engine: ruby
libxml:
  binding: extension
  source: packaged
  libxml2_path: "/usr/local/lib64/ruby/gems/2.4.0/gems/nokogiri-1.8.4/ports/x86_64-pc-linux-gnu/libxml2/2.9.8"
  libxslt_path: "/usr/local/lib64/ruby/gems/2.4.0/gems/nokogiri-1.8.4/ports/x86_64-pc-linux-gnu/libxslt/1.1.32"
  libxml2_patches:
  - 0001-Revert-Do-not-URI-escape-in-server-side-includes.patch
  libxslt_patches: []
  compiled: 2.9.8
  loaded: 2.9.8

Can you provide a self-contained script that reproduces what you're seeing?

doc = Nokogiri::HTML('<!doctype html><html><body><img alt=""></body></html>')
doc.xpath('//*/@*').each { |a| a.value = a.value } # this is the code that changes something while it shouldn't?
doc.to_html
@flavorjones flavorjones added this to the next milestone Dec 1, 2018
@flavorjones
Copy link
Member

Thanks for reporting this! Certainly weird behavior. I dug in a bit and discovered this doesn't happen if we assign to the attribute via #[]=:

doc = Nokogiri::HTML('<!doctype html><html><body><div alt="x" foo="x"></div></body></html>')
puts doc.to_html
# => <html><body><div alt="x" foo="x"></div></body></html>

doc.at_css("div")["alt"] = ""
doc.at_css("div")["foo"] = ""

puts doc.to_html
# => <html><body><div alt="" foo=""></div></body></html>

but does happen if we assign an empty string to the attribute value:

doc = Nokogiri::HTML('<!doctype html><html><body><div alt="x" foo="x"></div></body></html>')
puts doc.to_html
# => <html><body><div alt="x" foo="x"></div></body></html>

doc.xpath('//*/@*').each do |a|
  a.value = ""
end

puts doc.to_html
# => <html><body><div alt foo></div></body></html>

This appears to narrow down the behavior to the #value= method on Xml::Attr, so I'll go look there.

@flavorjones flavorjones self-assigned this Dec 2, 2018
flavorjones added a commit that referenced this issue Dec 2, 2018
previously was rendered as a boolean attribute. also allow for boolean
attributes by setting value to `nil`.

Fixes #1800.
@flavorjones
Copy link
Member

PR is in #1827, going through CI now.

@flavorjones
Copy link
Member

I'll note that that PR works like this:

#! /usr/bin/env ruby

require 'nokogiri'

doc = Nokogiri::HTML('<!doctype html><html><body><div bar="x" foo="x"></div></body></html>')
puts doc.to_html
# => <html><body><div bar="x" foo="x"></div></body></html>

doc.at_css("div").tap do |node|
  node.attributes['bar'].value = nil
  node.attributes['foo'].value = ""
end

puts doc.to_html
# => <html><body><div bar foo=""></div></body></html>

@flavorjones
Copy link
Member

Will be fixed in the next release, hopefully in the next few days.

@bananarne
Copy link
Author

Wow, thanks!

@brendon
Copy link

brendon commented Aug 5, 2019

Sorry to slightly resurrect this. Sometimes it's desirable to add a new attribute that has no value. The only way I can see to do this is:

node['bar'] = 'anything'
node.attributes['bar'].value = nil

Is there a better way?

@flavorjones
Copy link
Member

@brendon The second way you mention is what Nokogiri currently supports:

node.attributes['bar'].value = nil

If you're thinking that's not a great syntax to have to type, I agree with you, but it's what we're stuck with now for fear of introducing a backwards-incompatible change to the API.

FWIW, if this is something you do often, and you're up for it, I'd be open to a PR proposing a new method to do this. Something like Node#add_empty_attribute seems like a reasonable name.

@brendon
Copy link

brendon commented Sep 29, 2019

Hi @flavorjones, that sounds like a great plan. Thank you for being open to that possibility.

EDIT: I've not had the time to pursue this. I've only had to do it once and no one else seems to have had the need so I'll leave it for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants