Check Encodings before calling force_encoding in Addressable::URI #341

Merged
merged 7 commits into from Aug 18, 2022
40 changes: 20 additions & 20 deletions lib/addressable/uri.rb
@@ -899,7 +899,7 @@ def normalized_scheme
         end
       end
       # All normalized values should be UTF-8
-      @normalized_scheme.force_encoding(Encoding::UTF_8) if @normalized_scheme
+      force_utf8_encoding_if_needed(@normalized_scheme)
       @normalized_scheme
     end

@@ -954,7 +954,7 @@ def normalized_user
         end
       end
       # All normalized values should be UTF-8
-      @normalized_user.force_encoding(Encoding::UTF_8) if @normalized_user
+      force_utf8_encoding_if_needed(@normalized_user)
       @normalized_user
     end

@@ -1011,9 +1011,7 @@ def normalized_password
         end
       end
       # All normalized values should be UTF-8
-      if @normalized_password
-        @normalized_password.force_encoding(Encoding::UTF_8)
-      end
+      force_utf8_encoding_if_needed(@normalized_password)
       @normalized_password
     end

@@ -1081,9 +1079,7 @@ def normalized_userinfo
         end
       end
       # All normalized values should be UTF-8
-      if @normalized_userinfo
-        @normalized_userinfo.force_encoding(Encoding::UTF_8)
-      end
+      force_utf8_encoding_if_needed(@normalized_userinfo)
       @normalized_userinfo
     end

@@ -1150,9 +1146,7 @@ def normalized_host
         end
       end
       # All normalized values should be UTF-8
-      if @normalized_host && !@normalized_host.empty?
-        @normalized_host.force_encoding(Encoding::UTF_8)
-      end
+      force_utf8_encoding_if_needed(@normalized_host)
       @normalized_host
     end

@@ -1270,9 +1264,7 @@ def normalized_authority
         authority
       end
       # All normalized values should be UTF-8
-      if @normalized_authority
-        @normalized_authority.force_encoding(Encoding::UTF_8)
-      end
+      force_utf8_encoding_if_needed(@normalized_authority)
       @normalized_authority
     end

@@ -1506,7 +1498,7 @@ def normalized_site
         site_string
       end
       # All normalized values should be UTF-8
-      @normalized_site.force_encoding(Encoding::UTF_8) if @normalized_site
+      force_utf8_encoding_if_needed(@normalized_site)
       @normalized_site
     end

@@ -1569,7 +1561,7 @@ def normalized_path
         result
       end
       # All normalized values should be UTF-8
-      @normalized_path.force_encoding(Encoding::UTF_8) if @normalized_path
+      force_utf8_encoding_if_needed(@normalized_path)
       @normalized_path
     end

@@ -1645,7 +1637,7 @@ def normalized_query(*flags)
         component == "" ? nil : component
       end
       # All normalized values should be UTF-8
-      @normalized_query.force_encoding(Encoding::UTF_8) if @normalized_query
+      force_utf8_encoding_if_needed(@normalized_query)
       @normalized_query
     end

@@ -1841,9 +1833,7 @@ def normalized_fragment
         component == "" ? nil : component
       end
       # All normalized values should be UTF-8
-      if @normalized_fragment
-        @normalized_fragment.force_encoding(Encoding::UTF_8)
-      end
+      force_utf8_encoding_if_needed(@normalized_fragment)
       @normalized_fragment
     end

@@ -2556,5 +2546,15 @@ def remove_composite_values
       remove_instance_variable(:@uri_string) if defined?(@uri_string)
       remove_instance_variable(:@hash) if defined?(@hash)
     end
+
+    ##
+    # Converts the string to be UTF-8 if it is not already UTF-8
+    #
+    # @api private
+    def force_utf8_encoding_if_needed(str)
+      if str && str.encoding != Encoding::UTF_8
+        str.force_encoding(Encoding::UTF_8)
+      end
+    end
   end
 end
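
To see why the guard matters, here is a minimal sketch, assuming a standalone copy of the helper (in the diff it is a private instance method on Addressable::URI). String#force_encoding counts as a modification, so it raises on a frozen string even when the string is already tagged as UTF-8; checking the encoding first skips the call in that case.

```ruby
# Standalone copy of the helper for illustration only; in the PR it is a
# private instance method on Addressable::URI.
def force_utf8_encoding_if_needed(str)
  if str && str.encoding != Encoding::UTF_8
    str.force_encoding(Encoding::UTF_8)
  end
end

# A frozen string that is already tagged as UTF-8.
frozen_utf8 = String.new("example.com", encoding: Encoding::UTF_8).freeze

# Unguarded call: raises FrozenError (a plain RuntimeError before Ruby 2.5).
unguarded_error =
  begin
    frozen_utf8.force_encoding(Encoding::UTF_8)
    nil
  rescue RuntimeError => e
    e
  end

# Guarded call: the encoding already matches, so the string is left untouched.
guarded_error =
  begin
    force_utf8_encoding_if_needed(frozen_utf8)
    nil
  rescue RuntimeError => e
    e
  end

# An unfrozen binary string is still relabelled as UTF-8.
binary = String.new("example.com", encoding: Encoding::BINARY)
force_utf8_encoding_if_needed(binary)

puts unguarded_error.class
puts guarded_error.inspect
puts binary.encoding
```

The guard only helps when the encoding already matches: a frozen string with a different encoding would still raise inside the helper.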
68 changes: 68 additions & 0 deletions spec/addressable/uri_spec.rb
@@ -998,6 +998,74 @@ def to_s
   end
 end

+describe Addressable::URI, "when normalized and then deeply frozen" do
+  before do
+    @uri = Addressable::URI.parse(
+      "http://user:password@example.com:8080/path?query=value#fragment"
+    ).normalize!
+
+    @uri.instance_variables.each do |var|
+      @uri.instance_variable_set(var, @uri.instance_variable_get(var).freeze)
+    end
+
+    @uri.freeze
+  end
+
+  it "#normalized_scheme should not error" do
+    expect { @uri.normalized_scheme }.not_to raise_error
+  end
+
+  it "#normalized_user should not error" do
+    expect { @uri.normalized_user }.not_to raise_error
+  end
+
+  it "#normalized_password should not error" do
+    expect { @uri.normalized_password }.not_to raise_error
+  end
+
+  it "#normalized_userinfo should not error" do
+    expect { @uri.normalized_userinfo }.not_to raise_error
+  end
+
+  it "#normalized_host should not error" do
+    expect { @uri.normalized_host }.not_to raise_error
+  end
+
+  it "#normalized_authority should not error" do
+    expect { @uri.normalized_authority }.not_to raise_error
+  end
+
+  it "#normalized_port should not error" do
+    expect { @uri.normalized_port }.not_to raise_error
+  end
+
+  it "#normalized_site should not error" do
+    expect { @uri.normalized_site }.not_to raise_error
+  end
+
+  it "#normalized_path should not error" do
+    expect { @uri.normalized_path }.not_to raise_error
+  end
+
+  it "#normalized_query should not error" do
+    expect { @uri.normalized_query }.not_to raise_error
+  end
+
+  it "#normalized_fragment should not error" do
+    expect { @uri.normalized_fragment }.not_to raise_error
+  end
+
+  it "should be frozen" do
+    expect(@uri).to be_frozen
+  end
+
+  it "should not allow destructive operations" do
+    ruby_check = Gem::Version.new(RUBY_VERSION) < Gem::Version.new("2.5.0")
+    error = ruby_check ? RuntimeError : FrozenError

Owner

I'd prefer to have this test not care at all which error type is raised.

Collaborator

Not sure I'm following... the spec says "should not allow destructive operations" and we expect @uri.normalize! to raise RuntimeError/FrozenError (can't modify frozen String). Is that because you shouldn't be able to use #normalize! on a frozen Addressable::URI? Maybe we can make that even clearer in the test? (The @uri.freeze happens way up in the before block, so it is easy to miss.)

Maybe Addressable::URI#normalize! should detect that the URI is frozen and raise an error defined by Addressable? (FrozenURIError perhaps?)


Ah, this particular spec doesn't test the changes made in this branch. I ran this against Addressable 2.8.0:

# Ruby 2.4.3

irb(main):005:0* @uri = Addressable::URI.parse("http://user:password@example.com:8080/path?query=value#fragment")
=> #<Addressable::URI:0x3ff6e204dd78 URI:http://user:password@example.com:8080/path?query=value#fragment>
irb(main):006:0> @uri.freeze
=> #<Addressable::URI:0x3ff6e204dd78 URI:http://user:password@example.com:8080/path?query=value#fragment>
irb(main):007:0> @uri.normalize!
RuntimeError: can't modify frozen Addressable::URI
	from /Users/dentarg/.gem/ruby/2.4.3/gems/addressable-2.8.0/lib/addressable/uri.rb:2518:in `remove_instance_variable'
	from /Users/dentarg/.gem/ruby/2.4.3/gems/addressable-2.8.0/lib/addressable/uri.rb:2518:in `block in replace_self'
	from /Users/dentarg/.gem/ruby/2.4.3/gems/addressable-2.8.0/lib/addressable/uri.rb:2516:in `each'
	from /Users/dentarg/.gem/ruby/2.4.3/gems/addressable-2.8.0/lib/addressable/uri.rb:2516:in `replace_self'
	from /Users/dentarg/.gem/ruby/2.4.3/gems/addressable-2.8.0/lib/addressable/uri.rb:2211:in `normalize!'
	from (irb):7
	from /Users/dentarg/.rubies/2.4.3/bin/irb:11:in `<main>'

# Ruby 3.1.2

irb(main):005:0> @uri = Addressable::URI.parse("http://user:password@example.com:8080/path?query=value#fragment")
=> #<Addressable::URI:0xb43c URI:http://user:password@example.com:8080/path?query=value#fragment>
irb(main):006:0>
irb(main):007:0> @uri.freeze
=> #<Addressable::URI:0xb43c URI:http://user:password@example.com:8080/path?query=value#fragment>
irb(main):008:0> @uri.normalize!
/Users/dentarg/.gem/ruby/3.1.2/gems/addressable-2.8.0/lib/addressable/uri.rb:2518:in `remove_instance_variable': can't modify frozen Addressable::URI: #<Addressable::URI:0xb43c URI:http://user:password@example.com:8080/path?query=value#fragment> (FrozenError)
	from /Users/dentarg/.gem/ruby/3.1.2/gems/addressable-2.8.0/lib/addressable/uri.rb:2518:in `block in replace_self'
	from /Users/dentarg/.gem/ruby/3.1.2/gems/addressable-2.8.0/lib/addressable/uri.rb:2516:in `each'
	from /Users/dentarg/.gem/ruby/3.1.2/gems/addressable-2.8.0/lib/addressable/uri.rb:2516:in `replace_self'
	from /Users/dentarg/.gem/ruby/3.1.2/gems/addressable-2.8.0/lib/addressable/uri.rb:2211:in `normalize!'
	from (irb):8:in `<main>'
	from /Users/dentarg/.rubies/3.1.2/lib/ruby/gems/3.1.0/gems/irb-1.4.1/exe/irb:11:in `<top (required)>'
	from /Users/dentarg/.rubies/3.1.2/bin/irb:25:in `load'
	from /Users/dentarg/.rubies/3.1.2/bin/irb:25:in `<main>'

Perhaps we shouldn't change this behaviour as I suggested above.


@sporkmonger should we just check the error message, expecting "can't modify frozen Addressable::URI"? Messages can change between Ruby versions too, but that is probably unlikely in this case IMHO.

Owner

Mostly I'd prefer to have the test make the same checks regardless of Ruby version. If the error message is consistent between versions but the type is not, it seems fine to check the error message, but if that's inconsistent too, I think it's fine to verify that an error, any error, has been thrown.

I think I'd rather not do a custom error type because I think if we did a custom type it should probably inherit from FrozenError on versions of Ruby that have it, and that just shifts the problem out of the tests and into the library itself. I'm not sure that's an improvement.
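
To make the concern concrete, here is a rough sketch of what such a custom type might look like. Both `FROZEN_BASE` and `FrozenURIError` are hypothetical names taken from the suggestion earlier in this thread; neither exists in Addressable. The superclass has to be picked per Ruby version, which is the same version check the spec was trying to avoid, now living in the library.

```ruby
# Hypothetical: neither this constant nor this class exists in Addressable.
# FrozenError was added in Ruby 2.5, so older versions fall back to RuntimeError.
FROZEN_BASE = defined?(FrozenError) ? FrozenError : RuntimeError

class FrozenURIError < FROZEN_BASE
end

# Callers that rescue RuntimeError would still catch it on every version.
caught =
  begin
    raise FrozenURIError, "can't modify frozen Addressable::URI"
  rescue RuntimeError => e
    e
  end

puts caught.class
puts caught.message
```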

Collaborator

We can have the test make the same checks regardless of Ruby version. If we look at the hierarchy of exceptions at https://ruby-doc.org/core-2.6.5/Exception.html we can see that FrozenError is a subclass of RuntimeError. That should never change really. We can use #is_a? and always check if it is a RuntimeError:

irb(main):007:1* ex = begin
irb(main):008:1*   raise FrozenError, "can't modify frozen String"
irb(main):009:1* rescue => e
irb(main):010:1*   e
irb(main):011:0> end
=> #<FrozenError: can't modify frozen String>
irb(main):012:0> ex.inspect
=> "#<FrozenError: can't modify frozen String>"
irb(main):013:0> ex.is_a?(RuntimeError)
=> true
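
The same check can be reproduced without Addressable. The sketch below uses a hypothetical `CachedValue` class (not part of the library) whose `reset!` clears a cached instance variable, the same step that fails in the backtraces earlier in this thread. Testing the rescued exception against RuntimeError gives one assertion that holds on every Ruby version.

```ruby
# Hypothetical stand-in for Addressable::URI, for illustration only.
class CachedValue
  def initialize(raw)
    @raw = raw
    @normalized = raw.downcase
  end

  # Clears the cached value, like the remove_instance_variable call
  # in the backtraces; this modifies the object, so it fails when frozen.
  def reset!
    remove_instance_variable(:@normalized) if defined?(@normalized)
    self
  end
end

value = CachedValue.new("HTTP").freeze

raised =
  begin
    value.reset!
    nil
  rescue StandardError => e
    e
  end

# FrozenError on Ruby 2.5+, RuntimeError before; both pass this check.
puts raised.is_a?(RuntimeError)
```

In the spec, this corresponds to passing RuntimeError to raise_error, since that matcher also accepts subclasses of the given class.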

Collaborator

(We don't need to check the message)

Contributor

Are there any other changes required? I've implemented the above changes.

Collaborator

Looks good now, thanks!

+    expect { @uri.normalize! }.to raise_error(error)
+  end
+end
+
 describe Addressable::URI, "when created from string components" do
   before do
     @uri = Addressable::URI.new(