Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ReDOS when using a variable path and query during extract #364

Open
ebenoist opened this issue Oct 30, 2019 · 2 comments
Open

ReDOS when using a variable path and query during extract #364

ebenoist opened this issue Oct 30, 2019 · 2 comments

Comments

@ebenoist
Copy link

ebenoist commented Oct 30, 2019

Template Used

Addressable::Template.new("{scheme}://{host}{/path*}{?query*}")

When the template above extracts a url with an empty query string, the regex used displays immense performance degradation.

template.extract("http://host.com/abcdefghijklmnopqrstuvwxyz?a")

All the time spent (~12 seconds on a modern Macbook Pro) is spent attempting to match the uri with the generated expansion regex.

Bad Regex

/(?-mix:^(?:|(?<scheme>(?:[a-zA-Z0-9\-\.\_\~]|%[a-fA-F0-9][a-fA-F0-9])*?)?):\/\/(?:|(?<host>(?:[a-zA-Z0-9\-\.\_\~]|%[a-fA-F0-9][a-fA-F0-9])*?)?)(?:|\/(?<path>(?:[a-zA-Z0-9\-\.\_\~]|%[a-fA-F0-9][a-fA-F0-9])*?(?:\/?(?:[a-zA-Z0-9\-\.\_\~]|%[a-fA-F0-9][a-fA-F0-9])*?)*)?)(?:|\?(?<query>(?:[a-zA-Z0-9\-\.\_\~]|%[a-fA-F0-9][a-fA-F0-9])*=(?:[a-zA-Z0-9\-\.\_\~]|%[a-fA-F0-9][a-fA-F0-9])*?(?:&?(?:[a-zA-Z0-9\-\.\_\~]|%[a-fA-F0-9][a-fA-F0-9])*=(?:[a-zA-Z0-9\-\.\_\~]|%[a-fA-F0-9][a-fA-F0-9])*?)*)?)$)/

Reproduction

require "addressable"
require "benchmark"

puts "*" * 30
puts "path with expansion - {scheme}://{host}{/path*}{?query*}"
puts "*" * 30
template = Addressable::Template.new("{scheme}://{host}{/path*}{?query*}")

puts "with one part"
puts Benchmark.measure {
  puts template.extract("http://host.com/abcdefghijklmnopqrstuvwxyz")
}

puts "with one part and empty query - reDDOS"
puts Benchmark.measure {
  puts template.extract("http://host.com/abcdefghijklmnopqrstuvwxyz?a")
}

puts "expand with one part"
puts Benchmark.measure {
  puts template.expand({
    scheme: "https",
    host: "host.com",
    path: "abcdefghijklmnopqrstuvwxyz",
  })
}

puts "expand with one part and empty query"
puts Benchmark.measure {
  puts template.expand({
    scheme: "https",
    host: "host.com",
    path: "abcdefghijklmnopqrstuvwxyz",
    query: { a: nil },
  })
}

# ❥ ruby benchmark.rb
# With one part
#   0.000383   0.000027   0.000410 (  0.000407)
# With one part and empty query
#  12.595742   0.006677  12.602419 ( 12.612499)
# Expand with one part
#   0.000219   0.000003   0.000222 (  0.000223)
# Expand with one part and empty query
#   0.000177   0.000011   0.000188 (  0.000188)


puts "\n\n\n\n"
template = Addressable::Template.new("{scheme}://{host}{/path}{?query*}")
puts "*" * 30
puts "path without expansion - {scheme}://{host}{/path}{?query*}"
puts "*" * 30

puts "with one part"
puts Benchmark.measure {
  puts template.extract("http://host.com/abcdefghijklmnopqrstuvwxyz")
}

puts "with one part and empty query"
puts Benchmark.measure {
  puts template.extract("http://host.com/abcdefghijklmnopqrstuvwxyz?a")
}

puts "expand with one part"
puts Benchmark.measure {
  puts template.expand({
    scheme: "https",
    host: "host.com",
    path: "abcdefghijklmnopqrstuvwxyz",
  })
}

puts "expand with one part and empty query"
puts Benchmark.measure {
  puts template.expand({
    scheme: "https",
    host: "host.com",
    path: "abcdefghijklmnopqrstuvwxyz",
    query: { a: nil },
  })
}

# ❥ ruby benchmark.rb
# With one part
#   0.000383   0.000027   0.000410 (  0.000407)
# With one part and empty query
#  12.595742   0.006677  12.602419 ( 12.612499)
# Expand with one part
#   0.000219   0.000003   0.000222 (  0.000223)
# Expand with one part and empty query
#   0.000177   0.000011   0.000188 (  0.000188)


##
# When using the variable path the following regexp is generated
# EXPANSION_REGEXP_WITH_VARIABLE_PATH = /(?-mix:^(?:|(?<scheme>(?:[a-zA-Z0-9\-\.\_\~]|%[a-fA-F0-9][a-fA-F0-9])*?)?):\/\/(?:|(?<host>(?:[a-zA-Z0-9\-\.\_\~]|%[a-fA-F0-9][a-fA-F0-9])*?)?)(?:|\/(?<path>(?:[a-zA-Z0-9\-\.\_\~]|%[a-fA-F0-9][a-fA-F0-9])*?(?:\/?(?:[a-zA-Z0-9\-\.\_\~]|%[a-fA-F0-9][a-fA-F0-9])*?)*)?)(?:|\?(?<query>(?:[a-zA-Z0-9\-\.\_\~]|%[a-fA-F0-9][a-fA-F0-9])*=(?:[a-zA-Z0-9\-\.\_\~]|%[a-fA-F0-9][a-fA-F0-9])*?(?:&?(?:[a-zA-Z0-9\-\.\_\~]|%[a-fA-F0-9][a-fA-F0-9])*=(?:[a-zA-Z0-9\-\.\_\~]|%[a-fA-F0-9][a-fA-F0-9])*?)*)?)$)/
#
# When we don't use a variable path the following regexp is generated
# EXPANSION_REGEXP_WITHOUT_VARIABLE_PATH = /(?-mix:^(?:|(?<scheme>(?:[a-zA-Z0-9\-\.\_\~]|%[a-fA-F0-9][a-fA-F0-9])*?)?):\/\/(?:|(?<host>(?:[a-zA-Z0-9\-\.\_\~]|%[a-fA-F0-9][a-fA-F0-9])*?)?)(?:|\/(?<path>(?:[a-zA-Z0-9\-\.\_\~]|%[a-fA-F0-9][a-fA-F0-9])*?)?)(?:|\?(?<query>(?:[a-zA-Z0-9\-\.\_\~]|%[a-fA-F0-9][a-fA-F0-9])*=(?:[a-zA-Z0-9\-\.\_\~]|%[a-fA-F0-9][a-fA-F0-9])*?(?:&?(?:[a-zA-Z0-9\-\.\_\~]|%[a-fA-F0-9][a-fA-F0-9])*=(?:[a-zA-Z0-9\-\.\_\~]|%[a-fA-F0-9][a-fA-F0-9])*?)*)?)$)/

puts "\n\n\n\n"

EXPANSION_REGEXP_WITH_VARIABLE_PATH = /(?-mix:^(?:|(?<scheme>(?:[a-zA-Z0-9\-\.\_\~]|%[a-fA-F0-9][a-fA-F0-9])*?)?):\/\/(?:|(?<host>(?:[a-zA-Z0-9\-\.\_\~]|%[a-fA-F0-9][a-fA-F0-9])*?)?)(?:|\/(?<path>(?:[a-zA-Z0-9\-\.\_\~]|%[a-fA-F0-9][a-fA-F0-9])*?(?:\/?(?:[a-zA-Z0-9\-\.\_\~]|%[a-fA-F0-9][a-fA-F0-9])*?)*)?)(?:|\?(?<query>(?:[a-zA-Z0-9\-\.\_\~]|%[a-fA-F0-9][a-fA-F0-9])*=(?:[a-zA-Z0-9\-\.\_\~]|%[a-fA-F0-9][a-fA-F0-9])*?(?:&?(?:[a-zA-Z0-9\-\.\_\~]|%[a-fA-F0-9][a-fA-F0-9])*=(?:[a-zA-Z0-9\-\.\_\~]|%[a-fA-F0-9][a-fA-F0-9])*?)*)?)$)/
EXPANSION_REGEXP_WITHOUT_VARIABLE_PATH = /(?-mix:^(?:|(?<scheme>(?:[a-zA-Z0-9\-\.\_\~]|%[a-fA-F0-9][a-fA-F0-9])*?)?):\/\/(?:|(?<host>(?:[a-zA-Z0-9\-\.\_\~]|%[a-fA-F0-9][a-fA-F0-9])*?)?)(?:|\/(?<path>(?:[a-zA-Z0-9\-\.\_\~]|%[a-fA-F0-9][a-fA-F0-9])*?)?)(?:|\?(?<query>(?:[a-zA-Z0-9\-\.\_\~]|%[a-fA-F0-9][a-fA-F0-9])*=(?:[a-zA-Z0-9\-\.\_\~]|%[a-fA-F0-9][a-fA-F0-9])*?(?:&?(?:[a-zA-Z0-9\-\.\_\~]|%[a-fA-F0-9][a-fA-F0-9])*=(?:[a-zA-Z0-9\-\.\_\~]|%[a-fA-F0-9][a-fA-F0-9])*?)*)?)$)/

puts "bad regexp"
puts Benchmark.measure {
  puts "http://host.com/abcdefghijklmnopqrstuvwxyz?a".match(EXPANSION_REGEXP_WITH_VARIABLE_PATH)
}

puts "\n"

puts "better regexp"
puts Benchmark.measure {
  puts "http://host.com/abcdefghijklmnopqrstuvwxyz?a".match(EXPANSION_REGEXP_WITHOUT_VARIABLE_PATH)
}

# bad regexp
#
#  12.920860   0.016003  12.936863 ( 12.948862)
#
# better regexp
#
#   0.000008   0.000002   0.000010 (  0.000010)

Thank you @joshkrueger for helping isolate the issue.

@ebenoist ebenoist changed the title ReDDOS when using a variable path and query during extract ReDOS when using a variable path and query during extract Nov 2, 2019
@olofheurgren
Copy link

olofheurgren commented Jul 7, 2021

I noticed the same thing when extracting a URL with a fragment (#) part where the Template does not include a fragment. Execution time seems to grow exponentially as the path length increases.

EDIT: I saw in CHANGELOG.md that version 2.8.0 fixes a ReDoS vulnerability. However, this issue still occurs on 2.8.0.

Demonstration:

require 'addressable/template'
require 'benchmark'

template = Addressable::Template.new('http://www.example.com{/path*}')

(15..30).each do |path_length|
  path = "a"*path_length
  extract_time = Benchmark.realtime do
    template.extract("http://www.example.com/#{path}#fragment")
  end 
  puts "path length: #{path_length}, extract time: #{'%.2f' % extract_time}s"
end

output:

path length: 15, extract time: 0.01s
path length: 16, extract time: 0.01s
path length: 17, extract time: 0.02s
path length: 18, extract time: 0.03s
path length: 19, extract time: 0.07s
path length: 20, extract time: 0.13s
path length: 21, extract time: 0.26s
path length: 22, extract time: 0.52s
path length: 23, extract time: 1.03s
path length: 24, extract time: 2.07s
path length: 25, extract time: 4.15s
path length: 26, extract time: 8.34s
path length: 27, extract time: 16.49s
path length: 28, extract time: 33.17s
path length: 29, extract time: 75.10s
path length: 30, extract time: 142.61s

@ebenoist
Copy link
Author

Yup, saw the CVE reported today and figured it was this issue. We've removed the need to pass untrusted URIs to this library, but I can also confirm that the issue still exists.

Repository owner deleted a comment from stehorne6 Feb 15, 2023
Repository owner deleted a comment from stehorne6 Feb 15, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants