Unclosed parenthesis in URL causes infinite loop #290

Closed
p-m-k opened this issue Apr 26, 2017 · 1 comment · Fixed by #323

Comments

p-m-k commented Apr 26, 2017

Validating a URL that contains an unclosed parenthesis with the url validator makes deserialization hang in what looks like an infinite loop. Interestingly, it only happens when the unclosed parenthesis is followed by many characters (compare the third and fourth test cases below).

from colander import MappingSchema, SchemaNode, Str, url


class MySchema(MappingSchema):
    url = SchemaNode(Str(encoding='utf-8'), validator=url)

print(MySchema().deserialize({"url": "http://www.mysite.com/tttttttttttttttttttttt.jpg"}))  # works
print(MySchema().deserialize({"url": "http://www.mysite.com/(tttttttttttttttttttttt).jpg"}))  # works
print(MySchema().deserialize({"url": "http://www.mysite.com/(ttttttttttt.jpg"}))  # works
print(MySchema().deserialize({"url": "http://www.mysite.com/(tttttttttttttttttttttt.jpg"}))  # hangs

The same regex also fails in an online regex checker (https://regex101.com/). Try the regex below, which colander uses for URL validation. It is taken from colander/__init__.py:438; I only escaped two slashes here.

(?i)\b((?:[a-z][\w-]+:(?:\/{1,3}|[a-z0-9%])|www\d{0,3}[.]|[a-z0-9.\-]+[.][a-z]{2,4}\/)(?:[^\s()<>]+|\(([^\s()<>]+|(\([^\s()<>]+\)))*\))+(?:\(([^\s()<>]+|(\([^\s()<>]+\)))*\)|[^\s`!()\[\]{};:'".,<>?«»“”‘’]))

Use this URL: http://www.mysite.com/(tttttttttttttttttttttt.jpg and you'll get catastrophic backtracking. You can use the debugger on that site to see which group gets stuck.
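
To make the backtracking visible without hanging the interpreter, here is a minimal sketch that times only the parenthesised group from the regex above against short unclosed inputs. The names NESTED and time_unclosed are made up for this illustration and are not part of colander. The timings should grow steeply as characters are added, which would explain why the short test case returns quickly while the long one appears to loop forever.

```python
import re
import time

# Reduced form of the parenthesised group from the URL regex quoted above.
# The "+" nested inside "*" lets the engine split the same run of characters
# in many different ways before it can conclude that no ")" follows.
NESTED = re.compile(r"\(([^\s()<>]+|(\([^\s()<>]+\)))*\)")


def time_unclosed(n):
    """Time a failing match against "(" followed by n characters."""
    text = "(" + "t" * n  # no closing parenthesis, so the match must fail
    start = time.perf_counter()
    result = NESTED.match(text)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    # Kept deliberately small; larger n gets slow very quickly.
    for n in range(10, 21, 2):
        result, elapsed = time_unclosed(n)
        print(n, result, "%.4f s" % elapsed)
```

Keep n small when trying this: the input from the fourth test case has 26 characters after the parenthesis.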

mmerickel (Member) commented Apr 28, 2017

We should switch this to just use urlparse and confirm that the required parts, such as scheme/host/path, exist.

Does anyone have better advice?
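
A rough sketch of what the urlparse suggestion could look like, assuming Python 3 (where urlparse lives in urllib.parse). The helper name looks_like_url is hypothetical, and this is not necessarily what the eventual fix does; it only checks that a scheme and a host are present.

```python
from urllib.parse import urlparse


def looks_like_url(value):
    """Hypothetical check: True if value has both a scheme and a host."""
    try:
        parts = urlparse(value)
    except ValueError:
        # urlparse rejects some malformed inputs, e.g. a broken IPv6 host.
        return False
    return bool(parts.scheme and parts.netloc)


if __name__ == "__main__":
    # The URL from the report: no regex involved, so no backtracking.
    print(looks_like_url("http://www.mysite.com/(tttttttttttttttttttttt.jpg"))
    print(looks_like_url("not a url"))
```

This is far more permissive than the regex (it accepts the unclosed parenthesis), so whether a path should also be required is a separate decision. A function like this could presumably be plugged into a schema through colander's Function validator.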
