Bugfix/anyurl escapes percent sign in url #4461
Please add tests for your change to test_build_url_quote_plus.
Thanks @vetedde for this patch 👍
Co-authored-by: Hasan Ramezani <hasan.r67@gmail.com>
@@ -396,6 +396,10 @@ def apply_default_parts(cls, parts: 'Parts') -> 'Parts':

    @classmethod
    def quote(cls, string: str, safe: str = '') -> str:
        pattern = r'^([\w]+|(%\d{2}))+$'
Please compile the pattern and then call match on the compiled pattern.
Take a look at the same approach in datetime_parse.py:
pydantic/pydantic/datetime_parse.py, line 30 in 32ea885:

    date_re = re.compile(f'{date_expr}$')
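As a minimal sketch of what the reviewer is asking for, the pattern from the diff can be compiled once at module level and reused. The names `ESCAPED_OR_WORD_RE`, `ESCAPED_OR_WORD_HEX_RE`, and `looks_already_quoted` are hypothetical and not part of pydantic. The second pattern is an assumed variant, not something proposed in this PR: it is included only to show that `%\d{2}` accepts decimal digits, so a hexadecimal escape such as `%2F` is not recognised by the original pattern.

```python
import re

# The pattern from the diff, compiled once instead of on every call.
ESCAPED_OR_WORD_RE = re.compile(r'^([\w]+|(%\d{2}))+$')

# Assumed variant (not from the PR): percent escapes are two hex digits.
ESCAPED_OR_WORD_HEX_RE = re.compile(r'^(?:\w|%[0-9A-Fa-f]{2})+$')


def looks_already_quoted(string: str, pattern: re.Pattern = ESCAPED_OR_WORD_RE) -> bool:
    """Return True if the string consists only of word characters and %xx escapes."""
    return pattern.match(string) is not None


print(looks_already_quoted('foo%20bar'))                          # decimal escape
print(looks_already_quoted('foo%2Fbar'))                          # hex escape, original pattern
print(looks_already_quoted('foo%2Fbar', ESCAPED_OR_WORD_HEX_RE))  # hex escape, variant
```

With the original pattern, `foo%20bar` is accepted but `foo%2Fbar` is not; the hex variant accepts both.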
Can you add a comment explaining what this regex is doing?
I also think it could be simplified.
The regex checks that the string contains only %xx escapes, or A-Za-z0-9 and _, and nothing else.
The pattern works with these cases: http://regexr.com/6t6g6
@hramezani found a problem with the case "foo%20bar" when quote_plus is True: my solution returns foo%20bar, but foo+bar is needed. I'm thinking of adding an extra check, but it is a nested check and the string still has to be unquoted at some point:

    string_with_precent_code_re = re.compile(r'^([\w]+|(%\d{2}))+$')
    if string_with_precent_code_re.match(string) is not None:
        if cls.quote_plus:
            string = unquote_plus(string)
        else:
            return string
    return quote_plus(string, safe) if cls.quote_plus else quote(string, safe)

Maybe it's better to do it all at once:

    string = unquote_plus(string, safe) if cls.quote_plus else unquote(string, safe)
    return quote_plus(string, safe) if cls.quote_plus else quote(string, safe)

or to always unquote if %xx matches. What do you think?
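The "do it all at once" idea can be sketched as a standalone function, which makes its behaviour easier to check outside the class. This is only an illustration under stated assumptions: `requote` and its `use_plus` flag are hypothetical stand-ins for the classmethod and `cls.quote_plus`. Note that `urllib.parse.unquote` and `unquote_plus` take no `safe` argument (their second positional parameter is `encoding`), so the sketch does not pass one.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus


def requote(string: str, safe: str = '', use_plus: bool = False) -> str:
    """Unquote first, then quote, so existing %xx escapes are not escaped twice.

    `use_plus` is a hypothetical stand-in for `cls.quote_plus`.
    """
    if use_plus:
        return quote_plus(unquote_plus(string), safe)
    return quote(unquote(string), safe)


print(requote('foo%20bar', use_plus=True))  # the case raised by @hramezani
print(requote('foo%20bar'))
print(requote('foo bar'))
```

One trade-off of this approach is that input is always interpreted as possibly escaped: a value such as `100%25` is decoded to `100%` and re-encoded to `100%25`, so a caller who meant the literal text `100%25` cannot express it without escaping the percent sign first.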
@vetedde I think there are 3 possible solutions:
What do you think, @samuelcolvin?
Please have a look at #4469; I think it's a cleaner approach to solving this. I've also added a large number of test cases to at least lock in the behaviour. I'll reply on the original issue with some more context. Thanks so much @vetedde for looking into this. I'll leave this PR open in case it is decided that it's a better approach than #4469.
I think we will still have the problem reported in #4468 even with this patch, which is a breaking change.
Change Summary
Fix re-quoting when building a URL. I added relevant test cases.
Also fixed a mypy error, because without that fix I couldn't create the commit.
Related issue number
Fix #4458
Checklist
changes/<pull request or issue id>-<github username>.md file added describing the change (see changes/README.md for details; you can skip this check if the change does not need a change file).