Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix AnyUrl.build doesn't do percent encoding (#3061) #4224

Merged
merged 10 commits into from Aug 19, 2022
Merged

Fix AnyUrl.build doesn't do percent encoding (#3061) #4224

merged 10 commits into from Aug 19, 2022

Conversation

faresbakhit
Copy link
Contributor

@faresbakhit faresbakhit commented Jul 10, 2022

Change Summary

  • Add percent_encode function to pydantic.utils
  • Add tests for percent_encode
  • Fix AnyUrl.build
  • Add new tests for AnyUrl.build

Related issue number

fix #3061

Checklist

  • Unit tests for the changes exist
  • Tests pass on CI and coverage remains at 100%
  • Documentation reflects the changes where applicable, not applicable
  • changes/<pull request or issue id>-<github username>.md file added describing change
  • My PR is ready to review

@faresbakhit
Copy link
Contributor Author

please review

Copy link
Member

@samuelcolvin samuelcolvin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good in principle, can we get some benchmarks to see how much this slows down validation

pydantic/utils.py Outdated Show resolved Hide resolved
pydantic/utils.py Outdated Show resolved Hide resolved
@samuelcolvin
Copy link
Member

please update.

@faresbakhit
Copy link
Contributor Author

Looks good in principle, can we get some benchmarks to see how much this slows down validation

I'm not sure how this affects validation or how to benchmark pydantic's validation, can you assist me on this?

@samuelcolvin
Copy link
Member

Thanks so much for the benchmarks, I think we should go with from urllib.parse import quote_plus for the following reasons:

  • keep it simple, minimise our own logic
  • I would guess most URLs will be short enough that's quote_plus will be marginally quicker

I'm not sure how this affects validation or how to benchmark pydantic's validation, can you assist me on this?

I think we can skip this, we have to get this right, so there's no option of not quote encoding.

The real question long terms is whether we should move this logic to rust, but provided the API doesn't change we could do that in a minor release after V2.

@faresbakhit
Copy link
Contributor Author

faresbakhit commented Aug 13, 2022

Okay, I removed pydantic.utils.percent_encode in the latest commit.

The real question long terms is whether we should move this logic to rust, but provided the API doesn't change we could do that in a minor release after V2.

I think this should be moved to rust since urllib.parse is just pure python.

I will drop some benchmarks later for urllib.prase.quote against percent-encoding and urlencoding

Edit:

Here's the benchmark, code:

Python vs Rust percent encoding benchmark

This is a 12%ish increase

Copy link
Member

@samuelcolvin samuelcolvin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good, my main concerns are:

  • this could in theory be considered a breaking change, I think it's fine because it's pretty clear than without this the built URLs are "wrong", @PrettyWood do you agree?
  • there's no way to set quote_plus on a URL field, other than stricturl - maybe that's fine?

please can you add to the docs:

  • An example of this in action
  • an example with stricturl of using quote_plus=False (or whatever the non-default is)
  • a "note" admonition saying "percent encoding was added in V1.10 ..."

Please update.

@@ -153,6 +154,7 @@ def __init__(
path: Optional[str] = None,
query: Optional[str] = None,
fragment: Optional[str] = None,
plus: bool = False,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
plus: bool = False,
quote_plus: bool = False,

might be a clearer name.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, is there a good reason not to have quote_plus = True as the default?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add quote_plus to stricturl

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, is there a good reason not to have quote_plus = True as the default?

Just conforming to the standard (RFC 3986)

changes/3061-FaresAhmedb.md Outdated Show resolved Hide resolved
pydantic/networks.py Outdated Show resolved Hide resolved
@samuelcolvin
Copy link
Member

Also, @FaresAhmedb please remember to use the magic comment "please review" (with one space) to request a review, otherwise we might miss updates on the PR.

Fares Ahmed and others added 3 commits August 15, 2022 12:35
@faresbakhit
Copy link
Contributor Author

please review

Copy link
Member

@samuelcolvin samuelcolvin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good, just a few small things here, but you need to fix linting in the docs so build succeeds and we can check the docs look right.

Please update.

docs/usage/types.md Outdated Show resolved Hide resolved
docs/usage/types.md Outdated Show resolved Hide resolved
faresbakhit and others added 2 commits August 17, 2022 04:13
Co-authored-by: Samuel Colvin <samcolvin@gmail.com>
Co-authored-by: Samuel Colvin <samcolvin@gmail.com>
@faresbakhit
Copy link
Contributor Author

please review

@samuelcolvin samuelcolvin enabled auto-merge (squash) August 19, 2022 09:02
@samuelcolvin
Copy link
Member

Thanks so much.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

*Url.build functions doesn't percent-encode arguments
4 participants