Fix url parsing of ipv6 urls on `URL.replace` #1965

kousikmitra · 2022-11-25T15:17:45Z

The starting point for contributions should usually be a discussion

Simple documentation typos may be raised as stand-alone pull requests, but otherwise please ensure you've discussed your proposal prior to issuing a pull request.

This will help us direct work appropriately, and ensure that any suggested changes have been okayed by the maintainers.

Initially raised as discussion #...

dimaqq · 2022-11-28T23:56:52Z

starlette/datastructures.py

+            if not hostname:
+                netloc = self.netloc
+                _, _, hostname = netloc.rpartition("@")
+
+                if hostname[-1] != "]":
+                    hostname = hostname.rsplit(":", 1)[0]


I guess the new approach breaks the existing undocumented, untested functionality...

May I propose a test like this:

assert URL("http://u:p@host/").replace(hostname="bar") == URL("http://u:p@bar/")

Or maybe like this:

assert URL("http://u:p@host:80").replace(port=88) == URL("http://u:p@host:88")

added, these are passing

dimaqq

Would it be better to use urlunparse or urlunsplit from python stdlib instead of manual string concatention?

kousikmitra · 2022-11-29T20:13:22Z

Would it be better to use urlunparse or urlunsplit from python stdlib instead of manual string concatention?

@dimaqq not sure how you wanted to use urlunsplit here! urlunsplit won't join netloc fields. can you explain more?

dimaqq · 2022-12-05T01:32:16Z

@kousikmitra
Merge branch 'master' into fix/ipv6-url-bug

Instead of the above, I would generally recommend to rebase your branch on top of master and then force-push your branch.
I don't know what git flow this particular project uses though.

adriangb · 2022-12-05T02:24:55Z

We squash merge into master, so it's not really important how the branch is kept up to date with master or how the commits on the branch are organized. Of course a nice linear history with well described atomic commits is a huge plus as a reviewer, but we certainly don't require it.

kousikmitra · 2022-12-05T17:37:34Z

@kousikmitra
Merge branch 'master' into fix/ipv6-url-bug

Instead of the above, I would generally recommend to rebase your branch on top of master and then force-push your branch. I don't know what git flow this particular project uses though.

I prefer the same, my bad I didn't notice github used merge method to update branch.

dimaqq · 2022-12-06T07:18:56Z

my 2c: it's best to disable GitHub's magic "update the branch" in repo settings.
anyway I'm not the maintainer, I don't get to make these decisions.

dimaqq

Looks good and solves the problem that I had.

All of these work fine:

In [8]: URL("http://[1::fe]/123").replace(port=88, username="uu", password="pp").replace(port=None, username=None, password=None)
Out[8]: URL('http://[1::fe]/123')

There's still one gotcha, but I suspect this also didn't really work before either, the IPv6 hostname needs to be bracketed, which is a bit unexpected:

In [14]: URL("http://foo.com/123").replace(hostname="fe::1")
Out[14]: URL('http://fe::1/123')

In [15]: URL("http://foo.com/123").replace(hostname="[fe::1]")
Out[15]: URL('http://[fe::1]/123')

Which also leads, in one case to an exception when using IPv6 shorthands:

In [18]: URL("http://foo.com/123").replace(hostname="1::23").replace(port=88)
ValueError: Port could not be cast to integer value as ':23'

This path triggered same error before.

So, strictly speaking, this PR fixes #1931 and doesn't introduce a regression, I'd say let's merge it!

dimaqq · 2022-12-07T05:01:09Z

P.S. by comparison, other libraries take care of bracketing ipv6 addresses themselves, they don't ask the caller to do it:

In [7]: yarl.URL("http://foo.com/1").with_host("123::fe")
Out[7]: URL('http://[123::fe]/1')

kousikmitra · 2022-12-07T06:13:41Z

There's still one gotcha, but I suspect this also didn't really work before either, the IPv6 hostname needs to be bracketed, which is a bit unexpected:

Which also leads, in one case to an exception when using IPv6 shorthands:

To tackle this we should not be relying on urlsplit hostname instead we can infer hostname, port from netloc.

dimaqq · 2022-12-07T08:20:55Z

🤔 I was under the impression that urlsplit does the right thing; and this library doesn't.

my 2c: .update(hostname=...) value needs to be examined, if it contains a colon, then wrap it in square brackets.
Edit: and maybe the inner logic of splitting the netloc....

Anyway, that can be a separate PR.
Edit: perhaps maintainers would like a discussion on hacking URLs in this data structure (extra code and maintenance) vs. adopting a 3p URL library (an extra dep).

florimondmanca

LGTM

starlette/datastructures.py

Co-authored-by: Florimond Manca <florimond.manca@protonmail.com>

kousikmitra changed the title ~~Fix url parsing of ipv6 urls #1931~~ #1931 Fix url parsing of ipv6 urls Nov 25, 2022

kousikmitra mentioned this pull request Nov 25, 2022

IPv6 handling bug in starlette.datastructures.URL #1931

Closed

kousikmitra force-pushed the fix/ipv6-url-bug branch 2 times, most recently from 49d5c0a to f248c0c Compare November 25, 2022 15:51

dimaqq reviewed Nov 28, 2022

View reviewed changes

dimaqq reviewed Nov 29, 2022

View reviewed changes

kousikmitra force-pushed the fix/ipv6-url-bug branch from f248c0c to fdeaca4 Compare November 29, 2022 20:03

Kludex added this to the Version 0.23.0 milestone Dec 2, 2022

Fix url parsing of ipv6 urls encode#1931

aedcdc8

kousikmitra force-pushed the fix/ipv6-url-bug branch from 85464a9 to aedcdc8 Compare December 5, 2022 17:36

dimaqq approved these changes Dec 7, 2022

View reviewed changes

Kludex modified the milestones: Version 0.23.0, Version 0.24.0 Dec 8, 2022

Kludex mentioned this pull request Dec 12, 2022

Version 0.24.0 #1983

Merged

7 tasks

florimondmanca approved these changes Feb 5, 2023

View reviewed changes

starlette/datastructures.py Outdated Show resolved Hide resolved

Update starlette/datastructures.py

4e15d59

Co-authored-by: Florimond Manca <florimond.manca@protonmail.com>

Kludex changed the title ~~#1931 Fix url parsing of ipv6 urls~~ Fix url parsing of ipv6 urls on URL.replace Feb 6, 2023

Kludex enabled auto-merge (squash) February 6, 2023 05:40

Merge branch 'master' into fix/ipv6-url-bug

032d171

Kludex merged commit 94a22b8 into encode:master Feb 6, 2023

kousikmitra deleted the fix/ipv6-url-bug branch February 6, 2023 05:58

aminalaee pushed a commit that referenced this pull request Feb 13, 2023

Fix url parsing of ipv6 urls on URL.replace (#1965)

67050d5

Co-authored-by: Florimond Manca <florimond.manca@protonmail.com>

sync-by-unito bot mentioned this pull request Aug 28, 2023

Bump starlette from 0.22.0 to 0.25.0 toddbirchard/blog-webhook-api#294

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix url parsing of ipv6 urls on `URL.replace` #1965

Fix url parsing of ipv6 urls on `URL.replace` #1965

kousikmitra commented Nov 25, 2022

dimaqq Nov 28, 2022

dimaqq Nov 28, 2022

kousikmitra Nov 29, 2022

dimaqq left a comment

kousikmitra commented Nov 29, 2022 •

edited

dimaqq commented Dec 5, 2022 •

edited

adriangb commented Dec 5, 2022

kousikmitra commented Dec 5, 2022

dimaqq commented Dec 6, 2022

dimaqq left a comment

dimaqq commented Dec 7, 2022

kousikmitra commented Dec 7, 2022

dimaqq commented Dec 7, 2022 •

edited

florimondmanca left a comment

Fix url parsing of ipv6 urls on URL.replace #1965

Fix url parsing of ipv6 urls on URL.replace #1965

Conversation

kousikmitra commented Nov 25, 2022

dimaqq Nov 28, 2022

Choose a reason for hiding this comment

dimaqq Nov 28, 2022

Choose a reason for hiding this comment

kousikmitra Nov 29, 2022

Choose a reason for hiding this comment

dimaqq left a comment

Choose a reason for hiding this comment

kousikmitra commented Nov 29, 2022 • edited

dimaqq commented Dec 5, 2022 • edited

adriangb commented Dec 5, 2022

kousikmitra commented Dec 5, 2022

dimaqq commented Dec 6, 2022

dimaqq left a comment

Choose a reason for hiding this comment

dimaqq commented Dec 7, 2022

kousikmitra commented Dec 7, 2022

dimaqq commented Dec 7, 2022 • edited

florimondmanca left a comment

Choose a reason for hiding this comment

Fix url parsing of ipv6 urls on `URL.replace` #1965

Fix url parsing of ipv6 urls on `URL.replace` #1965

kousikmitra commented Nov 29, 2022 •

edited

dimaqq commented Dec 5, 2022 •

edited

dimaqq commented Dec 7, 2022 •

edited