Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow multiple segment converters to have tailing segments #2061

Open
CaselIT opened this issue May 7, 2022 · 12 comments
Open

Allow multiple segment converters to have tailing segments #2061

CaselIT opened this issue May 7, 2022 · 12 comments

Comments

@CaselIT
Copy link
Member

CaselIT commented May 7, 2022

The initial version of the converters that support matching multiple paths added in #1945 prevents adding additional segments after the path, meaning that something like /foo/{x:path}/bar is not supported.

Adding support to it is not trivial. A prototype find would be something like this:
(edit: see reply #2061 (comment) regarding how to prevent empty matches)

from dataclasses import dataclass, field
from itertools import count
from falcon.routing import CompiledRouter
from falcon.routing.converters import PathConverter

seq = count(1)

@dataclass
class Tmp:
    id: int = field(default_factory=seq.__next__)

    def on_get(self, req, resp, **kw):
        resp.media = {'id': self.id, 'kw': kw}

r = CompiledRouter()

def add(url, res):
    print(res, url.replace('{x}', '{x:path}'))
    r.add_route(url, res)

add('/foo/{x}', Tmp())
add('/foo/{x}/foo', Tmp())
add('/foo/{x}/{baz:int}', Tmp())
add('/foo/{x}/foo/bar', Tmp())
add('/foo/{x}/{baz:int}/baz', Tmp())

r._compile()
# print(r.finder_src)

def find(path, return_values, patterns, converters, params):
    converters = [PathConverter(), *converters]
    rv = {r.resource.id: r for r in return_values}
    path_len = len(path)
    if path_len > 0:
        if path[0] == 'foo':
            if path_len > 1:
                remaining_path = path[1:]
                remaining_path_len = len(remaining_path)
                if remaining_path_len > 1:
                    if remaining_path[-1] == 'bar':
                        if remaining_path[-2] == 'foo':
                            fragment = remaining_path[:-2]
                            field_value_1 = converters[0].convert(fragment)
                            if field_value_1 is not None:
                                params['x'] = field_value_1
                                return rv[4]
                    if remaining_path[-1] == 'baz':
                        fragment = remaining_path[-2]
                        field_value_2 = converters[1].convert(fragment)
                        if field_value_2 is not None:
                            fragment = remaining_path[:-2]
                            field_value_1 = converters[0].convert(fragment)
                            if field_value_1 is not None:
                                params['baz'] = field_value_2
                                params['x'] = field_value_1
                                return rv[5]
                if remaining_path_len > 0:
                    if remaining_path[-1] == 'foo':
                        fragment = remaining_path[:-1]
                        field_value_1 = converters[0].convert(fragment)
                        if field_value_1 is not None:
                            params['x'] = field_value_1
                            return rv[2]
                    fragment = remaining_path[-1]
                    field_value_2 = converters[1].convert(fragment)
                    if field_value_2 is not None:
                        fragment = remaining_path[:-1]
                        field_value_1 = converters[0].convert(fragment)
                        if field_value_1 is not None:
                            params['baz'] = field_value_2
                            params['x'] = field_value_1
                            return rv[3]
                field_value_1 = converters[0].convert(remaining_path)
                if field_value_1 is not None:
                    params['x'] = field_value_1
                    return rv[1]
                return None
            return None
        return None
    return None

r._find = find

def go(url):
    res = r.find(url)
    if res is None:
        print(url, res)
    resource, _, param, _ = res
    print(resource, param, url)

print('-' * 20)
go('/foo/foo')
go('/foo/bar')
go('/foo/bar/123')
go('/foo/bar/123/x')
go('/foo/bar/123/x/foo')
go('/foo/foo/bar/123/x/foo')
go('/foo/bar/123/x/foo/bar')
go('/foo/123/baz')
go('/foo/bar/123/baz')
go('/foo/bar/123/x/foo/bar/123/baz')

running the above yields

Tmp(id=1) /foo/{x:path}
Tmp(id=2) /foo/{x:path}/foo
Tmp(id=3) /foo/{x:path}/{baz:int}
Tmp(id=4) /foo/{x:path}/foo/bar
Tmp(id=5) /foo/{x:path}/{baz:int}/baz
--------------------
Tmp(id=2) {'x': ''} /foo/foo
Tmp(id=1) {'x': 'bar'} /foo/bar
Tmp(id=3) {'baz': 123, 'x': 'bar'} /foo/bar/123
Tmp(id=1) {'x': 'bar/123/x'} /foo/bar/123/x
Tmp(id=2) {'x': 'bar/123/x'} /foo/bar/123/x/foo
Tmp(id=2) {'x': 'foo/bar/123/x'} /foo/foo/bar/123/x/foo
Tmp(id=4) {'x': 'bar/123/x'} /foo/bar/123/x/foo/bar
Tmp(id=5) {'baz': 123, 'x': ''} /foo/123/baz
Tmp(id=5) {'baz': 123, 'x': 'bar'} /foo/bar/123/baz
Tmp(id=5) {'baz': 123, 'x': 'bar/123/x/foo/bar'} /foo/bar/123/x/foo/bar/123/baz

A decision to take would be what to do regarding supporting multiple "consume multiple segments" in a single url.
Like what would /foo/{bar:path}/{baz:path} match in case of /foo/bar/baz?
I think the issue would be there also if they two converter are separated by a literal segment or normal converted, since this template
/foo/{bar:path}/x/{baz:path} would still be ambiguous for a route like /foo/bar/x/y/x/other/part

ref: #1945 (comment)

@CaselIT CaselIT changed the title Allow multiple segment converter to have tailing segments Allow multiple segment converters to have tailing segments May 7, 2022
@vytas7
Copy link
Member

vytas7 commented May 7, 2022

One unambiguous way slightly extending your prototype could be still supporting only one field of type path, but it could be placed anywhere in the template. Like /{foo:uuid}/{bar:path}/baz/{peb}/{cak}.

@CaselIT
Copy link
Member Author

CaselIT commented May 7, 2022

I personally think this should be a follow on PR, since it's not trivial to support a case like your example

@vytas7
Copy link
Member

vytas7 commented May 7, 2022

Yes, absolutely, I meant splitting or simplifying this issue, not by further complicating #1945.

@CaselIT
Copy link
Member Author

CaselIT commented May 7, 2022

Oh ok, I misunderstood.

I don't feel supporting two or more multiple segment converter is that much harder than supporting tailing segments.
In any case, even if we decide that the fist (or last) one "wins" I'm currently of the option that we should not allow them since it may be confusing on the users. I guess the main issue is the that path allows everything, if we do end up having a repath that would probably make more sense?

Still I agree with you that we could go one step at the time, and have fist a version that does not support multiple multi-segment converters

@CaselIT
Copy link
Member Author

CaselIT commented May 10, 2022

copying from the PR:

@kgriffs

There is a decision to take here: should a "consume multiple segments" converter be allowed to match 0 segments? the current implementation allows it.

Personally I would expect it to never match if there is no path to consume, because otherwise what's the point of using the path converter? Or at least, if there is another route registered for the exact path prefix, that one would take precedence over the route containing the path converter.

One could argue it my be useful to be able to handle /foo/bar/bang and /foo/bang` within the same resource controller, but that doesn't feel like a good API design to me (though it may be useful for re-implementing a legacy API?). So perhaps we could introduce an option to allow 0 segments but it is disabled by default? Or we introduce that along with #2061 ?


@vytas7
If I understand the problem right, IMHO, for the sake of consistency, the new converter should behave like fields behave now. They do match an empty segment after a trailing slash, and they do not otherwise.

Given the following test app,

import falcon

class Resource:
    def on_get(self, req, resp, resourceid):
        resp.media = {'path': req.path, 'resourceid': resourceid}

(app := falcon.App()).add_route('/api/{resourceid}', Resource())

Cf

$ xh http://localhost:8000/api/
HTTP/1.1 200 OK
Content-Length: 35
Content-Type: application/json

{
    "path": "/api/",
    "resourceid": ""
}

vs

$ xh http://localhost:8000/api
HTTP/1.1 404 Not Found
Content-Length: 26
Content-Type: application/json
Vary: Accept

{
    "title": "404 Not Found"
}

@CaselIT
Also note that my question was only once we add support for additional parts after the path. Basically the question what if
/foo/{bar:path}/baz should match /foo/baz or not.


@vytas7
Aha, right, I didn't realize the question was only about when we lift the requirement for a path segment to be last. Then maybe we don't need to make a decision right now?

In any case, I would perceive routing as strange and confusing if /foo/{bar:path}/baz matched /foo/baz, it wouldn't be consistent with the current behaviour where a field essentially needs a preceding / to match, and in this case the path field would rob /baz of that slash. So my vote is on "it shouldn't match".

@CaselIT
Copy link
Member Author

CaselIT commented May 10, 2022

Aha, right, I didn't realize the question was only about when we lift the requirement for a path segment to be last. Then maybe we don't need to make a decision right now?

In any case, I would perceive routing as strange and confusing if /foo/{bar:path}/baz matched /foo/baz, it wouldn't be consistent with the current behaviour where a field essentially needs a preceding / to match, and in this case the path field would rob /baz of that slash. So my vote is on "it shouldn't match".

The issue with shouldn't match is that /foo//baz should by the rule above, and it seems strange.

@vytas7
Copy link
Member

vytas7 commented May 10, 2022

/foo//baz theoretically should match, yes, but IIRC we explicitly disallow two slashes in a row.

@CaselIT
Copy link
Member Author

CaselIT commented May 10, 2022

note that this is the current behaviour:
/foo/{bar}/baz matches /foo//baz but not /foo/baz

@vytas7
Copy link
Member

vytas7 commented May 10, 2022

Aha, it does... that's maybe oversight too, we do disallow a double slash in the templates at least, but maybe not when matching...

Well, in either case we should follow the current behaviour, and IMHO /foo/{bar:converter}/baz SHOULD NOT match /foo/baz REGARDLESS of the converter's type. But just IMHO 🙂

@CaselIT
Copy link
Member Author

CaselIT commented May 10, 2022

I guess that in that sense we should keep this behavior, so the code above would not work. We probably need to check that remaining_path_len has at lest 1+ other segments, so that there is always one segment for the path converter

@CaselIT
Copy link
Member Author

CaselIT commented May 10, 2022

yes, replacing remaining_path_len > 1: with remaining_path_len > 2: and remaining_path_len > 0 with remaining_path_len > 1 produce the appropriate behaviour

@CaselIT
Copy link
Member Author

CaselIT commented May 10, 2022

Well, in either case we should follow the current behaviour, and IMHO /foo/{bar:converter}/baz SHOULD NOT match /foo/baz REGARDLESS of the converter's type. But just IMHO 🙂

I tend to agree, considering it's the current behaviour of the framework and changing that would entail quite a bit more work, since it's not easy to communicate. It may be unfortunate that // does produce different result, but it's still fine for me.

Should we document it?

@vytas7 vytas7 added this to the Triaged (Non-Breaking Changes) milestone Jan 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants