Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Path cannot add or init integers, only strings and other Paths #146

Closed
Tracked by #156
MestreLion opened this issue Oct 21, 2021 · 2 comments
Closed
Tracked by #156

Path cannot add or init integers, only strings and other Paths #146

MestreLion opened this issue Oct 21, 2021 · 2 comments

Comments

@MestreLion
Copy link
Contributor

Path.__add__() (and __radd__), as well as __init__(), accepts str and Path, but not int. Considering it can handle ints just fine in subscriptors, why not add support for + too?

Currently we have:

Path("a")["b"]["c"] == Path("a.b.c")
"a" + Path("b") + "c" == Path("a.b.c")
Path("a")[1]["c"] == Path("a[1].c")
Path("[0]")[1]["c"] == Path("[0][1].c")

It could also allow the following:

"a" + Path("[1]") + "b" + 2 == Path("a[1].b[2]")
0 + Path("b") + 1 == Path("[0].b[1]")
0 + Path()[1] + 2 == Path("[0][1][2]")
"a" + Path(1)["b"] + Path(3)[4] + 5 + "c" == Path("a[1].b[3][4][5].c")

I could send a PR if you approve the idea

@vberlier
Copy link
Owner

The idea is that path objects and strings representing path objects should be interchangeable. When you instantiate or concatenate a path with a string, the string represents a full nbt path, not an individual key, the string actually gets parsed into a path object. But when you index into a path you extend it by an individual key. That's why you can use numbers too, the string you use to subscript is a key. Path + str and Path[str] are not equivalent.

>>> Path("a.b") + Path("c.d")
Path('a.b.c.d')
>>> Path("a.b") + "c.d"
Path('a.b.c.d')
>>> Path("a.b")["c.d"]
Path('a.b."c.d"')

To me it makes sense to keep the meaning clear. We can magically make it work like you suggested but it will lead to inconsistencies.

@MestreLion
Copy link
Contributor Author

I really *love Path's design and features, it's undoubtedly one of the most amazing features of nbtlib. But I don't see by your example and explanation why wouldn't int fit that design or create any inconsistency.

If I understood correctly, the design principles you mentioned are:

  • Path(x) constructor takes a full NBT path x, represented by a string or Path.
  • path + x extends a path by concatenating it with another full NBT path x, represented by a string or Path.
  • path[x] indexing appends a single key (or index) x, represented by a string (or int)

Is that correct?

  • What about indexing using another full NBT path? Currently we have Path("a.b")[Path("c.d")] == Path("a.b.c.d"). But anyway, that's not my point.

My question is: what inconsistency would emerge from allowing an int, say 1, to be treated like "[1]"? Sure, an int is only able to represent a single component, and always an index (not a key). So what? That's a limitation of int, and Path shouldn't care. Just like any single component is accepted as a full path, be it "a", "[1]", or even "", so could an int.

My proposal is to handle integers the same way you handle "[x]", no more, no less. You can even convert 1 to "[1]" upfront in __new__, __add__ and __radd__, and let it be parsed as a string.

I believe such approach would not break the API in any way, nor create any inconsistency. The key here (no pun intended) is that an int is always an index

MestreLion added a commit to MestreLion/nbtlib that referenced this issue Oct 23, 2021
As an `int` can only represent a single index, we can use handle `P + int` the same way we handle `P[int]`:
`Path(...) + 1 == Path(...)[1]`
`1 + Path(...) == Path()[1][Path(...)]`

An alternate implementation would be to convert `1` to `"[1]"` and let it parse as a string.

Corresponding tests will be updated in the folowing commits.

The case for `Path(int)` is independent of this and is not addressed in this commit.
MestreLion added a commit to MestreLion/nbtlib that referenced this issue Oct 24, 2021
This effectively treats an integer `x` as if it were the string `"[x]"`, allowing
`Path(x)` and `x + Path(...) + x` in addition to current `Path(...)[x]`.

There are several possible implementation approaches, including a straightforward
`if isinstance(x, int): x = f"[{int(x)}]"` in __new__, __add__ and __radd__,
then letting it parse as an actual string.

However, not only that approach is very inneficient, as it calls the parser and
tokenization for a known outcome, but also this was not the approach taken by
__getitem__, currently the only method that can be used as reference on how to
properly handle integers.

It is also important be consistent with the curent _semantics_ of `Path(x)` and
`Path + x`, as they're different from `Path[x]` for strings and Paths, so do
not converge to using `Path[x]` for both. Even if for integers it ends up being
the same result, as integers can only be a single index, do not rely on that
and keep semantics distinct and consistent with the string code path.

So in __new__, instead of resorting to a `return cls()[x]` shortcut, we call the
same classmethod as str and Path do, `cls.from_accessors()`, passing the same
argument as __getitem__ uses: a tuple with a single element `ListIndex(index=x)`,
which is also, not coincidentaly, the same result of parsing `"[x]"`.

As for __add__/__radd__, again we use the same semantics as used by str:
`Path(...) + x == Path(...)[Path(x)]`, and `x + Path(...) == Path(x)[Path(...)]`
And, as we just defined `Path(x)` for an integer x, we can simply extend the
`isinstance(x, str)` check to include `int`, without creating a new case, thus
enforcing consistent semantics.
MestreLion added a commit to MestreLion/nbtlib that referenced this issue Oct 24, 2021
This effectively treats an integer `x` as if it were the string `"[x]"`, allowing
`Path(x)` and `x + Path(...) + x` in addition to current `Path(...)[x]`.

There are several possible implementation approaches, including a straightforward
`if isinstance(x, int): x = f"[{int(x)}]"` in `__new__`, `__add__` and `__radd__`,
then letting it parse as an actual string.

However, not only that approach is very inefficient, as it calls the parser and
tokenization for a known outcome, but also this was not the approach taken by
`__getitem__`, currently the only method that can be used as reference on how to
properly handle integers.

It is also important be consistent with the current _semantics_ of `Path(x)` and
`Path + x`, as they're different from `Path[x]` for strings and Paths, so do
not converge to using `Path[x]` for both. Even if for integers it ends up being
the same result, as integers can only be a single index, do not rely on that
and keep semantics distinct and consistent with the string code path.

So in `__new__`, instead of resorting to a `return cls()[x]` shortcut, we call the
same classmethod as str and Path do, `cls.from_accessors()`, passing the same
argument as __getitem__ uses: a tuple with a single element `ListIndex(index=x)`,
which is also, not coincidentally, the same result of parsing `"[x]"`.

As for `__add__`/`__radd__`, again we use the same semantics as used by str:
`Path(...) + x == Path(...)[Path(x)]`, and `x + Path(...) == Path(x)[Path(...)]`
And, as we just defined `Path(x)` for an integer x, we can simply extend the
`isinstance(x, str)` check to include `int`, without creating a new case, thus
enforcing consistent semantics.
@vberlier vberlier reopened this Nov 2, 2021
@vberlier vberlier mentioned this issue Nov 2, 2021
9 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants