Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiple -e options behave differently in tox 4.x vs tox 3.x #3199

Open
rhertzog opened this issue Jan 23, 2024 · 18 comments
Open

Multiple -e options behave differently in tox 4.x vs tox 3.x #3199

rhertzog opened this issue Jan 23, 2024 · 18 comments
Labels
help:wanted Issues that have been acknowledged, a solution determined and a PR might likely be accepted.

Comments

@rhertzog
Copy link

Issue

tox -e env1 -e env2 runs the two environments in tox 3.x but runs only env2 in tox 4.x.

I don't know if that change is intentional but it's sure to break quite a few CI that rely on -e being additive rather than overriding a former value. I would suggest to either fix the behaviour to be consistent with 3.x or put some strong warning in the changelog and documentation that one needs to use the comma separated syntax: tox -e env1,env2

Environment

Provide at least:

  • OS: Debian Linux (unstable)
$ tox --version
4.12.0 from /usr/lib/python3/dist-packages/tox/__init__.py

Thank you for maintaining tox! It's a great tool.

@cjwatson
Copy link

I've run into this multiple times in different projects, and it always trips me up. (I'm currently working on the same project as @rhertzog and suggested that he file this issue.)

@webknjaz
Copy link
Contributor

v4 was a huge multi-year rewrite with a lot of breaking changes. I'm pretty sure most are expected. Probably worth mentioning @ https://tox.wiki/en/4.12.1/upgrading.html#updating-usage-with-e though.

@jugmac00
Copy link
Member

I can confirm that this has changed. To me it looks like a regression and I see no obvious reason why we should not re-introduce it for tox 4.

@gaborbernat
Copy link
Member

A PR fixing this would be accepted.

@0cjs
Copy link
Contributor

0cjs commented Jan 25, 2024

I've not dug into the old v3 code, but I'm assuming that it did something along the lines of add_argument('-e', ..., action='append', default=[]). I.e., it collected all of the -e arguments into a list, and that list was used directly as the list of environment names. This would not allow a comma-separated list such as -e foo,bar; that would produce a single envname foo,bar.

The current code I suppose introduced the CliEnv class (session/env_select.py), used as add_argument("-e", ..., type=CliEnv). There are no direct tests or documentation for CliEnv, but I am guessing that CliEnv(s) converts the comma-separated envnames to a list[str] of the envnames.

The issue here is that with the argument value being a single CliEnv value, using -e will wipe out any previous values supplied with -e.

argparse doesn't offer any way to have a type= parameter work on both the current value and any independent previous values for that argument, as far as I can tell, so the only option to fix this seems to be to go back to having the value be a list (or really, a mutable sequence) so that we can use action='append'.

We can't use a list of str here because register_env_select_flags() ( also in session/env_select.py) takes an optional default value for -e that is a CliEnv, and also returns a CliEnv (via type=CliEnv) so post-processing of the arguments is not required.

The first idea I had was to change the value of -e from a CliEnv to a list[CliEnv], and then after parsing has been done, use a function to combine all the CliEnvs together. But that introduces post-processing of the values.

We also need need to preserve the behaviour where the default is replaced by any -e values, if they are specified, but otherwise left alone. As it turns out, argparse can support this under certain circumstances: if instead of default=[something], action='append' we use default=something, action='append', and something is not a sequence, it will be replaced with a list by the first -e, and then subsequent -es will add to that list.

Unfortunately, testing shows this won't work for a default that's a CliEnv because argparse decides that that is a sequence, tries to call append on it, and fails.

But this does give us a workaround, I think: we can make CliEnv mutable and add an append(c) method that takes a CliEnv and appends its contents to itself, dropping any default value it had if it's the first call to append(). Then we can continue to use type=CliEnv, default=CliEnv() (or whatever default the caller provides) and need no changes outside CliEnv and register_env_select_flags().

That may be a bit hacky, but the other approach I can think of involves doing extra processing external to CliEnv and register_env_select_flags() after parse_args() has completed, which spreads the code around more and so is probably even worse.

Anyway, if my idea of mutable CliEnv with append(c) seems reasonable, I'd be willing to look at doing a PR for this.

@gaborbernat
Copy link
Member

Anyway, if my idea of mutable CliEnv with append(c) seems reasonable, I'd be willing to look at doing a PR for this.

I'm fine with this; however, I'd like to disable mutating CliEnv ideally post config session 🤔

@0cjs
Copy link
Contributor

0cjs commented Jan 26, 2024

Yeah, I thought about disabling mutations of CliEnv, but decided that the extra complexity is not worth it. This complexity comes in two areas: first, it makes the class itself more complex because we need further state to determine whether the mutation methods are allowed to be called or not. Second, it adds extra complexity to the clients of the class because they must toggle this state at the appropriate time. And worse yet for the second one, that involves tracking down all the uses of the class and modifying and testing them.

Given that in the new arrangement I suggested above a CliEnv object can be mutated only through the append() method, which we know is currently never called because it doesn't exist, I feel fairly safe in making the class mutable only through that method.

Speaking of, "when in doubt, keep things simple," it also seems to me that CliEnv would be better off simply using split(',') on the constructor argument. Right now it does StrConvert().to(value, of_type=List[str], factory=None), thus relying on the fairly complex (~120 lines) StrConvert class, which in turn is a subclass of the even more complex Convert (~160 lines). Especially since we have no direct tests on CliEnv, tracing through all this code currently the only way to see exactly what all this is doing, which is a lot of error-prone work.

@gaborbernat
Copy link
Member

gaborbernat commented Jan 26, 2024

There's value in keeping the CLI and str parser in sync. Furthermore any kind of backward incompatible change here is out of the question for me personally. Newlines are possible in the CLI, e.g. -e 'a\nb' is valid and supported and I have other examples too. It's not just,.

We have a finalize config for this reason, it should be not difficult to do.

@0cjs
Copy link
Contributor

0cjs commented Jan 26, 2024

For the split(',') thing, I'm not proposing any change to the behaviour of CliEnv. (Or, at least, I think I'm not; the whole reason I can't be sure is because it's wrapped up into interdependencies with a large pile of complex code, the only tests for it also go through other piles of complex code, and the API is undocumented.)

I don't see the value in having CliEnv connected to StrConvert at all. CliEnv appears (see above) to have a pretty simple task to perform, and a) not being affected by changes to other parts of the system will increase its isolation and thus reliability, and b) being a lot more clear about what it does decreases the chances of someone accidentally breaking it.

It could well be that there's subtle behaviour in there that I'm missing. If so, can you write a test that exposes that, so we know what it is and that it won't get broken?

@0cjs
Copy link
Contributor

0cjs commented Jan 26, 2024

I notice that specifying zero-length environment names to the -e option treats them as ignored arguments, e.g., -e ,,py312,,,fix,, happily runs just py312 and fix. Is this intended behaviour, and should it be preserved?

@0cjs
Copy link
Contributor

0cjs commented Jan 26, 2024

@gaborbernat Regarding the formatting of test vectors, I find a lists of these considerably easier to read when they're nicely lined up in columns:

@pytest.mark.parametrize('userinput, envnames', [
    ('a1',                  ('a1',)),
    ('a1,b2,c3',            ('a1', 'b2', 'c3')),
    #   Zero-length environment names are ignored as being not present. It's not clear if this behaviour is intentional.
    (',,a1,,,b2,,',         ('a1', 'b2')),
    (',,',                  ()),
    #   Environment names with "invalid" characters are accepted here; the client is expected to deal with this.
    ('\x01.-@\x02,xxx',     ('\x01.-@\x02', 'xxx')),
    ]
)

However, tox -e fix reformats these to squash the parameters together, so that you need to look more carefully at all the commas and parens to parse it:

@pytest.mark.parametrize(
    ("userinput", "envnames"),
    [
        ("a1", ("a1",)),
        ("a1,b2,c3", ("a1", "b2", "c3")),
        #   Zero-length environment names are ignored as being not present. It's not clear if this behaviour is intentional.
        (",,a1,,,b2,,", ("a1", "b2")),
        (",,", ()),
        #   Environment names with "invalid" characters are accepted here; the client is expected to deal with this.
        ("\x01.-@\x02,xxx", ("\x01.-@\x02", "xxx")),
    ],
)

Is the latter really what you want?

@0cjs
Copy link
Contributor

0cjs commented Jan 26, 2024

We have a finalize config for this reason, it should be not difficult to do.

I just noticed your edit to add this. What is the "finalize config" and how does it work?

@gaborbernat
Copy link
Member

@gaborbernat Regarding the formatting of test vectors, I find a lists of these considerably easier to read when they're nicely lined up in columns:

@pytest.mark.parametrize('userinput, envnames', [
    ('a1',                  ('a1',)),
    ('a1,b2,c3',            ('a1', 'b2', 'c3')),
    #   Zero-length environment names are ignored as being not present. It's not clear if this behaviour is intentional.
    (',,a1,,,b2,,',         ('a1', 'b2')),
    (',,',                  ()),
    #   Environment names with "invalid" characters are accepted here; the client is expected to deal with this.
    ('\x01.-@\x02,xxx',     ('\x01.-@\x02', 'xxx')),
    ]
)

However, tox -e fix reformats these to squash the parameters together, so that you need to look more carefully at all the commas and parens to parse it:

@pytest.mark.parametrize(
    ("userinput", "envnames"),
    [
        ("a1", ("a1",)),
        ("a1,b2,c3", ("a1", "b2", "c3")),
        #   Zero-length environment names are ignored as being not present. It's not clear if this behaviour is intentional.
        (",,a1,,,b2,,", ("a1", "b2")),
        (",,", ()),
        #   Environment names with "invalid" characters are accepted here; the client is expected to deal with this.
        ("\x01.-@\x02,xxx", ("\x01.-@\x02", "xxx")),
    ],
)

Is the latter really what you want?

Yes 🤷‍♂️ this is one of those few cases where for me consistency trumps the benefits.

@0cjs
Copy link
Contributor

0cjs commented Jan 26, 2024

Ok, the squished formatting it is. Still waiting for answers on the other, more important questions, particularly the zero-length envname one.

@gaborbernat
Copy link
Member

I don't have a strong feeling about that. I feel like raising an error, not running anything or considering it as use the default values all the same equally right.

@0cjs
Copy link
Contributor

0cjs commented Jan 26, 2024

Ok. I think that raising an error is the best course: if the user's intent isn't clear, this asks him to clarify it. I expect the most likely reason for the user supplying something that looks like it has zero-length envnames is that he forgot to finish what he was typing (or just seriously misunderstands what he's doing).

I've just confirmed that argparse for type=int converts ValueError: invalid literal for int() with base 10: '1.2.3' to error: argument -i: invalid int value: '1.2.3'. Thus, chucking a ValueError from CliEnv will produce ...invalid CliEnv value: ',,foo,,', which is probably clear enough.

@0cjs 0cjs mentioned this issue Jan 26, 2024
5 tasks
@0cjs
Copy link
Contributor

0cjs commented Jan 27, 2024

@gaborbernat Where should one put a test that can check -e foo,bar vs. -e foo -e bar, i.e., going through ArgumentParser?l

@gaborbernat
Copy link
Member

Up to you 👍

@gaborbernat gaborbernat added the help:wanted Issues that have been acknowledged, a solution determined and a PR might likely be accepted. label Mar 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help:wanted Issues that have been acknowledged, a solution determined and a PR might likely be accepted.
Projects
None yet
Development

No branches or pull requests

6 participants