Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Alternative string-based approach to format string passthrough #889

Merged
merged 5 commits into from Nov 15, 2021

Conversation

joerick
Copy link
Contributor

@joerick joerick commented Oct 23, 2021

An alternative approach to, and closes #884 that uses regex instead of format_map.

While working on #884 I felt that there were still quite a lot of gotchas in that implementation i.e.:

  • curly brackets that look like positional replacement fields
  • unmatched curly brackets

These would require the user to provide additional escaping a-la {{ and }}.

This is an experiment to see if a non-format_map solution might work better.

I'm not sure which one I prefer! The main differences are that, in this PR,

  • this approach doesn't have the gotchas mentioned above. Unrecognised syntax is always passed through.
  • the escape sequences are different. This uses hash e.g. #{wheel} to passthrough a literal {wheel}, rather than {{wheel}}
  • the escape sequence only works on variables known to the command. So in fix: pass through unexpected format strings #884, doing {{unknown}} is passed through as {unknown}, but here the escape sequences are only removed on known params. e.g. \{unknown} passes through as \{unknown}, untouched. I think this is less disruptive, as most users won't have to think about the escaping unless they hit a naming conflict directly.
  • If there are any {{ escape sequences in the wild, due to the change in escaping rules, this would break that. However, I can only find one such instance

So I suppose I'm leaning towards this approach 1. What do you think @henryiii?

Footnotes

  1. Despite what I said in Do not use Python's str.format() for expanding placeholders in commands #840, sorry! Sometimes it takes writing the code to know.

Uses a regex-based solution instead of str.format
@henryiii
Copy link
Contributor

I liked this approach better originally, too, and still do. Less interference and less gotchas.

@joerick
Copy link
Contributor Author

joerick commented Oct 26, 2021

I need to look at this CI error on windows in detail, but it occurs to me that backslash (e.g. \{wheel}) might not be a great escape sequence on Windows because backslash is used as a path separator there. Alternative ideas welcome. I suppose we could return to {{wheel}}. Seems a bit esoteric to me, but maybe it's fine.

@henryiii
Copy link
Contributor

henryiii commented Oct 26, 2021

I'd be okay with {{wheel}} being the escape sequence for getting a literal {wheel}. Hopefully it's very rare in the wild. It would mean ${wheel} would be hard to produce in CI scripts (but not in TOML), but that's okay, there's very little reason to use that, and it's much better than it is today.

@joerick
Copy link
Contributor Author

joerick commented Nov 14, 2021

Managed to fix the backslash issue.

Having looked at it, {{wheel}} is actually quite tricky to ignore in regex, because you need to rely on both a negative lookbehind of { and a negative lookahead of } to properly match it. Perhaps #{wheel} would be okay as an escape sequence, what do you think?

@henryiii
Copy link
Contributor

If you fixed the backslash, I would think that's fine. #{} looks like Ruby to be, but admittedly, it would probably be pretty rare for shell scripts, I think. No strong opinion here, I expect escaping this would be very uncommon.

@joerick
Copy link
Contributor Author

joerick commented Nov 14, 2021

Cool. I'm only really doing the escaping thing for completeness, I'd be surprised if it's ever used! I added a note about its use to the docs, too, all the way at the bottom of the Options page.

So this is ready for another review.

# in repl as escape sequences
result = re.sub(
pattern=find_pattern,
repl=lambda _: str(value),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
repl=lambda _: str(value),
repl=value.replace('\\', r'\\'),

FYI, this is the suggestion in the Python docs for escaping here. Don't have a strong preference, just wanted to point it out. See the final suggestion in https://docs.python.org/3/library/re.html#re.escape.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I missed that. Will stick to the lambda for convenience

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants