Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Validating "slug" and deny line ending and the end. #1148

Closed
jedie opened this issue Aug 1, 2023 · 6 comments
Closed

Validating "slug" and deny line ending and the end. #1148

jedie opened this issue Aug 1, 2023 · 6 comments
Labels
Invalid Not a bug, PEBKAC, or an unsupported setup

Comments

@jedie
Copy link

jedie commented Aug 1, 2023

A "slug" regular expression can look like this: r'^[-a-zA-Z0-9_]+\Z'

But \Z can't be used in JSON schema, because the regular expression syntax used from JavaScript and they doesn't support \Z.
Other libraries like https://github.com/swaggest/php-json-schema has a check for this case, e.g.:

        if (substr($data, -2) === '\Z') {
            return 'Invalid regex: \Z is not supported';
        }

https://github.com/swaggest/php-json-schema/blob/2d7b8e22a9de2680b8639404502df0b24b834da6/src/Constraint/Format.php#L111-L113

This is missing in jsonschema.

If we replace \Z with $ (e.g.: r'^[-a-zA-Z0-9_]+\Z') then a string with line ending like foo\n is valid because jsonschema seems not use use re.DOTALL, isn't it?

Or did i miss something?

tl;td;

  • \Z should not be accepted
  • re.DOTALL should be used for regex
@Julian
Copy link
Member

Julian commented Aug 1, 2023

The specification recommends but does not require you use JavaScript syntax. In particular, if you know all the consumers of your schema it may be you want to use Python syntax, and/or to put your flag inside the regex.

@jedie
Copy link
Author

jedie commented Aug 1, 2023

Yes, but one of our schema consumer is a PHP project ;) So we can't use \Z :(

@Julian
Copy link
Member

Julian commented Aug 1, 2023

I'm not sure what else to offer then, you need to pick a common subset of regex syntax that works for what you're using.

\Z is valid in Python, so there's no reason to disallow it, and re.DOTALL isn't the default in Python so there's no reason to enable it, but you're welcome to enable it inside your own regex if you want it on and your other consumers are OK with it.

Otherwise you might be interested in #1142 but that's something off in the future, and you'd still need to find a PHP regex implementation for Python or a JS implementation for PHP.

@jedie
Copy link
Author

jedie commented Aug 1, 2023

you're welcome to enable it inside your own regex

Oh, is this possible? How?

@Julian
Copy link
Member

Julian commented Aug 1, 2023

See https://docs.python.org/3/library/re.html#regular-expression-syntax, specifically (?s) IIRC.

@Julian Julian closed this as completed Aug 1, 2023
@jedie
Copy link
Author

jedie commented Aug 2, 2023

See https://docs.python.org/3/library/re.html#regular-expression-syntax, specifically (?s) IIRC.

EDIT: I tested r'(?s)^[-a-zA-Z0-9_]+$' but foo\n is still valid. Thus, my statement that DOTALL is a solution is false :(

I don't see a way to solve the problem.

@Julian Julian added the Invalid Not a bug, PEBKAC, or an unsupported setup label Mar 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Invalid Not a bug, PEBKAC, or an unsupported setup
Projects
None yet
Development

No branches or pull requests

2 participants