Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Routing][AnnotationClassLoader] fix utf-8 encoding in default route name #31421

Merged
merged 1 commit into from May 18, 2019

Conversation

przemyslaw-bogusz
Copy link
Contributor

Q A
Branch? 3.4
Bug fix? yes
New feature? no
BC breaks? no
Deprecations? no
Tests pass? yes
License MIT

If controller, or one of its methods, contain a unicode character and you run:

./bin/console debug:router

you get this:
Zrzut ekranu 2019-05-8 o 14 00 48

This PR fixes it into this:
Zrzut ekranu 2019-05-8 o 14 00 59

@Tobion
Copy link
Member

Tobion commented May 8, 2019

I don't see why the Ä got scrambled. strtolower should just not lowercase it but keep it as-is. See https://3v4l.org/scUGN
Also this fix could be considered a BC break because it might change the route names of existing apps.

@nicolas-grekas nicolas-grekas added this to the 3.4 milestone May 8, 2019
@nicolas-grekas
Copy link
Member

I don't see why the Ä got scrambled

I'd suspect because strtolower is locale-sensitive.

We could have something like:

$name = str_replace('\\', '_', $class->name).'_'.$method->name;
$name = \function_exists('mb_strtolower') && preg_match('//u', $name) ? mb_strtolower($name, 'UTF-8') : strtolower($name);

@przemyslaw-bogusz
Copy link
Contributor Author

@Tobion That's very interesting. From my experience, strtolower breaks a string with unicode characters. I guess it indeed depends on locale.

As for the BC break - keeping in mind that each bug fix can be considered a BC break - this method is used only when a route has no name. So, it could potentially be a problem for someone, who uses routing but does not name the routes.

@przemyslaw-bogusz
Copy link
Contributor Author

@nicolas-grekas As I understand, you want to check if the string contains a unicode character? preg_match('//u', $name) returns true even on an empty string - https://3v4l.org/VLNrp. In my experience, comparing strlen to mb_strlen is a good way to do it. But couldn't we just check if mb_strtolower exists and apply it?

@nicolas-grekas
Copy link
Member

But couldn't we just check if mb_strtolower exists and apply it?

that would break if any other encoding is used - mb requires knowing the charset before using it.

@przemyslaw-bogusz
Copy link
Contributor Author

preg_match('//u', $name) as a way to validate encoding - I haven't thought about that one. Motto for this week: 'Learning while contributing'.

@nicolas-grekas
Copy link
Member

LGTM, could you add a test case please?

@nicolas-grekas
Copy link
Member

The tests are failing on appveyor. Windows doesn't like unicode names... We should find a test case that doesn't rely on the filesystem willingness :)

@przemyslaw-bogusz
Copy link
Contributor Author

I've changed the name of the fixture class and now all tests pass. By the way, I think there might be a few more similar unicode issues hidden inside the Symfony code, e.g. method underscore of Container.

@fabpot
Copy link
Member

fabpot commented May 18, 2019

Thank you @przemyslaw-bogusz.

@fabpot fabpot merged commit 7ab52d3 into symfony:3.4 May 18, 2019
fabpot added a commit that referenced this pull request May 18, 2019
…ault route name (przemyslaw-bogusz)

This PR was squashed before being merged into the 3.4 branch (closes #31421).

Discussion
----------

[Routing][AnnotationClassLoader] fix utf-8 encoding in default route name

| Q             | A
| ------------- | ---
| Branch?       | 3.4
| Bug fix?      | yes
| New feature?  | no
| BC breaks?    | no
| Deprecations? | no
| Tests pass?   | yes
| License       | MIT

If controller, or one of its methods, contain a unicode character and you run:
```
./bin/console debug:router
```
you get this:
![Zrzut ekranu 2019-05-8 o 14 00 48](https://user-images.githubusercontent.com/35422131/57374545-71863080-719b-11e9-999e-fe0a5051c089.png)

This PR fixes it into this:
![Zrzut ekranu 2019-05-8 o 14 00 59](https://user-images.githubusercontent.com/35422131/57374616-92e71c80-719b-11e9-9d13-5370213c22f7.png)

Commits
-------

7ab52d3 [Routing][AnnotationClassLoader] fix utf-8 encoding in default route name
This was referenced May 22, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants