While creating a sniff for PHPCompatibility to detect the PHP 8.2 deprecation of two of the four syntaxes to embed variables/expressions within text strings, I realized that for select examples of the "type 4" syntax - "Variable variables (“${expr}”, equivalent to (string) ${expr})" -, PHPCS did not, and probably never did, tokenize those correctly.
Double quoted strings in PHPCS are normally tokenized as one token (per line), including any and all embedded variables and expressions to prevent problems in the scope map caused by braces within the string/expression.
However, for some double quoted strings using `${` syntax, this was not the case and these were tokenized as outlined below, which would still lead to the problems in the scope map.
Luckily, these syntax variants aren't used that frequently. Though I suspect if they were, this bug would have been discovered a lot sooner.
Either way, fixed now.
Including adding a dedicated test file based on examples related to the PHP 8.2 RFC.
Note: this test file tests that these embedding syntaxes are tokenized correctly, it doesn't test the other parts of the related `Tokenizers\PHP` code block (re-tokenizing to one token per line, handling of binary casts).
Example of the code for which the tokenizer was buggy:
```php
"${foo["${bar}"]}";
```
Was originally tokenized like so:
```
Ptr | Ln | Col | Cond | ( #) | Token Type | [len]: Content
-------------------------------------------------------------------------
2 | L3 | C 1 | CC 0 | ( 0) | T_ECHO | [ 4]: echo
3 | L3 | C 5 | CC 0 | ( 0) | T_WHITESPACE | [ 1]: ⸱
4 | L3 | C 6 | CC 0 | ( 0) | T_DOUBLE_QUOTED_STRING | [ 8]: "${foo["
5 | L3 | C 14 | CC 0 | ( 0) | T_DOLLAR_OPEN_CURLY_BRACES | [ 2]: ${
6 | L3 | C 16 | CC 0 | ( 0) | T_STRING_VARNAME | [ 3]: bar
7 | L3 | C 19 | CC 0 | ( 0) | T_CLOSE_CURLY_BRACKET | [ 1]: }
8 | L3 | C 20 | CC 0 | ( 0) | T_DOUBLE_QUOTED_STRING | [ 4]: "]}"
9 | L3 | C 24 | CC 0 | ( 0) | T_SEMICOLON | [ 1]: ;
10 | L3 | C 25 | CC 0 | ( 0) | T_WHITESPACE | [ 0]:
```
Will be tokenized (correctly) like so with the fix included in this PR:
```
Ptr | Ln | Col | Cond | ( #) | Token Type | [len]: Content
-------------------------------------------------------------------------
2 | L3 | C 1 | CC 0 | ( 0) | T_ECHO | [ 4]: echo
3 | L3 | C 5 | CC 0 | ( 0) | T_WHITESPACE | [ 1]: ⸱
4 | L3 | C 6 | CC 0 | ( 0) | T_DOUBLE_QUOTED_STRING | [ 18]: "${foo["${bar}"]}"
5 | L3 | C 24 | CC 0 | ( 0) | T_SEMICOLON | [ 1]: ;
6 | L3 | C 25 | CC 0 | ( 0) | T_WHITESPACE | [ 0]:
```
Refs:
* https://www.php.net/manual/en/language.types.string.php#language.types.string.parsing
* https://wiki.php.net/rfc/deprecate_dollar_brace_string_interpolation
* https://gist.github.com/iluuu1994/72e2154fc4150f2258316b0255b698f2