Tokenizer/PHP: fix tokenization of the default keyword
As per: #3326 (comment)

> After `PHP::tokenize()`, the `DEFAULT` is still tokenized as `T_DEFAULT`. This causes the `Tokenizer::recurseScopeMap()` to assign it as the `scope_opener` to the `;` semi-colon at the end of the constant declaration, with the class close curly brace being set as the `scope_closer`.
> In the `PHP::processAdditional()` method, the `DEFAULT` is subsequently retokenized to `T_STRING` as it is preceded by a `const` keyword, but that is too late.
>
> The `scope_opener` being set on the semi-colon is what is causing the errors to be displayed for the above code sample.
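
As a minimal illustration of the starting point described in the quote, the sketch below uses PHP's native `token_get_all()` (called without the `TOKEN_PARSE` flag) on a small class constant declaration. The class name `Foo` and the variable names are made up for this example, and the snippet only shows what the native tokenizer reports; it does not reproduce the PHPCS scope map handling.

```php
<?php
// Hypothetical code sample: a class constant that is named DEFAULT.
$source = '<?php class Foo { const DEFAULT = 1; }';

// Collect the native token name for every token whose content is "default"
// (case-insensitive). Without the TOKEN_PARSE flag, token_get_all() does not
// take the surrounding context of a reserved word into account.
$names = [];
foreach (token_get_all($source) as $token) {
    if (is_array($token) === true && strtolower($token[1]) === 'default') {
        $names[] = token_name($token[0]);
    }
}

print_r($names);
```

The keyword token reported here is what then has to be changed to `T_STRING` before the scope map is built.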

The commit fixes this by:
1. Abstracting the list of `T_STRING` contexts out to a class property.
2. Using the list from the property in all places in the `Tokenizer\PHP` class where keyword tokens are (potentially) being re-tokenized to `T_STRING`, including in the `T_DEFAULT` tokenization code which was added to address the PHP 8.0 `match` expressions.
    Note: the issue was not introduced by the `match`-related code; however, that code being there does make it relatively easy to fix this particular case.
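
The two steps above can be sketched outside of PHPCS as follows. This is a simplified illustration and not the actual `Tokenizer\PHP` code: the function `classifyDefaultKeywords()` is hypothetical, the context list is a trimmed-down assumption, and the real tokenizer also applies `match`-specific checks that are left out here.

```php
<?php
// Assumption: a shortened version of the shared list of T_STRING contexts.
$tstringContexts = [
    T_OBJECT_OPERATOR      => true,
    T_CONST                => true,
    T_NS_SEPARATOR         => true,
    T_PAAMAYIM_NEKUDOTAYIM => true,
];

/**
 * Hypothetical helper: for every native T_DEFAULT token in $source, decide
 * whether it should be treated as T_STRING, based on a single lookup of the
 * previous non-empty token in the shared context list.
 */
function classifyDefaultKeywords(string $source, array $tstringContexts): array
{
    $empty    = [T_WHITESPACE => true, T_COMMENT => true, T_DOC_COMMENT => true];
    $result   = [];
    $lastCode = null;

    foreach (token_get_all($source) as $token) {
        // Single-character tokens are plain strings, all others are arrays.
        $id = is_array($token) === true ? $token[0] : $token;

        if ($id === T_DEFAULT) {
            $isName   = $lastCode !== null && isset($tstringContexts[$lastCode]) === true;
            $result[] = $isName === true ? 'T_STRING' : 'T_DEFAULT';
        }

        // Remember the last token that is not whitespace or a comment.
        if (isset($empty[$id]) === false) {
            $lastCode = $id;
        }
    }

    return $result;
}

$source = '<?php class Foo { const DEFAULT = 1; } switch ($a) { default: break; }';
$kinds  = classifyDefaultKeywords($source, $tstringContexts);
print_r($kinds);
```

Keeping the list in one place means a context such as `T_CONST` only has to be added once for every lookup to see it.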

While this doesn't address 3336 yet, it is a step towards addressing it and will sort out one of the most common causes of bugs.
jrfnl authored and gsherwood committed Jun 30, 2021
1 parent 9022411 commit 61dcfd6
Showing 1 changed file with 27 additions and 44 deletions.
71 changes: 27 additions & 44 deletions src/Tokenizers/PHP.php
@@ -462,6 +462,29 @@ class PHP extends Tokenizer
 T_TYPE_UNION => 1,
 ];
 
+/**
+* Contexts in which keywords should always be tokenized as T_STRING.
+*
+* @var array
+*/
+protected $tstringContexts = [
+T_OBJECT_OPERATOR => true,
+T_NULLSAFE_OBJECT_OPERATOR => true,
+T_FUNCTION => true,
+T_CLASS => true,
+T_INTERFACE => true,
+T_TRAIT => true,
+T_EXTENDS => true,
+T_IMPLEMENTS => true,
+T_ATTRIBUTE => true,
+T_NEW => true,
+T_CONST => true,
+T_NS_SEPARATOR => true,
+T_USE => true,
+T_NAMESPACE => true,
+T_PAAMAYIM_NEKUDOTAYIM => true,
+];
+
 /**
 * A cache of different token types, resolved into arrays.
 *
@@ -1177,16 +1200,7 @@ protected function tokenize($string)
 break;
 }
 
-$notMatchContext = [
-T_PAAMAYIM_NEKUDOTAYIM => true,
-T_OBJECT_OPERATOR => true,
-T_NULLSAFE_OBJECT_OPERATOR => true,
-T_NS_SEPARATOR => true,
-T_NEW => true,
-T_FUNCTION => true,
-];
-
-if (isset($notMatchContext[$finalTokens[$lastNotEmptyToken]['code']]) === true) {
+if (isset($this->tstringContexts[$finalTokens[$lastNotEmptyToken]['code']]) === true) {
 // Also not a match expression.
 break;
 }
@@ -1234,14 +1248,7 @@ protected function tokenize($string)
 if ($tokenIsArray === true
 && $token[0] === T_DEFAULT
 ) {
-$ignoreContext = [
-T_OBJECT_OPERATOR => true,
-T_NULLSAFE_OBJECT_OPERATOR => true,
-T_NS_SEPARATOR => true,
-T_PAAMAYIM_NEKUDOTAYIM => true,
-];
-
-if (isset($ignoreContext[$finalTokens[$lastNotEmptyToken]['code']]) === false) {
+if (isset($this->tstringContexts[$finalTokens[$lastNotEmptyToken]['code']]) === false) {
 for ($x = ($stackPtr + 1); $x < $numTokens; $x++) {
 if ($tokens[$x] === ',') {
 // Skip over potential trailing comma (supported in PHP).
@@ -1718,25 +1725,7 @@ function return types. We want to keep the parenthesis map clean,
 if ($tokenIsArray === true && $token[0] === T_STRING) {
 // Some T_STRING tokens should remain that way
 // due to their context.
-$context = [
-T_OBJECT_OPERATOR => true,
-T_NULLSAFE_OBJECT_OPERATOR => true,
-T_FUNCTION => true,
-T_CLASS => true,
-T_INTERFACE => true,
-T_TRAIT => true,
-T_EXTENDS => true,
-T_IMPLEMENTS => true,
-T_ATTRIBUTE => true,
-T_NEW => true,
-T_CONST => true,
-T_NS_SEPARATOR => true,
-T_USE => true,
-T_NAMESPACE => true,
-T_PAAMAYIM_NEKUDOTAYIM => true,
-];
-
-if (isset($context[$finalTokens[$lastNotEmptyToken]['code']]) === true) {
+if (isset($this->tstringContexts[$finalTokens[$lastNotEmptyToken]['code']]) === true) {
 // Special case for syntax like: return new self
 // where self should not be a string.
 if ($finalTokens[$lastNotEmptyToken]['code'] === T_NEW
@@ -2615,13 +2604,7 @@ protected function processAdditional()
 }
 }
 
-$context = [
-T_OBJECT_OPERATOR => true,
-T_NULLSAFE_OBJECT_OPERATOR => true,
-T_NS_SEPARATOR => true,
-T_PAAMAYIM_NEKUDOTAYIM => true,
-];
-if (isset($context[$this->tokens[$x]['code']]) === true) {
+if (isset($this->tstringContexts[$this->tokens[$x]['code']]) === true) {
 if (PHP_CODESNIFFER_VERBOSITY > 1) {
 $line = $this->tokens[$i]['line'];
 $type = $this->tokens[$i]['type'];