Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disappearing comments #950

Open
pmjones opened this issue Sep 28, 2023 · 4 comments
Open

Disappearing comments #950

pmjones opened this issue Sep 28, 2023 · 4 comments

Comments

@pmjones
Copy link

pmjones commented Sep 28, 2023

BLUF: The 4.x parser appears to lose comments in some cases; while their removal has no effect on the logical operation of code rebuilt from the AST, their disappearance is surprising when using something like PHP-Styler.


Given code like the following ...

<?php
switch ($foo) {
    /* this comment disappears */
}

function bar(/* this comment disappears */)
{
}

... the comments noted do not appear at all in the array of parsed nodes:

array(2) {
  [0]=>
  object(PhpParser\Node\Stmt\Switch_)#641 (3) {
    ["attributes":protected]=>
    array(2) {
      ["startLine"]=>
      int(2)
      ["endLine"]=>
      int(4)
    }
    ["cond"]=>
    object(PhpParser\Node\Expr\Variable)#639 (2) {
      ["attributes":protected]=>
      array(2) {
        ["startLine"]=>
        int(2)
        ["endLine"]=>
        int(2)
      }
      ["name"]=>
      string(3) "foo"
    }
    ["cases"]=>
    array(0) {
    }
  }
  [1]=>
  object(PhpParser\Node\Stmt\Function_)#640 (8) {
    ["attributes":protected]=>
    array(2) {
      ["startLine"]=>
      int(6)
      ["endLine"]=>
      int(8)
    }
    ["byRef"]=>
    bool(false)
    ["name"]=>
    object(PhpParser\Node\Identifier)#642 (2) {
      ["attributes":protected]=>
      array(2) {
        ["startLine"]=>
        int(6)
        ["endLine"]=>
        int(6)
      }
      ["name"]=>
      string(3) "bar"
    }
    ["params"]=>
    array(0) {
    }
    ["returnType"]=>
    NULL
    ["stmts"]=>
    array(0) {
    }
    ["attrGroups"]=>
    array(0) {
    }
    ["namespacedName"]=>
    NULL
  }
}

Is it possible to retain them somehow, so that PHP-Styler can reproduce them faithfully when pretty-printing the AST?

@nikic
Copy link
Owner

nikic commented Sep 28, 2023

The core problem here is that there is no place to put these comments. We do retain comments in cases like

if ($x) {
    // comment
}

by inserting a Stmt\Nop node that carries the comment. However, in the two cases above, Switch_::$cases only accepts Cases and Function_::$params only accepts Params, so we can't insert a nop statement there, at least not without breaking the type system.

I would love to retain these comments somehow, but I'm not sure how this could be represented.

It is possible to recover these comments from the tokens, e.g. #382 (comment) has some sample code to get all "interior" comments of a node (i.e. directly in the node, not part of a subnode). I think it should basically still work though can be a bit simplified nowadays. That just gives you the comments though, not where exactly they occurred, so it might not be useful for your use-case.

@pmjones
Copy link
Author

pmjones commented Sep 28, 2023

Thanks for the pointers @nikic -- I will try out the tokens and report back.

@pmjones
Copy link
Author

pmjones commented Oct 3, 2023

Have not tried the tokens yet, but wanted to point this out in the mean time: the final comment on the final array element disappears; the parser does not appear to retain it. This may be a different problem from the above; if so, I can open a separate issue.

$map = [
    34 => 'quot', // quotation mark
    38 => 'amp', // ampersand
    60 => 'lt', // less-than sign
    62 => 'gt', // greater-than sign -- this comment disappears
];

The parser nodes and output:

Parser Nodes: array(1) {
  [0]=>
  object(PhpParser\Node\Stmt\Expression)#658 (2) {
    ["attributes":protected]=>
    array(2) {
      ["startLine"]=>
      int(2)
      ["endLine"]=>
      int(7)
    }
    ["expr"]=>
    object(PhpParser\Node\Expr\Assign)#657 (3) {
      ["attributes":protected]=>
      array(2) {
        ["startLine"]=>
        int(2)
        ["endLine"]=>
        int(7)
      }
      ["var"]=>
      object(PhpParser\Node\Expr\Variable)#639 (2) {
        ["attributes":protected]=>
        array(2) {
          ["startLine"]=>
          int(2)
          ["endLine"]=>
          int(2)
        }
        ["name"]=>
        string(3) "map"
      }
      ["expr"]=>
      object(PhpParser\Node\Expr\Array_)#656 (2) {
        ["attributes":protected]=>
        array(4) {
          ["startLine"]=>
          int(2)
          ["endLine"]=>
          int(7)
          ["kind"]=>
          int(2)
          ["expansive"]=>
          bool(true)
        }
        ["items"]=>
        array(4) {
          [0]=>
          object(PhpParser\Node\Expr\ArrayItem)#642 (5) {
            ["attributes":protected]=>
            array(2) {
              ["startLine"]=>
              int(3)
              ["endLine"]=>
              int(3)
            }
            ["key"]=>
            object(PhpParser\Node\Scalar\LNumber)#640 (2) {
              ["attributes":protected]=>
              array(4) {
                ["startLine"]=>
                int(3)
                ["endLine"]=>
                int(3)
                ["rawValue"]=>
                string(2) "34"
                ["kind"]=>
                int(10)
              }
              ["value"]=>
              int(34)
            }
            ["value"]=>
            object(PhpParser\Node\Scalar\String_)#641 (2) {
              ["attributes":protected]=>
              array(4) {
                ["startLine"]=>
                int(3)
                ["endLine"]=>
                int(3)
                ["kind"]=>
                int(1)
                ["rawValue"]=>
                string(6) "'quot'"
              }
              ["value"]=>
              string(4) "quot"
            }
            ["byRef"]=>
            bool(false)
            ["unpack"]=>
            bool(false)
          }
          [1]=>
          object(PhpParser\Node\Expr\ArrayItem)#646 (5) {
            ["attributes":protected]=>
            array(3) {
              ["startLine"]=>
              int(4)
              ["comments"]=>
              array(1) {
                [0]=>
                object(PhpParser\Comment)#643 (7) {
                  ["text":protected]=>
                  string(17) "// quotation mark"
                  ["startLine":protected]=>
                  int(3)
                  ["startFilePos":protected]=>
                  int(33)
                  ["startTokenPos":protected]=>
                  int(14)
                  ["endLine":protected]=>
                  int(3)
                  ["endFilePos":protected]=>
                  int(49)
                  ["endTokenPos":protected]=>
                  int(14)
                }
              }
              ["endLine"]=>
              int(4)
            }
            ["key"]=>
            object(PhpParser\Node\Scalar\LNumber)#644 (2) {
              ["attributes":protected]=>
              array(5) {
                ["startLine"]=>
                int(4)
                ["comments"]=>
                array(1) {
                  [0]=>
                  object(PhpParser\Comment)#643 (7) {
                    ["text":protected]=>
                    string(17) "// quotation mark"
                    ["startLine":protected]=>
                    int(3)
                    ["startFilePos":protected]=>
                    int(33)
                    ["startTokenPos":protected]=>
                    int(14)
                    ["endLine":protected]=>
                    int(3)
                    ["endFilePos":protected]=>
                    int(49)
                    ["endTokenPos":protected]=>
                    int(14)
                  }
                }
                ["endLine"]=>
                int(4)
                ["rawValue"]=>
                string(2) "38"
                ["kind"]=>
                int(10)
              }
              ["value"]=>
              int(38)
            }
            ["value"]=>
            object(PhpParser\Node\Scalar\String_)#645 (2) {
              ["attributes":protected]=>
              array(4) {
                ["startLine"]=>
                int(4)
                ["endLine"]=>
                int(4)
                ["kind"]=>
                int(1)
                ["rawValue"]=>
                string(5) "'amp'"
              }
              ["value"]=>
              string(3) "amp"
            }
            ["byRef"]=>
            bool(false)
            ["unpack"]=>
            bool(false)
          }
          [2]=>
          object(PhpParser\Node\Expr\ArrayItem)#650 (5) {
            ["attributes":protected]=>
            array(3) {
              ["startLine"]=>
              int(5)
              ["comments"]=>
              array(1) {
                [0]=>
                object(PhpParser\Comment)#647 (7) {
                  ["text":protected]=>
                  string(12) "// ampersand"
                  ["startLine":protected]=>
                  int(4)
                  ["startFilePos":protected]=>
                  int(68)
                  ["startTokenPos":protected]=>
                  int(23)
                  ["endLine":protected]=>
                  int(4)
                  ["endFilePos":protected]=>
                  int(79)
                  ["endTokenPos":protected]=>
                  int(23)
                }
              }
              ["endLine"]=>
              int(5)
            }
            ["key"]=>
            object(PhpParser\Node\Scalar\LNumber)#648 (2) {
              ["attributes":protected]=>
              array(5) {
                ["startLine"]=>
                int(5)
                ["comments"]=>
                array(1) {
                  [0]=>
                  object(PhpParser\Comment)#647 (7) {
                    ["text":protected]=>
                    string(12) "// ampersand"
                    ["startLine":protected]=>
                    int(4)
                    ["startFilePos":protected]=>
                    int(68)
                    ["startTokenPos":protected]=>
                    int(23)
                    ["endLine":protected]=>
                    int(4)
                    ["endFilePos":protected]=>
                    int(79)
                    ["endTokenPos":protected]=>
                    int(23)
                  }
                }
                ["endLine"]=>
                int(5)
                ["rawValue"]=>
                string(2) "60"
                ["kind"]=>
                int(10)
              }
              ["value"]=>
              int(60)
            }
            ["value"]=>
            object(PhpParser\Node\Scalar\String_)#649 (2) {
              ["attributes":protected]=>
              array(4) {
                ["startLine"]=>
                int(5)
                ["endLine"]=>
                int(5)
                ["kind"]=>
                int(1)
                ["rawValue"]=>
                string(4) "'lt'"
              }
              ["value"]=>
              string(2) "lt"
            }
            ["byRef"]=>
            bool(false)
            ["unpack"]=>
            bool(false)
          }
          [3]=>
          object(PhpParser\Node\Expr\ArrayItem)#654 (5) {
            ["attributes":protected]=>
            array(3) {
              ["startLine"]=>
              int(6)
              ["comments"]=>
              array(1) {
                [0]=>
                object(PhpParser\Comment)#651 (7) {
                  ["text":protected]=>
                  string(17) "// less-than sign"
                  ["startLine":protected]=>
                  int(5)
                  ["startFilePos":protected]=>
                  int(97)
                  ["startTokenPos":protected]=>
                  int(32)
                  ["endLine":protected]=>
                  int(5)
                  ["endFilePos":protected]=>
                  int(113)
                  ["endTokenPos":protected]=>
                  int(32)
                }
              }
              ["endLine"]=>
              int(6)
            }
            ["key"]=>
            object(PhpParser\Node\Scalar\LNumber)#652 (2) {
              ["attributes":protected]=>
              array(5) {
                ["startLine"]=>
                int(6)
                ["comments"]=>
                array(1) {
                  [0]=>
                  object(PhpParser\Comment)#651 (7) {
                    ["text":protected]=>
                    string(17) "// less-than sign"
                    ["startLine":protected]=>
                    int(5)
                    ["startFilePos":protected]=>
                    int(97)
                    ["startTokenPos":protected]=>
                    int(32)
                    ["endLine":protected]=>
                    int(5)
                    ["endFilePos":protected]=>
                    int(113)
                    ["endTokenPos":protected]=>
                    int(32)
                  }
                }
                ["endLine"]=>
                int(6)
                ["rawValue"]=>
                string(2) "62"
                ["kind"]=>
                int(10)
              }
              ["value"]=>
              int(62)
            }
            ["value"]=>
            object(PhpParser\Node\Scalar\String_)#653 (2) {
              ["attributes":protected]=>
              array(4) {
                ["startLine"]=>
                int(6)
                ["endLine"]=>
                int(6)
                ["kind"]=>
                int(1)
                ["rawValue"]=>
                string(4) "'gt'"
              }
              ["value"]=>
              string(2) "gt"
            }
            ["byRef"]=>
            bool(false)
            ["unpack"]=>
            bool(false)
          }
        }
      }
    }
  }
}
<?php
$map = [
    34 => 'quot', // quotation mark
    38 => 'amp', // ampersand
    60 => 'lt', // less-than sign
    62 => 'gt',
];

@RobertMe
Copy link

To add another case to this:

function foo(/* retained */ $param) /* not retained */
{
}

Cause is what @nikic wrote above: there is nothing to attach the comment to.

Was investing this as at some point at my company we decided to (also) write return types (and parameter types) like this as a comment at a time we hadn't updated to a PHP version which supported type hints yet. This with the idea to later just drop the /* + */. As this didn't happen (yet), and now with tools like Rector we don't really need them anymore (as Rector can automatically determine the types) / we would still need to go over them one by one to manually check them (if Rector can't / doesn't insert them because of uncertainty). So I was looking into automatically dropping those comments using some Rector rule based on the condition that the function / method has an "official" return type (or param has an "official" type set). But then found out these comments aren't retained in the AST for functions / methods. Did a bit of debugging and found this to be the obvious cause (nothing to really attach them too, as in the lexer / parser they're attached to the { token which doesn't represent a node). In this specific case they might be attached to the function / method node I guess (as comments also is an array that would be possible without overwriting something like PHPDoc), but that may break existing tools and would be a very specific case I presume. And it would also be hard to determine if a comment belongs to for example an upcoming node, or the previous node (in this case it might be attached "backwards" to the function / method node, in the example from the issue description itself with a switch it might be attached to the switch node? to previous node as well, but under normal circumstances it obviously needs to be attached to an upcoming node as is always done currently).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants