Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proper support for flexible heredoc/nowdoc syntax #1274

Closed
czosel opened this issue Jan 3, 2020 · 27 comments
Closed

Proper support for flexible heredoc/nowdoc syntax #1274

czosel opened this issue Jan 3, 2020 · 27 comments

Comments

@czosel
Copy link
Collaborator

czosel commented Jan 3, 2020

Currently, the new flexible heredoc/nowdoc syntax is "destroyed" during formatting:

$php73FlexibleNowdoc = <<<'EOD'
      Example of string
      spanning multiple lines
      using nowdoc syntax.
      EOD;

is printed as

$php73FlexibleNowdoc = <<<'EOD'
Example of string
spanning multiple lines
using nowdoc syntax.
turns into

EOD;

Also, there seems to be an extra linebreak before the end.

Concerning expected behavior, I'd propose to leave the indentation the same way we find it. Thoughts?

cc @evilebottnawi @loilo

@loilo
Copy link
Collaborator

loilo commented Jan 3, 2020

First things first: I realized the indentation reset (as described above) currently only affects nowdoc blocks, not heredoc. We should definitely align their behaviors.

For the following, I'm going to assume that this has already been fixed and flexible nowdoc and heredoc actually work the same. I'm also only gonna use "heredoc" below, please read that as "heredoc/nowdoc".

This is a tricky one.

I'd say, it would be The Prettier Way™ (and also the prettier way 🙃) to figure out one way to print heredocs and to enforce that.

One process which sounds reasonable to me: If the starting tag goes on its own line, align the content with that. Otherwise indent it one tabWidth further.


Examples of the described rule

Example 1

The heredoc-opening tag goes on its own line.

Input:

some_function(<<<'EOT'
Hello World!
EOT
);

Current output:

some_function(
    <<<'EOT'
Hello World!
EOT
);

Envisioned output:

some_function(
    <<<'EOT'
    Hello World!
    EOT
);

Example 2

The heredoc-opening tag goes on the same line as the preceding code.

Input:

$string = <<<'EOT'
Hello World!
EOT;

Current output:

$string = <<<'EOT'
Hello World!
EOT;

Envisioned output:

$string = <<<'EOT'
    Hello World!
    EOT;

However, this is not doable consistently without an --engine flag because we obviously cannot assume PHP 7.4 for everyone.

We could apply this rule to all heredocs that are already indented (and thus signaling PHP 7.4 availability), but while this would certainly work, I'd probably go with the more consistent way until we implement an option for the engine. (It's high time for that anyway I guess, but who's got that time. ¯\_(ツ)_/¯)


Until then, I'd actually vote for keeping the current behavior. Resetting the heredoc indentation can be annoying, but it's the only way we can be consistent about this. Sure, we could just keep the indentation untouched, but Prettier is deliberately changing a lot of indentation around that already.

If we keep the user's indentation (relative to the opening tag), this will lead to ugly cases like this:

// Input:
some_function(<<<'EOT'
    Hello World!
    EOT
);

// Output:
some_function(
    <<<'EOT'
        Hello World!
        EOT
);

In my opinion, if we'd want to keep the user's indentation, we would need to figure out the user's most likely intention. I'd rather not open that Pandora's box.

@czosel
Copy link
Collaborator Author

czosel commented Jan 4, 2020

I agree with all the points you're making! 👍
Since our development resources are limited, it might make most sense to actually implement an --engine flag now. In that context we should also change the way the trailingCommaPHP option works: I'd only keep the none and all options, and evaluate it together with the engine setting to decide which trailing commas actually need to be included (If one want's trailing commas, then probably wherever the current PHP version allows them).

cc @evilebottnawi

@loilo
Copy link
Collaborator

loilo commented Jan 4, 2020

Agreed. I'll go ahead and create a needs engine label and attach that to some issues where we've discussed the necessity of such a flag to solve it properly.

@czosel
Copy link
Collaborator Author

czosel commented Jan 4, 2020

👍

Another question: Is engine really a good name? I associate it more with the Zend engine, which is not directly related to the PHP version number. How about phpVersion? (Or PHPVersion?)

@loilo
Copy link
Collaborator

loilo commented Jan 4, 2020

Was thinking about that too. I don't really associate it with Zend (mostly because I've never came in touch with that), but a name with more clarity should be preferred anyways.

@loilo
Copy link
Collaborator

loilo commented Jan 4, 2020

runtime would be another alternative. It's not as extremly obvious as phpVersion, but it's concise yet unambiguous.

@czosel
Copy link
Collaborator Author

czosel commented Jan 4, 2020

I also prefer "runtime" over "engine", but I guess "version" ist indeed the most obvious one. And I'd prepend "PHP" to avoid future collisions with prettier core or other plugins.

@loilo
Copy link
Collaborator

loilo commented Jan 4, 2020

Collisions are a good point. I'd go with phpVersion then. (PHPVersion would be more correct but also is completely weird. 🤷‍♂)

@czosel
Copy link
Collaborator Author

czosel commented Jan 4, 2020

Totally agree 👍

@czosel
Copy link
Collaborator Author

czosel commented Jan 4, 2020

Another question: What should the default be? The oldest supported version, i.e. 7.2? Or should there even be a default?

@loilo
Copy link
Collaborator

loilo commented Jan 4, 2020

I don't think there should be a default. There's no way we can make any reliable assumptions that people use a supported PHP version, and there's no reason to constrain the formatter to an old PHP version because nobody bothered to change the option's default.

@czosel
Copy link
Collaborator Author

czosel commented Jan 4, 2020

That's true, but I'm not sure if it's really worth giving up our zero-config goodness over it. Let's make a quick overview:

Feature Available since
Flexible heredoc/nowdoc syntax 7.3
Array shorthand notation 5.4
Trailing commas in arrays and lists 5.0 (?)
Trailing commas in "use" 7.2
Trailing commas in function calls 7.3

I think for those few features it's almost a pity to force users to configure something before they can get productive. If one does the initial formatting with an incorrect setting, it's trivial to resolve that issue.

I'd propose to

  • check community best practices if the current default for trailingCommaPHP (none) is still correct
  • set the default version to 7.2 (or 7.1 if the trailingCommaPHP default is changed to all, such that we're not getting trailing commas for "use" by default)

This would allow to keep the plugin zero-config with optimal support from 5.4 to 7.1/7.2 and very minor drawbacks for 7.3 and beyond.

Another idea to work around this issue was to read the PHP version from composer.json, as suggested here.

Update: The PHP community adopted trailing commas in multiline arrays pretty consistently. It might be best to change the default for trailing commas to all, which would mean setting the default version to 7.1.

@loilo
Copy link
Collaborator

loilo commented Jan 4, 2020

I thought about reading from composer.json as well, but I have no idea how that would look in practice (and whether or not any Prettier plugin does some similar environment detection already).


I'm pretty torn on the default thing. I definitely see the appeal of keeping the Prettier setup zero-config. So, what's the pros/cons here?

Pro default value:

  • Keep the setup config-less, in true Prettier fashion. This is a huge thing in the spirit of Prettier; also enforcing a configuration is a needless annoyance and sometimes even a clarity issue due to one more config file flying around the project root.

Contra default value:

  • A lot of projects (at least from my everyday work experience) are still running on end-of-life versions of PHP. Prettier would break their code. However, this problem is alleviated by the fact that the incompatible features would all cause upfront syntax errors (not runtime errors) and should therefore be detected immediately.
  • Projects running on the latest PHP versions (7.3 + 7.4) may miss out on useful features (I like flexible heredoc/nowdoc indentation a lot) due to the lack of awareness of the version flag. This is unfortunate, but not a deal breaker.

Honestly, I don't feel like I could wholeheartedly weigh in on either side. But I'm bad at making decisions anyway. @evilebottnawi, thoughts?


These are the things to be balanced based on opinion. There is also a very practical issue with the default version which I'm not sure how to solve:

Would we still use the latest parser version? If not, we'd need to implement options for a PHP 7.1/7.2 parser (as far as I can tell, only switches for 7.0, 7.3 and 7.4 exist). This might also annoy users who use PHP 7.3/7.4 features and get a parser error.

If we keep using the latest parser and settle for PHP 7.2 as the default, how would we handle encountering a PHP 7.3/7.4 feature? Would we magically upgrade the phpVersion? Would that lead to the formatter printing code before that feature in PHP 7.2 and code after that in another version? Or would we need to check for this upfront? How would we do that without a performance penalty? Detecting arrow functions would probably be straightforward due to their token, but any trailing commas would probably be really hard or impossible to detect.

@czosel
Copy link
Collaborator Author

czosel commented Jan 4, 2020

A lot of projects (at least from my everyday work experience) are still running on end-of-life versions of PHP. Prettier would break their code. However, this problem is alleviated by the fact that the incompatible features would all cause upfront syntax errors (not runtime errors) and should therefore be detected immediately.

I think people who use PHP <= 5.4 in production are used to running into issues these days. This analysis show that usage is insignificant. I guess putting a note in our README concerning version support would solve this, something like:

Version support
5.5 - 7.2: 😎 You don't need to configure anything! (Group 1)
7.3, 7.4: ✨ Specify the version to enable modern features in printer output (Group 2)
5.0 - 5.4: ⚠️ Specify the version to avoid incompatible output (Group 3)

According to the analysis, roughly two thirds of our users would be in group 1, one third in group 2, and almost no one in group 3. If one accounts for the fact that users of very old PHP version often don't use composer either and therefore wouldn't show up in the statistics, this might shift a little bit (*) - but in the future most users will move into groups 1 and 2.

(*) This analysis reports that roughly 15% of all Websites still use PHP 5.3 and 5.4 - but I think that the analysis based on composer installs is more relevant to us because we're catering PHP developers, and most of the Websites using old PHP versions are probably just Wordpress sites that were never updated 😉

I think we should build as little version-dependent stuff into our plugin as possible:

  • always use the latest parser
  • if we encounter a feature which is only allowed in versions which are more modern than the configured version, we print it according to the currently configured version - that way, one can use the plugin to migrate to an older PHP version (example: drop trailing commas, remove heredoc indentation). Exceptions to that rule would be features that go beyond basic formatting and can't be trivially printed differently such that they work in older PHP versions (e.g. arrow functions). We don't want to implement something similar to Babel for JS here.

Prettier for JS just prints modern features the same way they are found, and adapts formatting choices and option defaults very conservatively based on platform support - they're changing the default for trailing commas to the "es5" setting now (in Prettier 2.0)! prettier/prettier#68


Alternative idea

When I think more about the approach Prettier for JS is taking, there might be another alternative: If the end tag of a heredoc/nowdoc statement is indented, print it according to the "modern style" suggested above - otherwise print it in the compatible style we're using now. The obvious downside with this approach is that you'd manually have to indent your end tag to get optimal formatting - very similar to how you currently have to make the manual change from array() to shorthand array notation!

Examples

Example 1

The indented end tag signals PHP 7.3 or above.

Input:

some_function(<<<'EOT'
    Hello World!
    EOT
);

Current output:

some_function(
    <<<'EOT'
Hello World!
EOT
);

Envisioned output:

some_function(
    <<<'EOT'
    Hello World!
    EOT
);

Example 2

The unindented end tag signals a version which is possibly lower than 7.3:

Input:

$string = <<<'EOT'
Hello World!
EOT;

Output:

$string = <<<'EOT'
Hello World!
EOT;

I'm currently a bit torn on the whole story - maybe best to sleep on it. Might also an idea to boil the whole argument down a bit and then put it up for a vote (in a separate issue)?

@loilo
Copy link
Collaborator

loilo commented Jan 4, 2020

The points you made are pretty convincing, mostly because I didn't read this part thoroughly enough:

set the default version to 7.2 (or 7.1 if the trailingCommaPHP default is changed to all, such that we're not getting trailing commas for "use" by default)

Which is to say I didn't realize that this really means that every version from as early as 5.5 up until as late as 7.2 is unproblematic. This would definitely be good enough for me. 👍

That said: Isn't it even from 5.4 to 7.2? PHP 5.5 has no new syntactic novelties (which we use), has it?

[…] I think that the analysis based on composer installs is more relevant to us […]

Absolutely. I'd assume that you only add Prettier to a maintained project, and maintained projects (especially maintained to the degree that someone adds a new formatter) running on PHP 5.4 should be the absolute exception.


I also overestimated the problems when features newer than the default are encountered. I somehow supposed that we would be required to "upgrade" the formatter at that point — and didn't consider we could just format those features and then continue with the default version. (Which is to say that I like your "alternative idea" most. It's not optimal, but as I outlined above, the optimal solution seems to not be realizable.)


maybe best to sleep on it

Yeah, let's do that — but I think your suggestion is very viable.

@czosel
Copy link
Collaborator Author

czosel commented Jan 5, 2020

That said: Isn't it even from 5.4 to 7.2? PHP 5.5 has no new syntactic novelties (which we use), has it?

You're absolutely right!

After sleeping on it, I think we have basically have two ways to proceed:

Option 1: Add phpVersion option

  • introduce new phpVersion option
    • default: "7.1" (but works all the way down to 5.4)
  • remove all option values of trailingCommaPHP except "none" and "all", change the default to "all" (which checks the phpVersion setting to determine which trailing commas can be printed)
  • print array(...) as shorthand [...] except when phpVersion < 5.4
  • print heredoc/nowdoc in "classic" (non-flexible) style except when phpVersion >= 7.3

In short, the phpVersion setting would be somewhat similar to a "config preset", i.e. by specifying one option we can make a couple of specific formatting decisions for the user. We still don't loose our zero-config goodness because the default works perfectly for 5.4 - 7.2 and still quite well for 7.3 and beyond.

Option 2: "Guess" intended version based on state of source code

This is basically what we're doing right now, and what prettier for JS is doing.

  • trailingCommaPHP options remain unchanged, but we change the default to php5
  • print array(...) as shorthand [...] and lift our minimum supported version to 5.4 (or add an option if this causes actual problems for users)
  • print heredoc/nowdoc in flexible style when we find the end tag to be indented

I'm leaning towards option 1:

  • very reasonable behavior with zero configuration and optimal behavior with specifying just one option (phpVersion)
  • would help us to keep the total amount of options small in the future
  • allows keeping the printing "explicit", i.e. mostly dependent on the input AST and not the input formatting

cc @evilebottnawi

@loilo
Copy link
Collaborator

loilo commented Jan 5, 2020

I agree with option 1. Option 2 is not too far from what we have right now. It's not bad, but IMO it keeps us too far away from the results we can achieve with 1.

Prettier for JS may go well with guessing, but I think it would be hard for us to adopt that procedure one-to-one — not least of all because JS never gets breaking changes. The ones in PHP major versions are incredibly rare, but they exist, and if somebody runs into them because of Prettier, this may be very annoying to deal with.

@czosel
Copy link
Collaborator Author

czosel commented Jan 5, 2020

It's not bad, but IMO it keeps us too far away from the results we can achieve with 1.

Totally agree! 👍

Also, in JS a version option would much harder to define because people like to transpile modern language features using Babel, sometimes even on a feature-by-feature basis.

I'll go ahead and create a new issue to outline the upcoming changes and link to it in all the issues that are affected by it, such that people can still give feedback.

@alexander-akait
Copy link
Member

I am sorry, I am very busy right now, do not have time on this project, as soon as I deal with everything, I will return and continue to help so feel free to improve project 😄

@czosel
Copy link
Collaborator Author

czosel commented Jan 6, 2020

@evilebottnawi 😅 No worries, wishing you success with your tasks and looking forward to continue working with you on this in the future!

@karptonite
Copy link
Contributor

Now that the phpVersion flag is implemented, I'd love to see real support for flexible heredoc/nowdoc. In line with the Prettier philosophy, choosing one version (probably indented either one level deeper or to the same level as the line above) makes sense to me. Any thoughts? I see from Issue #1291 that @czosel had a go at this and had trouble figuring out the indent level. :(

@mwchambers
Copy link

mwchambers commented Oct 18, 2020

Hi @czosel and anyone else who can help,

I can see that various commits have been merged relating to the flexible HEREDOC/NOWDOC syntax.

e.g. #1287 , #1291, #1287

The indentation before the end identifier is preserved when using the NOWDOC syntax, but not as I would have hoped for HEREDOC.

I wonder if you can advice whether this is expected behaviour?

Here's a playground example:

NOWDOC/HEREDOC indentation example

I was expecting the HEREDOC to be indented like it is for the NOWDOC when using --php-version=7.4

Thanks

@czosel
Copy link
Collaborator Author

czosel commented Oct 24, 2020

Thanks for bringing this up @mwchambers. I just took a look, and making heredoc behave exacly the same way as nowdoc was a simple change: #1574

@mwchambers
Copy link

@czosel Thank you, that is working perfectly now with that change applied. 👍

@czosel
Copy link
Collaborator Author

czosel commented Oct 26, 2020

@mwchambers Great, thanks for testing! We'll make a new release once #1576 has landed.

@czosel
Copy link
Collaborator Author

czosel commented Oct 26, 2020

Released as v0.15.2 🎉

@czosel
Copy link
Collaborator Author

czosel commented Nov 14, 2020

I'll close this in favor of #1590 to make it more clear what can still be improved.

@czosel czosel closed this as completed Nov 14, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants