Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent calling twig_get_attribute by generating the guessed PHP code instead #3928

Open
wants to merge 8 commits into
base: 3.x
Choose a base branch
from

Conversation

JoshuaBehrens
Copy link

Once upon a time I had to find out why a twig template is slow (350ms). In my SPX icicle graph I saw twig_get_attribute being used quite often and is taking some time:

11,5% of exclusive time of twig_get_attribute

At this time I started to dig into what this method does. If I get it right it guesses whatever how to get to the next value by a given attribute as it tries to get one step deeper from the dot-path notation we use everywhere. I see that this is needed for such a dynamically typed language that is the base for this.

A template like this {{ foo.bar }} could mean:

  1. $foo->bar
  2. $foo->bar()
  3. $foo->getBar()
  4. $foo->isBar()
  5. $foo['bar']

I thought it could be at least cached so the choice does not need to be calculated twice but this is only valid if you have the values to render the template. So I have to use something else to improve the performance. Now imagine all the paths one could take for {{ foo.bar.thing }}. This can be a lot but if predict the path to be $foo->getBar()->getThing() because you just pass in your structs with getters and you can know it ahead.

For my IDE to support me in programming I use Twig comments with PHP type hints. I thought if this helps me to predict what methods I can call, Twig could use this info as well. PHPStan and Psalm also help me in my daily work as well so maybe we can add some sort of compiler optimizations like we know from C/++:

  • Level 0 acts like this before
  • Level 1 is preferring predict access but falls back to (generates more code but slow code should be executed less)
  • Level 2 is only using predicted access (less code for clear paths and only falling back to null)
  • Level 3 is throwing exceptions when paths are not resolvable (code like in level 2 but templates can be validated at compile time).

What I implemented in this pull request is a proof of concept with basic variable type cache for different scopes. These scopes can be influenced by code comments {# @var type \DateTimeInterface #}, that you already do, and with a specific block {% type "DateTimeInterface" %} to be more specificly integrated into the language so one could differ between IDE hints and language specific hints. I already have scopes by BodyNode and WithNode. Types that can be extracted from simple expressions are understood without type hints (e.g. array expressions). I prepared code, that wants to behave like level 2 optimization but level 1 could be done, when changing src/Node/Expression/GetAttrExpression.php:51 by removing the true || and using a createAutoInlineSourceCompiler instead.

Open todo / questions:

  • I used match expression to ensure I have macro and other variables at hand without having to use them in closures, that would be better for lower PHP versions. Can we just use it and expect PHP 8 or do we have to use closures and keep track which vars to use?
  • Where to integrate Security/Sandboxing features and checks? I do not know all the caveats and what do look for when testing
  • How would one add support for custom method calls for people doing __call and __get magic?
  • I do not like the TypeFactory but I do not see DI usage in structs. Any suggestions how to change that to fit into the project?
  • Can the expression parser do parsing of PHP Types (union and intersection)? Do we want to support intersection types?
  • Capturing scalar values. I do this but is there any use?
  • Noting down types feels like it should be in the compiler, but the visitor just has env. So I keep track of variables in the env.
  • Type deduplication is still needed ClassA|ClassA generates duplicate code
  • Pseudo variables like loop have type info
  • Expression analysis of filters and more function and method calls
  • Benchmarking cache building
  • Benchmarking loading bigger cache (level 1)
  • Benchmarking rendering (level 2)

Thank you for @jkrzefski brainstorming the optimization levels
Thank you for reading and looking into this.

@stof
Copy link
Member

stof commented Dec 5, 2023

I would say that level 2+ should not exist as they break templates (and falling back to null will break things silently).

And the big question is whether level 1 can be implemented without any behavior changes for template authors:

  • the sandbox must still work
  • the nice error messages when using undefined stuff must still exist

@JoshuaBehrens
Copy link
Author

@stof thank you for your quick reaction to this ❤️ I am not yet aware of all your features you have as I am a Twig user, that is "quite far away" of using it directly. I just place Twig files in a folder and the rest happens. Is there more documentation regarding sandboxing than in the docs folder to read upon? And how do I get nice error messages? I might be spoiled by Symfony Twig Bundle features, that I likely cannot distinguish from Twig alone.

I must admit, that I am looking forward to be able to reach level 2 to have a smaller compiled Twig cache and level 3 to not break silently. From my point of view I use Twig because of the inheritance, overrides features and auto escaping. Otherwise I would use PHP instead as it is inherently a good templating engine. If I can get rid of the twig_get_attribute at all, because all the necessary information are given at compile time, I think we can achieve a better compile result overall. I see, that this does not fit every application. I've been faking objects with arrays with in includes and sets sometimes and this would fail. But this would be okay to me because then I would be okay with it to "slow" as a compromise. Not "slow by design".

Don't get me wrong: Twig is not that slow. It just happened, that I was able to make it slow likely with lots of files, blocks, nested objects and sets.

@JoshuaBehrens JoshuaBehrens force-pushed the feature/typed-expression-prediction branch from 6fa36e8 to bbb2b43 Compare March 24, 2024 01:02
@JoshuaBehrens JoshuaBehrens force-pushed the feature/typed-expression-prediction branch from bbb2b43 to c6c29b5 Compare March 24, 2024 01:15
@drjayvee
Copy link
Contributor

Interesting!

I've actually asked the maintainer of the Symfony Support plugin for PhpStorm to change the auto-complete to default to long form. My reason was to make it easier to find and refactor callers of class methods/properties. But I see that performance could also be improved if the Twig compiler bypassed twig_get_attribute and went straight to a method call or property access.

I'm planning for TwigStan (static code analysis for Twig templates) to warn about invalid code, but maybe it would be worth it to add a notice when code uses a short-hand.

@JoshuaBehrens
Copy link
Author

Hi @drjayvee thank you for reaching out. I see TwigStan to be a good place for it because I do not want to pull in phpstan capabilities into Twig core. But having it would make the code more performant. Having the exact notation for array and method call improves but is difficult to handle for third party templates therefore I try to improve the core instead of everybody else.

Fun thing I recently discussed with @keulinho and we were both confused why Haehnchen/idea-php-symfony2-plugin#2024 (comment) already worked although we were pretty sure it did not :D

@drjayvee
Copy link
Contributor

drjayvee commented Apr 3, 2024

Fun thing I recently discussed with @keulinho and we were both confused why Haehnchen/idea-php-symfony2-plugin#2024 (comment) already worked although we were pretty sure it did not :D

Actually, it turns out I was wrong! All the more reason for TwigStan to check whether properties/methods actually exist.

Unfortunately, it's not looking too good for PR 4009. :/

@JoshuaBehrens
Copy link
Author

Unfortunately, it's not looking too good for PR 4009. :/

Due to the CST vs AST discussion?

@drjayvee
Copy link
Contributor

drjayvee commented Apr 4, 2024 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants