Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Normalize specified types before intersection #1016

Conversation

herndlm
Copy link
Contributor

@herndlm herndlm commented Feb 13, 2022

Closes phpstan/phpstan#6329

This PR is improving type specification with combined BooleanAnd and BooleanIf expressions in 2 ways. Therefore the title does not match anymore :) I think something like "Improve type specification for combined && and || expressions" would fit better.

  1. It normalizes the SpecifiedTypes by removing the sureNotTypes from the relevant sureTypes which results in e.g. AccessoryNonEmptyStringType or NonEmptyArrayType. Without this, sureNotTypes might not be considered when intersecting instances of SpecifiedType, which leads to type information loss.
Details about the normalization

In order to narrow down the variable $a from mixed to non-empty-string|null the following code could (it's maybe a bit odd but this is not relevant) be used in an if (truthy context):

is_string($a) && '' !== $a || null === $a

This can be described with the expression

new Expr\BinaryOp\BooleanOr(
	new Expr\BinaryOp\BooleanAnd(
		new FuncCall(new Name('is_string'), [new Arg(new Variable('a'))])
		new NotIdentical(new String_(''), new Variable('a')),
	),
	new Identical(new Expr\ConstFetch(new Name('null')), new Variable('a')),
)

The interesting parts are the inner BooleanAnd and the outer BooleanOr. If the TypeSpecifier looks at this it will basically recursively determine the SpecifiedTypes. Without going into too much details, that the relevant people know better than me anyway, the following will be the simplified relevant sureTypes and sureNotTypes for $a of the relevant SpecifiedTypes instances:

BooleanAnd

left expression: StringType sureType
right expression: ConstantStringType with value '' sureNotType

Result

BooleanAnd uses SpecifiedTypes->unionWith for a truthy context like here which results in a new SpecifiedTypes having both of those

BooleanOr

left expression: the above union => StringType sureType, ConstantStringType with value '' sureNotType
right expression: NullType sureType

Result

=> BooleanOr uses SpecifiedTypes->intersectWith

Before

intersectWith only looks at the sureTypes and sureNotTypes of both SpecifiedTypes instances if their expression string occurs also in both. Meaning, it will union (for a truthy context like here) the StringType and NullType because both SpecifiedTypes instances have sureTypes for $a. But, it will not use the sureNotType if there is not a sureNotType in the other SpecifiedTypes for $a. Therefore it results in a SpecifiedTypes instance with a StringType|NullType sureTypes union only and basically looses the sureNotType information that is relevant for the left expression.

After

both SpecifiedTypes instances are normalized which "moves" the sureNotType into the sureType resulting in a StringType&AccessoryNonEmptyStringType intersection for the left expression basically. The other SpecifiedTypes instance is not affected here by normalization. The intersectWith logic then unionizes (for a truthy context like here) both instances resulting in a (StringType&AccessoryNonEmptyStringType)|NullType union sureType and therfore does not loose the sureNotType information which is coupled to the left expression but does not make any sense for the right one.

Without the BooleanOr the sureNotType is not lost and the MutatingScope can use it to determine that the outcome is non-empty-string. With BooleanOr we have to normalize this already here IMO before intersecting in order to not loose the information.

  1. It makes use of the MutatingScope in BooleanAnd and BooleanOr expressions to filter by the specified types of the left espression and uses the resulting mutated scope in the right expression. This ensures that information gathered on the left side can be used on the right side. See Normalize specified types before intersection #1016 (comment)

src/Analyser/TypeSpecifier.php Outdated Show resolved Hide resolved
src/Analyser/TypeSpecifier.php Outdated Show resolved Hide resolved
@herndlm herndlm force-pushed the normalize-specified-types-before-intersection branch from 1dce3de to 67282d2 Compare February 13, 2022 20:47
@herndlm herndlm force-pushed the normalize-specified-types-before-intersection branch from 67282d2 to 949e5fa Compare February 13, 2022 20:47
@herndlm
Copy link
Contributor Author

herndlm commented Feb 14, 2022

The failing tests are either unrelated or, in case of webmozart-assert, actual fixes

Marking this as ready, but I have the feeling the 2 typespecifier workarounds should be reverted for now, just let me know

@herndlm herndlm marked this pull request as ready for review February 14, 2022 06:55
@ondrejmirtes
Copy link
Member

Hi, thank you for the research. Can you please describe in more detail how does SpecifiedTypes look before and after the normalization and what does it achieve further down the line? Thanks.

@herndlm
Copy link
Contributor Author

herndlm commented Feb 14, 2022

@ondrejmirtes I wrote a short novel hidden behind the Details section in the description. I hope this explains everything.

@ondrejmirtes
Copy link
Member

Perfect explanation, I now understand the problem :) I'm gonna let it sit for a while to think about it, maybe I'll come up even with a better solution. What I'm worried about a bit is a falsy context where the meaning of sureTypes/sureNotTypes is switched and it should be doing the opposite I think. Anyway, it's now my turn to do something about it, thank you.

@herndlm
Copy link
Contributor Author

herndlm commented Feb 15, 2022

I added another commit and with that one I managed to

It makes use of the MutatingScope when handling BooleanAnd and BooleanOr by filtering with the left expression. And that mutated scope is then only used for the right expression. I think this was needed because the left expression could narrow down a mixed variable via e.g. is_string to, well, string. And that information is currently not used in the right expression which can make problems with some type extensions that ask the scope via getType and use that information internally.

I think this change makes PHPStan understand some complex true and false context conditionals that it did not understand before 🎉 At least if I'm not completely mistaken. But I'll think about it a bit more. Maybe it also makes sense to add more complex test cases somewhere

@ondrejmirtes
Copy link
Member

Are any of the new test cases and bugfixes about my concern?

What I'm worried about a bit is a falsy context where the meaning of sureTypes/sureNotTypes is switched and it should be doing the opposite I think.

@herndlm
Copy link
Contributor Author

herndlm commented Feb 15, 2022

Are any of the new test cases and bugfixes about my concern?

There were already test cases that check the type in the else statement and I added some more. And AFAIK and could see via debugger, it uses a falsy context then to create the specified types and filter by them in the scope. So, I think this should be covered :)

@ondrejmirtes
Copy link
Member

Alright, I'm gonna merge it :) Please send a fix to webmozart-assert extension afterwards, thanks.

@ondrejmirtes ondrejmirtes merged commit f255672 into phpstan:master Feb 15, 2022
@ondrejmirtes
Copy link
Member

Thank you!

@herndlm
Copy link
Contributor Author

herndlm commented Feb 15, 2022

Finally I can sleep tonight xD But seriously, this kept me thinking, busy and step debugging quite a few hours

Alright, I'm gonna merge it :) Please send a fix to webmozart-assert extension afterwards, thanks.

In order to do that I need a new phpstan release to specify as minimum supported version, I think. Shall I comment-out the two problematic lines for now and add them back in correctly after a new phpstan version was tagged?

@ondrejmirtes
Copy link
Member

Also, feel free to continue thinking about it and have it in the back of your head, it's often that people come up with improvements afterwards and I love many little PRs :) Generics were implemented over dozens of small PRs :)

@herndlm herndlm deleted the normalize-specified-types-before-intersection branch February 15, 2022 19:28
@staabm
Copy link
Contributor

staabm commented Feb 15, 2022

great job @herndlm !

@ondrejmirtes
Copy link
Member

I need a new phpstan release to specify as minimum supported version

You don't - put ^1.4.7 or whatever is the next one to composer.json and the current dev-master will get installed. It's not going to have to change after the release and the stable version will be used automatically 😊

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants