Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve constant string union handling for concat and encapsed string #2057

Merged
merged 1 commit into from Dec 19, 2022

Conversation

schlndh
Copy link
Contributor

@schlndh schlndh commented Dec 6, 2022

The current behavior of concat and encapsed string is inconsistent when it comes to constant string unions: https://phpstan.org/r/05b25994-3e9f-4509-9d42-cc936b6790ad

I wanted it to behave the same. And I would also like to be able to concat/encaps two unions together.

Motivation:
I have an extension that checks SQL queries. Most queries are just simple constant strings. But sometimes there are options to e.g. filter/order by different columns and such. So there are multiple possible SQLs. The changes introduced by this PR should make it a bit easier to write such queries in a way that allows them to be checked. E.g. I often use encapsed string instead of concat, and it can happen that there are two independent "dynamic" parts of the query (e.g. WHERE and ORDER BY, which is why I also want to be able to concat two unions).

foreach ($constantStrings as $constantString) {
if ($constantString->getValue() === '') {
$strings[] = $rightStringType;
if ($combinedConstantStringsCount > 0 && $combinedConstantStringsCount <= 16) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed that there is also CALCULATE_SCALARS_LIMIT in this class, but it was not used here, so I kept the previous limit.

@ondrejmirtes
Copy link
Member

@schlndh I really like your extension! I think it can afford to be smarter than asking the database for the types, because database will not tell you:

SELECT name FROM users
WHERE name IS NOT NULL

That name can only be string and not string|null.

But I have a suggestion - how did you implement the SQL parser? I know from my colleague it's no easy feat. He's been working on it thoroughly for a long time and he's not even finished yet: https://github.com/SQLFTW/sqlftw

In any case, I think you should use an existing parser like that, or at very least, you should extract your parser to a separate library.

assertType("'.1.'|'.2.'", ".$float.");
assertType("'..'|'.1.'", ".$bool.");
assertType("'..'|'.a.'", ".$nullable.");
assertType("'.a.0.'|'.a.1.'|'.b.0.'|'.b.1.'", ".$str.$int.");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please also provide an example where the logic goes over the limit and the type is simplified again? Thank you.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@schlndh schlndh force-pushed the improve-constant-string-unions branch from dd092dd to 0e4880e Compare December 16, 2022 15:28
@schlndh
Copy link
Contributor Author

schlndh commented Dec 16, 2022

@ondrejmirtes Thanks.

it can afford to be smarter than asking the database for the types, because database will not tell you:

At some point yes, but unfortunately it's not yet smart enough to handle the example you've given.

But I have a suggestion - how did you implement the SQL parser?

I implemented my own recursive descent parser. OFC it does not cover everything, but it's enough to get started. And since my goal is to analyze queries written by people I don't think I have to worry much about supporting 100% of MariaDB syntax. Most queries will be written in a smaller subset of SQL.

Currently, it can partially parse SELECT, UPDATE, INSERT, DELETE and TRUNCATE TABLE queries. The parser handles almost everything that the project I work on uses (at least in queries which are known statically - definitely in the hundreds). The two exceptions are STRAIGHT_JOIN and variables, both of which are used by a single query in the project, so I didn't bother to implement those yet. Besides that there are also missing features that we don't use - e.g. INSERT ... RETURNING ..., ...

But I definitely agree that it's not easy to write a reasonably compatible parser from scratch. Especially for MariaDB which has a seriously lacking documentation.

https://github.com/SQLFTW/sqlftw

I randomly discovered it recently, but I haven't had the time to look at it yet. From a brief glance it seems that they focus on MySQL, whereas my focus is on MariaDB. So I suspect that I won't be able to use it.

In any case, I think you should use an existing parser like that

I did look for existing SQL parsers when I started working on it, but the only thing I found at the time was some old library that 1) didn't support more recent features, and 2) it had a terrible AST. If a suitable MariaDB parser becomes available then I would consider switching to it. But for now I have a parser that mostly covers my needs.

or at very least, you should extract your parser to a separate library.

I currently don't have any plans to do that. Yes, strictly speaking it would be more useful for the "open-source community". But realistically nobody is going to use it, or even find out about it. So it would just be more work for me.

@ondrejmirtes
Copy link
Member

@schlndh It'd be a pity to do this work and only support MariaDB in the end. The work you're doing would also be applicable to other DB engines. It should be possible to have this functionality in PHPStan one day :)

@schlndh
Copy link
Contributor Author

schlndh commented Dec 17, 2022

@ondrejmirtes For now it's still in very early stage. We'll see how it goes.

@ondrejmirtes ondrejmirtes merged commit d28c6ce into phpstan:1.9.x Dec 19, 2022
@ondrejmirtes
Copy link
Member

Thank you!

@schlndh schlndh deleted the improve-constant-string-unions branch December 20, 2022 07:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants