Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added test to detect problems and inconsistencies in CallMap files #6247

Merged
merged 13 commits into from
Aug 8, 2021

Conversation

bitwise-operators
Copy link
Contributor

@bitwise-operators bitwise-operators commented Aug 5, 2021

I came across several functions where the 8.0 changes had been added to the Delta file, but not the main CallMap.
Since it was a significant number, I wrote a simple test to detect those conflicts, as well as several other possible errors with the CallMap files.

Of course, after writing the test, I fixed the errors it found, and in doing so, found several other things that were missing or faulty, mostly in the 8.0 changes:

  • A lot of the changes to the multibyte string, xml_parser and bcmath functions were missing
  • Added several functions that stopped being falsey and started throwing Exceptions in 8.0 (there are probably more of these, as they are badly documented)
  • Pecl sodium functions haven't been available since sodium became core in 7.2
  • Removed all but one of the date_* functions from the 8.0 delta. All were marked as having lost the option to return false, despite this only being true for date_format, as far as I can see.

Testdox for the new test:

Call Map (Psalm\Tests\Internal\CallMap)
[x] Dictionary path must be a readable directory
[x] Delta files contain added changed and removed sections
[x] Main callmap file exists
[x] Main callmap file contains a callmap
[x] Callmap keys are strings and values are signatures
[x] Functions can not be in more than one section
[x] Signature keys are zero or string and values are types
[x] Types are parsable
[x] Changed and removed functions must exist
[x] Functions removed in delta files are absent from main callmap
[x] Existing functions can not be added
[x] Functions added in delta files are present in main callmap
[x] Outgoing signatures must match most recent incoming signatures
[x] Main callmap signatures must match most recent incoming signatures

- Fixed inconsistencies in callmap delta files (mainly 8.0)
- Added several missing changes in the bc_* functions
- Added several missing changes in the mb_* functions
- Added several missing changes in the xml_parser_* functions
- Fixed several issues with sodium_ functions
- Removed all but one of the date_* functions. All were marked as having lost the option to return false, despite this only being true for date_format
@orklah
Copy link
Collaborator

orklah commented Aug 5, 2021

It's totally optional but something that could be great is to ensure that parameter names are consistent from PHP 8.0 going backward (it's hard to hard code this rule after PHP 8.0 in case of BC breaks). We know that PHP 8.0 is the only version where parameter names matter, and they were not great before so we may as well keep consistency

@bitwise-operators
Copy link
Contributor Author

You mean that the 7.x deltas are updated to reflect the names as they are in 8.0? I've looked at that. In some cases, I changed the old names to the new ones, but since it doesn't really matter, I didn't waste too much time on it. Most of the names actually reflect the names as they were in the PHP docs at the time (I checked in their Git history), so there is something to say for historical accuracy :-P

The problem is that it really only applies to functions that have been changed multiple times during the period 7.1 => 8.0, as otherwise, it's already covered by the testMainCallmapSignaturesMustMatchMostRecentIncomingSignatures() test.

Like you said, enforcing for versions from 8.0 onward would be problematic if, in the future, parameter names change,(which we would have to support)

@orklah
Copy link
Collaborator

orklah commented Aug 5, 2021

That's what I meant yeah, but you're right, it's nitpicking at this point

@bitwise-operators
Copy link
Contributor Author

Like I said, I looked into it, and changing the names wouldn't be too much work if that's something we want.

Something completely different: I'm looking at that ci/circleci test that's failing. If I'm not mistaken, that's a false negative.

the bcmath functions officially only accept strings as inputs, not numerics:
https://psalm.dev/r/81b102f318

But with PHP's type juggling, and the fact that strict-types is still not in common use, I can understand if we want to allow numerics here as well. But it would be technically wrong.

@psalm-github-bot
Copy link

I found these snippets:

https://psalm.dev/r/81b102f318
<?php
declare(strict_types=1);
echo bcdiv(6, '3');
Psalm output (using commit 019b358):

ERROR: InvalidScalarArgument - 3:12 - Argument 1 of bcdiv expects numeric-string, 6 provided

@orklah
Copy link
Collaborator

orklah commented Aug 5, 2021

About the numeric, yeah, I already made a PR in that repo not so long ago to change the numeric into int|float somewhere else so I think we can cast this one aside.

However, I'm more worried about another error in circleci:

ERROR: UndefinedFunction - ../../src/Psl/Math/base_convert.php:59:27 - Function bcadd does not exist (see https://psalm.dev/021)
$result_decimal = bcadd($result_decimal, bcmul((string)$digit_numeric, $place_value));

@weirdan
Copy link
Collaborator

weirdan commented Aug 5, 2021

Yeah, bcadd being missing looks wrong.

@bitwise-operators
Copy link
Contributor Author

I found the cause. Looks like I added the bcadd changes to the 'new' section of the 8.0 delta, but not the 'old' section.

So php 7.4 suddenly doesn't have bcadd.
I'll go and fix that...

# Conflicts:
#	dictionaries/CallMap_80_delta.php
@bitwise-operators
Copy link
Contributor Author

It does, however, raise the point that this is an error the test doesn't yet catch, which it certainly should.

I see two options: add a CallMap_00_delta.php file, which contains all the pre-7.1 signatures, as a baseline. Psalm itself will never use it, since InternalCallMapHandler caps delta-parsing at 71 through the LOWEST_AVAILABLE_DELTA constant.
But the test will find it, and automatically prevent this kind of error.

The second option would be to integrate that baseline array into the test.

I strongly prefer the first option, as the second would require changing the test if one of the old signatures turns out to be incorrect.

Opinions?

@orklah
Copy link
Collaborator

orklah commented Aug 5, 2021

I agree that the first option seems better. However, I wouldn't want to afraid new contributors with too much things in here (callmap changes is often an entry point for new contributors). So as long as the filename is clear on what the file does (and maybe some documentation on top too), I'm ok with it

@weirdan
Copy link
Collaborator

weirdan commented Aug 5, 2021

I think this can also be handled in a more developer-friendly way, at the cost of some duplication. E.g. if we had, in addition to the new and old lists that we already have, another two list<string> entries that would list added and removed functions/methods.

@bitwise-operators
Copy link
Contributor Author

But wouldn't that just compound to the problem? A dev would need to update even more arrays, increasing the chances for errors.

And we'd still need a test to check if both arrays are correct, so a baseline list would still be necessary.

Or am I misunderstanding you?

To prevent similar errors, add baseline file with signatures pre-7.1, to help verify delta files are correct.
Modify test to use this to detect missing 'old' entries
@bitwise-operators
Copy link
Contributor Author

I've generated a 00_delta file, and modified the test slightly so it will detect errors of the type I made. (adding a modified function to 'new' without adding it to 'old')

More tomorrow.

@weirdan
Copy link
Collaborator

weirdan commented Aug 5, 2021

I would say it's fairly low effort to add one entry for added or removed function and the chance to make mistakes in two places is pretty slim, as long as discrepancies are checked by an automated test (which would be easy to add). Also it's easier to catch issues during code review, and it's easy to cross-check deltas with PHP changelogs this way.

It's not bulletproof, but this way seems more manageable to me.

@bitwise-operators
Copy link
Contributor Author

Can you give an example of how you imagine this?

Would it essentially be two extra array elements with the contents of array_keys($old) and array_keys($new) ?

I've updated the testdox in the PR description, note the two new lines:

[x] Existing functions present in new section must be present in old
[x] Functions removed twice in delta files must be added first

Wouldn't that tell a dev all he needs to know?

@weirdan
Copy link
Collaborator

weirdan commented Aug 6, 2021

I had something like this in mind:

<?php
return [
  'added' => [
     'ZipArchive::replaceFile',
  ],
  'removed' => [
     'image2wbmp',
  ],
  'old' => [
     'image2wbmp' => ['bool', 'image' => 'resource', 'filename=' => 'string', 'foreground=' => 'int'],
     'imagecolorat' => ['int|false', 'image' => 'resource', 'x' => 'int', 'y' => 'int'],
  ],
  'new' => [
     'ZipArchive::replaceFile' => ['bool', 'filepath' => 'string', 'index' => 'string', 'start=' => 'int', 'length=' => 'int', 'flags=' => 'int'],
     'imagecolorat' => ['int|false', 'image' => 'GdImage', 'x' => 'int', 'y' => 'int'],
  ]
];

but now looking at it I think this is still suboptimal.

What if we instead had it like this:

<?php
return [
  'added' => [
     'ZipArchive::replaceFile' => ['bool', 'filepath' => 'string', 'index' => 'string', 'start=' => 'int', 'length=' => 'int', 'flags=' => 'int'],
  ],
  'changed' => [
     'imagecolorat' => [
         'old' => ['int|false', 'image' => 'resource', 'x' => 'int', 'y' => 'int'],
         'new' => ['int|false', 'image' => 'GdImage', 'x' => 'int', 'y' => 'int'],
     ],
  ],
  'removed' => [
     'image2wbmp' => ['bool', 'image' => 'resource', 'filename=' => 'string', 'foreground=' => 'int'],
  ],
];

It would make additions and removals explicit, and changes easier to add and review, as you would have old and new signatures next to each other.

It would also prevent the error you had, as 'changed' entries would have a format different to those in 'added'. And all the checks you added that did not require delta_00 would still be possible to implement.

@bitwise-operators
Copy link
Contributor Author

I agree that the second option would be a nice improvement. It is very clear (even to new people) what is going on, and leaves very little room for error.

I'll have a look at implementing it over the weekend.

@bitwise-operators bitwise-operators marked this pull request as draft August 6, 2021 12:00
bjscharp added 2 commits August 8, 2021 10:39
Convert deltafile format to new style proposed by weirdan
Modify CallMapTest to use new format
Modify InternalCallMapHandler to use new format
Move assertions to base testcase
@bitwise-operators bitwise-operators marked this pull request as ready for review August 8, 2021 09:00
@bitwise-operators
Copy link
Contributor Author

Ok, I've implemented the following changes:

  • Added a simple test to verify the output of InternalCallMapHandler::getCallMap() (mostly as a sanity check for myself to make sure I didn't screw anything up).
  • To facilitate that, I moved the assertions I originally added to the CallMapTest over to the root TestCase.
  • Modified the CallMapTest I wrote earlier to deal with the new format proposed by @weirdan . I added some extra checks to make sure changed functions are actually in the changed section and similar things
  • Converted the delta files to the new format
  • Modified the InternalCallMapHandler::getCallMap() method to read the new delta file format

Right now, it passes all tests but one, and that's a just a warning for an unneeded annotation in the PSL code. Since I didn't even change the hash_password() function, I don't think it's a show-stopper.

Copy link
Collaborator

@weirdan weirdan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we were to drop delta_00, what assurances we'd lose?

dictionaries/CallMap_80_delta.php Outdated Show resolved Hide resolved
@bitwise-operators
Copy link
Contributor Author

I fixed the two docblock issues you mentioned.

If we were to drop delta_00, what assurances we'd lose?

Two types of error:

  • The ability to detect errors similar to the one I made earlier: A changed function being placed in the 'added' section, instead of the 'changed' section. Since this would lead to the function being absent from all versions older than the delta containing the error, I feel it's pretty important to check for it.
  • The ability to verify if the 'removed' entry or the 'old' section for a change to a long-existing function is correct. For example: 8.0 changed a lot of old, frequently used functions like htmlentities() that haven't been changed during the 7.x lifecycle. If the changed.old entry for that change is incorrect, all 7.x versions would use an incorrect signature. The same goes for a removed function like create_function(). An error there would lead to a wrong version being used in all 7.x verions.

The 00 delta helps detect those errors. The only way to introduce one of these errors, would be to ALSO edit the 00 delta, which would be easier to spot in a PR, as it is clearly marked as generally not needing modification.

Changed the format of the historical callmap to be plain signature list,
as it isn't really a delta file and renamed the file to not be confused
with deltas.

Added a test to ensure entries in main callmap have history.
@weirdan weirdan force-pushed the callmap-delta-inconsistencies branch from 45029fd to 8c531fe Compare August 8, 2021 15:43
Copy link
Collaborator

@weirdan weirdan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as tests are concerned I think we're good. @bitwise-operators did you use some script to convert deltas to the new format?

@bitwise-operators
Copy link
Contributor Author

@bitwise-operators did you use some script to convert deltas to the new format?

Yeah, but nothing special. A couble of array_diff_key and array_intersect_key calls and some code to write it back to the file.

@weirdan weirdan merged commit fdfed11 into vimeo:master Aug 8, 2021
@weirdan
Copy link
Collaborator

weirdan commented Aug 8, 2021

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants