Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Methods for converting mixed values to specific data types #17295

Merged
merged 15 commits into from
Jan 9, 2024

Conversation

jozefgrencik
Copy link
Contributor

This PR introduces a set of methods for converting mixed values to string/int/bool data types, ensuring a type-safe approach to data manipulation. This utility is useful for safely narrowing down the data types.

Look at the discussion:
#17177 (comment)

https://en.wikipedia.org/wiki/IEEE_754
https://www.h-schmidt.net/FloatConverter/IEEE754.html https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Number/MAX_SAFE_INTEGER

This PR introduces a set of methods for converting mixed values to string/int/bool data types, ensuring a type-safe approach to data manipulation.
This utility is useful for safely narrowing down the data types.

cakephp#17177 (comment)

https://en.wikipedia.org/wiki/IEEE_754
https://www.h-schmidt.net/FloatConverter/IEEE754.html
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Number/MAX_SAFE_INTEGER
@jozefgrencik
Copy link
Contributor Author

  • The class name is derived from the native PHP function filter_var().
  • It hasn't been tested how the functions will behave on 32-bit systems. Does CakePhp support 32-bit systems?

Notes on Unconventional Code:

Method: toString

bool -> string:

  • Handling of true is same as (string)true === '1'
  • Handling of false differs from standard PHP typecasting: (string)false === '', but toString(false) === '0' ('0' as opposed to '1').

float -> string:

  • Handling of NaN differs from standard PHP typecasting: (string)NaN === 'NAN', but toString(NaN) === null
  • Handling of (+/-)infinity differs from standard PHP typecasting: (string)INF === 'INF', but toString(INF) === null.
  • For other floats, attempts are made to convert via json_encode(), and if unsuccessful, then through sprintf(). For sprintf, PHP_FLOAT_DIG + 3 is used as the number of decimal places. Further experimentation may be needed to determine the optimal value.

object -> string:

  • For objects, is used similar approach like is in h() function in functions.php.

Method: toInt

float -> int:

Failed asserting that two strings are identical.
--- Expected
+++ Actual
@@ @@
-'10:10 AM'
+'10:10 AM'
@dereuromark dereuromark added this to the 5.0.1 milestone Sep 20, 2023
@ADmad ADmad changed the base branch from 5.x to 5.next September 21, 2023 02:49
Copy link
Member

@markstory markstory left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the direction this is headed. Nice work on the tests.

src/Utility/Filter.php Outdated Show resolved Hide resolved
src/Utility/Filter.php Outdated Show resolved Hide resolved
* Redistributions of files must retain the above copyright notice.
*
* @copyright Copyright (c) Cake Software Foundation, Inc. (https://cakefoundation.org)
* @since 5.0.1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
* @since 5.0.1
* @since 5.1.0

Patch releases only contain bug fixes, new features need to go in the next minor release.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thx. Fixed in 01ec52d

@dereuromark dereuromark modified the milestones: 5.0.1, 5.1.0 Sep 21, 2023
@@ -325,3 +326,119 @@ function deprecationWarning(string $version, string $message, int $stackFrame =
trigger_error($message, E_USER_DEPRECATED);
}
}

if (!function_exists('Cake\Core\toString')) {
Copy link
Member

@ADmad ADmad Sep 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check is not necessary, no one else should be defining this namespaced function. Such checks are necessary only when defining global functions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thx. Fixed in 112b6d8

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are these checks necessary for the remaining functions in functions.php?

Comment on lines +356 to +359
try {
$return = json_encode($value, JSON_THROW_ON_ERROR);
} catch (JsonException) {
$return = null;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the advantages of using json_encode for a float over a simpler method like printf()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new function toString() combines the advantages of json_encode() and sprintf().

raw input (string)$value json_encode() sprintf() toString()
5.0 5 5 5 5
5.00000003 5.00000003 5.00000003 5.000000029999999818 5.00000003
55666666666.56 55666666666.56 55666666666.56 55666666666.55999755859375 55666666666.56
9223372036778.2233 9223372036778.2 9223372036778.223 9223372036778.22265625 9223372036778.223
88888888888888888888888 8.8888888888889E+22 8.888888888888889e+22 88888888888888890753024 88888888888888890753024

Test:

public function testJsonEncodeVsSprintf(): void
{
    $a = [
        5.0,
        5.00000003,
        55666666666.56,
        9223372036778.2233,
        88888888888888888888888,
    ];

    foreach ($a as $value) {
        $php = (string)$value;
        $json = json_encode($value);
        $sprint = rtrim(sprintf('%.' . (PHP_FLOAT_DIG + 3) . 'F', $value), '.0');
        $toString = toString($value);
        echo '| ' . implode(' | ', [$php, $json, $sprint, $toString]) . ' |' . PHP_EOL;
    }

    $this->markTestSkipped();
}

Comment on lines 400 to 401
* @link https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Number/MAX_SAFE_INTEGER
* 9007199254740991 = 2^53-1 = the maximum safe integer that can be represented without losing precision.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this same constraint applies to PHP. We should be able to go to 2^64-1 or PHP_INT_MAX

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this constant also has the same value in PHP. I'm attaching a test that will find the first values that are no longer correctly typecasted.

Test:

public function testMargin(): void
{
    $margin = 9007199254740991;

    $errors = 0;
    for ($i = $margin - 10_000_000; $i <= $margin + 10_000; $i++) {
        if ((int)(float)$i !== $i) {
            ++$errors;
            echo $i . PHP_EOL;
            if ($errors > 20) {
                break;
            }
        }
    }

    $this->markTestSkipped();
}

Results:

9007199254740993
9007199254740995
9007199254740997
9007199254740999
9007199254741001
9007199254741003
9007199254741005
9007199254741007
9007199254741009
9007199254741011
9007199254741013
9007199254741015
9007199254741017
9007199254741019
9007199254741021
9007199254741023
9007199254741025
9007199254741027
9007199254741029
9007199254741031
9007199254741033

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found in PHP documentation that it might not always be the same constant. It might be something else on exotic systems.

Floating point numbers have limited precision. Although it depends on the system, PHP typically uses the IEEE 754 double precision format..
https://www.php.net/manual/en/language.types.float.php

So I changed af822bb the logic to use (int)$value, but around PHP_INT_MIN/PHP_INT_MAX, it returns very strange values.
Beyond 2^53-1, there are type errors like 9007199254740993.0 ==> 9007199254740992

But I'm okay with this solution. Such large numbers are not used often. And programmers rarely take care of them.

* @return ?string Returns the string representation of the value, or null if the value is not a string.
* @since 5.1.0
*/
function toString(mixed $value): ?string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure we should be adding more global functions. When we introduced namespace functions in 5.0 and 4.5, we retained global function aliases as a way to make upgrading easier. With new functions, there is no concern around compatibility, but there could be conflicts with other libraries that define global functions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oki. removed in 24c80f7

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally prefer having these as global functions too. They are not autoloaded when using individual packages and for the app too users have the option to opt out.

Copy link
Member

@markstory markstory left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay in coming back to this. I think this is ready to merge once the few remaining redundant imports are cleaned up.

@@ -23,6 +23,9 @@
use function Cake\Core\pj as cakePj;
use function Cake\Core\pluginSplit as cakePluginSplit;
use function Cake\Core\pr as cakePr;
use function Cake\Core\toBool as cakeToBool;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we get these imports cleaned up?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done :)

markstory added a commit to cakephp/docs that referenced this pull request Jan 7, 2024
* @return ?string Returns the string representation of the value, or null if the value is not a string.
* @since 5.1.0
*/
function toString(mixed $value): ?string
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This docblock needs to describe the nuances of each conversion such as bool to a 1 and 0 including failure cases.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can rewrite the docs when this is merged.

src/Core/functions.php Outdated Show resolved Hide resolved
Co-authored-by: othercorey <corey.taylor.fl@gmail.com>
@markstory markstory merged commit a0029a3 into cakephp:5.next Jan 9, 2024
6 checks passed
@markstory
Copy link
Member

Thank you 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants