New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix wrap around issues on large numerics #543
Conversation
There seemed to be a mistake on LLONG_MAX vs ULLONG_MAX (the latter is the maximum value for an object of type `unsigned long long int` which JSUINT64 should be). I fixed that. I am unsure about how much performance is lost on the check for overflow on digit check...
for more information, see https://pre-commit.ci
Codecov Report
@@ Coverage Diff @@
## main #543 +/- ##
==========================================
+ Coverage 91.76% 91.77% +0.01%
==========================================
Files 6 6
Lines 1821 1824 +3
==========================================
+ Hits 1671 1674 +3
Misses 150 150
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bear with me here - this stuff makes my head hurt...
@@ -169,7 +171,7 @@ static FASTCALL_ATTR JSOBJ FASTCALL_MSVC decode_numeric (struct DecoderState *ds | |||
} | |||
else if (intNeg == -1) | |||
{ | |||
return SetError(ds, -1, overflowLimit == LLONG_MAX ? "Value is too big!" : "Value is too small"); | |||
return SetError(ds, -1, overflowLimit == LLONG_MAX ? "Value is too big" : "Value is too small"); | |||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the overflowLimit == LLONG_MAX ? ...
switch not redundant because of the if (intNeg == 1)
and else if (intNeg == -1)
conditions above it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(P.S. If we're ditching !
s in error messages then there's that one still lurking on line 170.)
@@ -131,15 +131,17 @@ static FASTCALL_ATTR JSOBJ FASTCALL_MSVC decode_numeric (struct DecoderState *ds | |||
case '8': | |||
case '9': | |||
{ | |||
// detect overflow on digit shift | |||
if ((intValue * 10ULL) % 10 != 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would if (intValue > ULLONG_MAX / 10)
not be equivalent? It's less headache inducing and probably fractionally more performant since the ULLONG_MAX / 10
will be evaluated at compile time leaving just the >
operation at runtime.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think that's functionally identical. In both cases, an intValue
of ULLONG_MAX / 10 = 1844674407370955161
would pass the check, as it should since it does not overflow on the multiplication. It might still overflow on the addition if the next digit is 6 to 9, but that would be caught in the check below. And intValue = ULLONG_MAX / 10 + 1
would fail this added check in both forms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likewise, integer arithmetic is bad enough as it is, but decoding from a string definitely induces headaches.
{ | ||
hasError = 1; | ||
} | ||
else if (intNeg == -1 && intValue > overflowLimit) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn't the removal of this check mean that there is no test against the smaller overflowLimit
of LLONG_MAX
for negative numbers anymore? Then we could get an overflow further down in the BREAK_INT_LOOP
section.
@pytest.mark.parametrize( | ||
"test_input", | ||
[ | ||
("33333333303333333333"), | ||
("18446744073709551616"), # 64 bit | ||
("-18446744073709551616"), # 64 bit | ||
("-80888888888888888888"), | ||
], | ||
) | ||
def test_decode_big_numeric(test_input): | ||
with pytest.raises(ujson.JSONDecodeError): | ||
ujson.loads(test_input) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be merged with test_decode_raises
, which already has some tests for big ints. I have no strong opinion on whether it should all be in one test or not, though I tend towards yes (since the test code is basically identical, just with different input).
I'll close this PR and recommend looking at #544 instead. |
There seemed to be a mistake on LLONG_MAX vs ULLONG_MAX (the latter is
the maximum value for an object of type
unsigned long long int
whichJSUINT64 should be). I fixed that.
I am unsure about how much performance is lost on the check for overflow
on digit check...
Fixes #440
Changes proposed in this pull request: