Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix wrap around issues on large numerics #543

Closed
wants to merge 4 commits into from
Closed
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
16 changes: 9 additions & 7 deletions lib/ultrajsondec.c
Expand Up @@ -96,7 +96,7 @@ static FASTCALL_ATTR JSOBJ FASTCALL_MSVC decode_numeric (struct DecoderState *ds
int chr;
char *offset = ds->start;

JSUINT64 overflowLimit = LLONG_MAX;
JSUINT64 overflowLimit = ULLONG_MAX;

if (*(offset) == 'I') {
goto DECODE_INF;
Expand Down Expand Up @@ -131,15 +131,17 @@ static FASTCALL_ATTR JSOBJ FASTCALL_MSVC decode_numeric (struct DecoderState *ds
case '8':
case '9':
{
// detect overflow on digit shift
if ((intValue * 10ULL) % 10 != 0)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would if (intValue > ULLONG_MAX / 10) not be equivalent? It's less headache inducing and probably fractionally more performant since the ULLONG_MAX / 10 will be evaluated at compile time leaving just the > operation at runtime.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think that's functionally identical. In both cases, an intValue of ULLONG_MAX / 10 = 1844674407370955161 would pass the check, as it should since it does not overflow on the multiplication. It might still overflow on the addition if the next digit is 6 to 9, but that would be caught in the check below. And intValue = ULLONG_MAX / 10 + 1 would fail this added check in both forms.

{
hasError = 1;
}

//PERF: Don't do 64-bit arithmetic here unless we know we have to
prevIntValue = intValue;
intValue = intValue * 10ULL + (JSLONG) (chr - 48);

if (intNeg == 1 && prevIntValue > intValue)
{
hasError = 1;
}
else if (intNeg == -1 && intValue > overflowLimit)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't the removal of this check mean that there is no test against the smaller overflowLimit of LLONG_MAX for negative numbers anymore? Then we could get an overflow further down in the BREAK_INT_LOOP section.

if (prevIntValue > intValue)
{
hasError = 1;
}
Expand Down Expand Up @@ -169,7 +171,7 @@ static FASTCALL_ATTR JSOBJ FASTCALL_MSVC decode_numeric (struct DecoderState *ds
}
else if (intNeg == -1)
{
return SetError(ds, -1, overflowLimit == LLONG_MAX ? "Value is too big!" : "Value is too small");
return SetError(ds, -1, overflowLimit == LLONG_MAX ? "Value is too big!" : "Value is too small!");
hugovk marked this conversation as resolved.
Show resolved Hide resolved
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the overflowLimit == LLONG_MAX ? ... switch not redundant because of the if (intNeg == 1) and else if (intNeg == -1) conditions above it?

Copy link
Collaborator

@bwoodsend bwoodsend May 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(P.S. If we're ditching !s in error messages then there's that one still lurking on line 170.)

}
goto BREAK_INT_LOOP;
Expand Down
14 changes: 14 additions & 0 deletions tests/test_ujson.py
Expand Up @@ -1055,3 +1055,17 @@ def test_decode_string_utf8():
heap = hp.heapu()
print(heap)
"""


@pytest.mark.parametrize(
"test_input",
[
("33333333303333333333"),
("18446744073709551616"), # 64 bit
("-18446744073709551616"), # 64 bit
("-80888888888888888888"),
],
)
def test_decode_big_numeric(test_input):
with pytest.raises(ujson.JSONDecodeError):
ujson.loads(test_input)
Comment on lines +1060 to +1071
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be merged with test_decode_raises, which already has some tests for big ints. I have no strong opinion on whether it should all be in one test or not, though I tend towards yes (since the test code is basically identical, just with different input).