Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integer parsing: always detect overflows #544

Merged
merged 3 commits into from Jun 1, 2022
Merged

Conversation

NaN-git
Copy link
Contributor

@NaN-git NaN-git commented May 28, 2022

Fixes #440

The current logic for detecting overflows while parsing large integer only detects some overflows. Each calculation like multiplication and addition has to be verified to be in bounds because otherwise it might silently overflow.

This PR fixes overflow detection and out of bounds checks for integer numbers with large absolute value. It replaces the current bounds checking and overflow checking logic.
The implementation should be quite efficient because the checks only use additions and comparisons of 64bit integers in the parsing loop and it handles negative numbers without additional checks. Furthermore it correctly parses LLONG_MIN on 2s complement architectures, where -LLONG_MIN - 1 = LLONG_MAX should be satisfied. I haven't added a test for this because there is no guarantee that it will work on all architectures, i.e. -LLONG_MIN = LLONG_MAX is also possible. Maybe a test should be added because basically all CPUs use 2s complement representation of signed integers.

PR #543 seems to use a similar idea, but I only noticed it after writing the patch. It uses divisions in the parsing loop and I'm not sure whether it would parse all negative integers correctly.

Changes proposed in this pull request:

  • add tests which fail with the current implementation
  • fix overflow detection while parsing integer numbers
  • make error messages consistent when parsing too large or too small numbers
  • change the formatting of head of function

@codecov-commenter
Copy link

codecov-commenter commented May 29, 2022

Codecov Report

Merging #544 (0a0e111) into main (b300d64) will not change coverage.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##             main     #544   +/-   ##
=======================================
  Coverage   91.76%   91.76%           
=======================================
  Files           6        6           
  Lines        1821     1821           
=======================================
  Hits         1671     1671           
  Misses        150      150           
Impacted Files Coverage Δ
tests/test_ujson.py 99.60% <ø> (ø)
lib/ultrajsondec.c 91.66% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b300d64...0a0e111. Read the comment docs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
changelog: Fixed For any bug fixes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

UltraJSON has inconsistent behavior on parsing large integral values.
4 participants