Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[3.10] gh-95778: Correctly pre-check for int-to-str conversion (GH-96537) #96563

Merged
merged 1 commit into from Sep 4, 2022

Conversation

gpshead
Copy link
Member

@gpshead gpshead commented Sep 4, 2022

Converting a large enough int to a decimal string raises ValueError as expected. However, the raise comes after the quadratic-time base-conversion algorithm has run to completion. For effective DOS prevention, we need some kind of check before entering the quadratic-time loop. Oops! =)

The quick fix: essentially we catch most values that exceed the threshold up front. Those that slip through will still be on the small side (read: sufficiently fast), and will get caught by the existing check so that the limit remains exact.

The justification for the current check. The C code check is:

max_str_digits / (3 * PyLong_SHIFT) <= (size_a - 11) / 10

In GitHub markdown math-speak, writing $M$ for max_str_digits, $L$ for PyLong_SHIFT and $s$ for size_a, that check is:
$$\left\lfloor\frac{M}{3L}\right\rfloor \le \left\lfloor\frac{s - 11}{10}\right\rfloor$$

From this it follows that
$$\frac{M}{3L} &lt; \frac{s-1}{10}$$
hence that
$$\frac{L(s-1)}{M} &gt; \frac{10}{3} &gt; \log_2(10).$$
So
$$2^{L(s-1)} &gt; 10^M.$$
But our input integer $a$ satisfies $|a| \ge 2^{L(s-1)}$, so $|a|$ is larger than $10^M$. This shows that we don't accidentally capture anything below the intended limit in the check.

Co-authored-by: Gregory P. Smith [Google LLC] greg@krypto.org
(cherry picked from commit b126196)

Co-authored-by: Mark Dickinson dickinsm@gmail.com

…ythonGH-96537)

Converting a large enough `int` to a decimal string raises `ValueError` as expected. However, the raise comes _after_ the quadratic-time base-conversion algorithm has run to completion. For effective DOS prevention, we need some kind of check before entering the quadratic-time loop. Oops! =)

The quick fix: essentially we catch _most_ values that exceed the threshold up front. Those that slip through will still be on the small side (read: sufficiently fast), and will get caught by the existing check so that the limit remains exact.

The justification for the current check. The C code check is:
```c
max_str_digits / (3 * PyLong_SHIFT) <= (size_a - 11) / 10
```

In GitHub markdown math-speak, writing $M$ for `max_str_digits`, $L$ for `PyLong_SHIFT` and $s$ for `size_a`, that check is:
$$\left\lfloor\frac{M}{3L}\right\rfloor \le \left\lfloor\frac{s - 11}{10}\right\rfloor$$

From this it follows that
$$\frac{M}{3L} < \frac{s-1}{10}$$
hence that
$$\frac{L(s-1)}{M} > \frac{10}{3} > \log_2(10).$$
So
$$2^{L(s-1)} > 10^M.$$
But our input integer $a$ satisfies $|a| \ge 2^{L(s-1)}$, so $|a|$ is larger than $10^M$. This shows that we don't accidentally capture anything _below_ the intended limit in the check.

<!-- gh-issue-number: pythongh-95778 -->
* Issue: pythongh-95778
<!-- /gh-issue-number -->

Co-authored-by: Gregory P. Smith [Google LLC] <greg@krypto.org>
(cherry picked from commit b126196)

Co-authored-by: Mark Dickinson <dickinsm@gmail.com>
@gpshead gpshead added type-bug An unexpected behavior, bug, or error type-security A security issue release-blocker labels Sep 4, 2022
@gpshead gpshead merged commit eace09e into python:3.10 Sep 4, 2022
@gpshead gpshead deleted the backport-b126196-3.10 branch September 4, 2022 16:55
facebook-github-bot pushed a commit to facebookincubator/cinder that referenced this pull request Jan 20, 2023
…96563)

Summary:
cherry-picked the upstream 3.10 backport

```
git cherry-pick 8f0fa4b eace09e
```

this one is python/cpython#96563

original commit message below

--------

Converting a large enough `int` to a decimal string raises `ValueError` as expected. However, the raise comes _after_ the quadratic-time base-conversion algorithm has run to completion. For effective DOS prevention, we need some kind of check before entering the quadratic-time loop. Oops! =)

The quick fix: essentially we catch _most_ values that exceed the threshold up front. Those that slip through will still be on the small side (read: sufficiently fast), and will get caught by the existing check so that the limit remains exact.

The justification for the current check. The C code check is:
```c
max_str_digits / (3 * PyLong_SHIFT) <= (size_a - 11) / 10
```

In GitHub markdown math-speak, writing $M$ for `max_str_digits`, $L$ for `PyLong_SHIFT` and $s$ for `size_a`, that check is:
$$\left\lfloor\frac{M}{3L}\right\rfloor \le \left\lfloor\frac{s - 11}{10}\right\rfloor$$

From this it follows that
$$\frac{M}{3L} < \frac{s-1}{10}$$
hence that
$$\frac{L(s-1)}{M} > \frac{10}{3} > \log_2(10).$$
So
$$2^{L(s-1)} > 10^M.$$
But our input integer $a$ satisfies $|a| \ge 2^{L(s-1)}$, so $|a|$ is larger than $10^M$. This shows that we don't accidentally capture anything _below_ the intended limit in the check.

<!-- gh-issue-number: gh-95778 -->
* Issue: gh-95778
<!-- /gh-issue-number -->

(cherry picked from commit b126196)

Reviewed By: alexmalyshev

Differential Revision: D39369517

fbshipit-source-id: 750f9c3

Co-authored-by: Gregory P. Smith [Google LLC] <greg@krypto.org>
Co-authored-by: Mark Dickinson <dickinsm@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-blocker skip news type-bug An unexpected behavior, bug, or error type-security A security issue
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants