Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

_pydecimal: int to and from Decimal conversion fails for large ints #96589

Closed
mdickinson opened this issue Sep 5, 2022 · 9 comments
Closed
Labels
type-bug An unexpected behavior, bug, or error

Comments

@mdickinson
Copy link
Member

Bug report

Since the fix for #95778 was merged, conversions from large ints to Decimal instances fail, but only for the Python version of the decimal module. The C version is unaffected.

Here's the _pydecimal behaviour: conversion large-int -> Decimal raises; conversion from Decimal back to int does not:

mdickinson@lovelace cpython % ./python.exe
Python 3.12.0a0 (heads/main:a9d58feccf, Sep  5 2022, 17:17:03) [Clang 13.1.6 (clang-1316.0.21.2.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from _pydecimal import Decimal
>>> d = Decimal(10**10000)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/mdickinson/Repositories/python/cpython/Lib/_pydecimal.py", line 602, in __new__
    self._int = str(abs(value))
                ^^^^^^^^^^^^^^^
ValueError: Exceeds the limit (4300) for integer string conversion
>>> n = int(Decimal(10)**10000)  # works as expected

Given that almost no CPython users should be using _pydecimal, this probably doesn't matter much, but it seemed worth recording.

@mdickinson mdickinson added the type-bug An unexpected behavior, bug, or error label Sep 5, 2022
@mdickinson
Copy link
Member Author

Any objections to closing this as "won't fix"?

@mdickinson
Copy link
Member Author

And of course the conversion fails in both directions, not just one. Here's the Decimal to int conversion. We temporarily disable the limit to create the high-precision Decimal in the first place.

>>> from _pydecimal import Decimal
>>> n = 10**100000
>>> import sys
>>> sys.set_int_max_str_digits(0)
>>> d = Decimal(n)
>>> sys.set_int_max_str_digits(4300)
>>> m = int(d)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/mdickinson/Repositories/python/cpython/Lib/_pydecimal.py", line 1639, in __int__
    return s*int(self._int)*10**self._exp
             ^^^^^^^^^^^^^^
ValueError: Exceeds the limit (4300) for integer string conversion: value has 100001 digits

@mdickinson mdickinson changed the title _pydecimal: int to Decimal conversion fails for large ints _pydecimal: int to and from Decimal conversion fails for large ints Sep 5, 2022
@serhiy-storchaka
Copy link
Member

Decimal(10**1000000) with the C implementation is slow. Should not it rather raise an exception?

@mdickinson
Copy link
Member Author

Should not it rather raise an exception?

Perhaps. I'm already very uncomfortable/unhappy with this solution to the str<->int conversion problem, though, and I'm wondering where it ends. I'd much prefer not to take any similar action for int <-> Decimal conversions unless there's a demonstrated and plausible security problem.

@gpshead
Copy link
Member

gpshead commented Sep 5, 2022

This _pydecimal incompatibility was discovered during development over on our private PSRT PR. _pydecimal is not used by CPython users and exists solely as a reference implementation useful for prototyping or for use on other non-CPython implementations so we decided not to worry about it.

If we had context based limits instead of interpreter-wide limit the _pydecimal implementation could just disable it during its own operations without impacting other threads. It could look something like:

with sys.adjusted_int_max_str_digits_context(0):  # This API does not exist.
    return s*int(self._int)*10**self._exp

But even if that existed, it isn't satisfactory.

As far as denial of service via Decimal itself (the C version), that can be its own issue if anyone cares. It seems unlikely for anyone to be using Decimal in situations where untrusted data exists. Unlike int.

I do believe fancier base conversion algorithms as discussed in #90716 could be used within all implementations of decimal.

I'm closing this as Won't Fix as suggested, thanks for filing it, it is good to record the problem, decision, and reasoning.

@gpshead gpshead closed this as completed Sep 5, 2022
@gpshead gpshead closed this as not planned Won't fix, can't repro, duplicate, stale Sep 5, 2022
@stevendaprano
Copy link
Member

ValueError: Exceeds the limit (4300) for integer string conversion

I can't find any public record of this vulnerability so its impossible to tell how critical it is. But it doesn't sound like it is critical. Conversion of large int <-> str has existed forever in Python, since 1.5 at least. And we've had "multiple [reports] by a few different people since early 2020" with little or no action for over two years, until bang just like that we've pushed out a major functional regression with no obvious public discussion and no PEP in less than a month.

Why the secrecy? It's not like "pass a gigantic string to the application and see if you can DOS it or overflow a buffer" is an unknown attack technique.

I get it that this is a real vulnerability that needs to be fixed, but it's a real vulnerability which has been possible for two decades or more. Similar issues have been public knowledge for at least four years and discussed widely. The security team has known about it for over two years.

It would have been polite and considerate if the people who are going to be affected by the fix could have had a chance to discuss the functional changes to the language itself and not just presented with a fait accompli. There's a reason why we're supposed to discussing breaking changes to the language. It's only by chance that I noticed this ticket.

I'm not just having a rant here to get this off my chest. I'd like to see some genuine discussion about the functional issues caused by this fix, while 3.11 is still in beta and we're not stuck with the initial design. Maybe this ticket isn't the right place, but I have to start somewhere. It's only by chance that I noticed this ticket. Maybe it should go to the Python-Dev mailing list? I don't know.

Thanks for listening.

@arhadthedev
Copy link
Member

arhadthedev commented Sep 6, 2022

3.11 is still in beta

It's a release candidate (according to the schedule) so the exception is the least bug-prone solution until a possible architectural rework in 3.12.

By the way, yesterday's 3.11rc2 is already delayed until all newfound 3.11 release blockers are fixed. That's why new features are prohibited in betas and release candidates, because alphas already introduced enough bugs that delayed 3.11b4 for 14 days, and 3.11rc1 for three days after a weekend crunch threatening shifting the final release for a month.

@gpshead
Copy link
Member

gpshead commented Sep 6, 2022

The proper place for further public discussion is discuss.python.org. Not buried here.

@oscardssmith
Copy link

Why not? Github is a very normal place to discuss regressions in software. This is a breaking regression that has been included in a patch release, so IMO, it really needs to be reverted.

@python python locked and limited conversation to collaborators Sep 8, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
type-bug An unexpected behavior, bug, or error
Projects
None yet
Development

No branches or pull requests

6 participants