prevent long strings as int inputs #4480

Merged
merged 3 commits into from Sep 5, 2022
3 changes: 3 additions & 0 deletions changes/1477-samuelcolvin.md
@@ -0,0 +1,3 @@
Prevent long (length > `4_300`) strings/bytes as input to int fields, see
[python/cpython#95778](https://github.com/python/cpython/issues/95778) and
[CVE-2020-10735](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-10735)
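
Whether this extra check is still needed depends on the interpreter. A minimal sketch, assuming only the standard library: `sys.get_int_max_str_digits()` is available on Python 3.11+ and on the patched releases of 3.7 to 3.10, so its absence means the interpreter applies no limit of its own. The helper name `interpreter_int_digit_limit` is hypothetical and not part of this PR.

```python
import sys
from typing import Optional


def interpreter_int_digit_limit() -> Optional[int]:
    """Return CPython's own str<->int digit limit, or None if this build has none.

    A return value of 0 means the interpreter-level limit has been disabled.
    """
    getter = getattr(sys, "get_int_max_str_digits", None)
    return getter() if getter is not None else None


limit = interpreter_int_digit_limit()
if limit is None:
    print("no interpreter-level limit; an application-level length check is the only guard")
else:
    print(f"interpreter-level limit: {limit} digits (0 means disabled)")
```

On an unpatched interpreter the first branch is taken, which is the case the length check in this PR is meant to cover.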
13 changes: 13 additions & 0 deletions pydantic/validators.py
@@ -120,10 +120,23 @@ def bool_validator(v: Any) -> bool:
    raise errors.BoolError()


# matches the default limit in CPython, see https://github.com/python/cpython/pull/96500
max_str_int = 4_300


def int_validator(v: Any) -> int:
    if isinstance(v, int) and not (v is True or v is False):
        return v

    # see https://github.com/pydantic/pydantic/issues/1477 and in turn, https://github.com/python/cpython/issues/95778
    # this check should be unnecessary once patch releases are out for 3.7, 3.8, 3.9 and 3.10
    # but better to check here until then.
    # NOTICE: this does not fully protect user from the DOS risk since the standard library JSON implementation
    # (and other std lib modules like xml) use `int()` and are likely called before this, the best workaround is to
    # 1. update to the latest patch release of python once released, 2. use a different JSON library like ujson
    if isinstance(v, (str, bytes, bytearray)) and len(v) > max_str_int:
@cmyui (Contributor) commented on Sep 5, 2022:

maybe overkill, but should this include memoryview as well?

from pydantic import BaseModel


class A(BaseModel):
    x: int

A(x=memoryview(b"1" * 5000))

this seems like a much less common use case, but it could perhaps still be used as a DoS vector in some cases

$ python -m timeit -s 'x=b"1" * 100_000' 'int(memoryview(x))'
5 loops, best of 5: 59.6 msec per loop

I can make a PR to address this

        raise errors.IntegerError()

    try:
        return int(v)
    except (TypeError, ValueError, OverflowError):
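
The check in this hunk can be sketched outside pydantic as follows. This is an illustration under stated assumptions, not the merged code: `guarded_int`, `input_size` and `MAX_STR_INT` are hypothetical names, a plain `ValueError` stands in for `errors.IntegerError`, and the `memoryview` branch reflects the suggestion in the review comment above, which is not part of this diff.

```python
from typing import Any

MAX_STR_INT = 4_300  # same value as max_str_int in the diff


def input_size(v: Any) -> int:
    """Size in characters or bytes of a str or bytes-like value, else 0."""
    if isinstance(v, memoryview):
        # len() on a memoryview counts first-dimension items, so use nbytes
        return v.nbytes
    if isinstance(v, (str, bytes, bytearray)):
        return len(v)
    return 0


def guarded_int(v: Any) -> int:
    """Hypothetical stand-in for int_validator: check the length, then parse."""
    if isinstance(v, int) and not isinstance(v, bool):
        return v
    # Reject oversized input before int() gets a chance to parse it.
    if input_size(v) > MAX_STR_INT:
        raise ValueError("value is not a valid integer")
    return int(v)
```

The length test runs before `int()` is called, so an oversized input costs one `len()` or `nbytes` lookup rather than a full parse.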
35 changes: 35 additions & 0 deletions tests/test_edge_cases.py
@@ -2038,3 +2038,38 @@ class Custom:
        __fields__ = True

    assert not issubclass(Custom, BaseModel)


def test_long_int():
    """
    see https://github.com/pydantic/pydantic/issues/1477 and in turn, https://github.com/python/cpython/issues/95778
    """

    class Model(BaseModel):
        x: int

    assert Model(x='1' * 4_300).x == int('1' * 4_300)
    assert Model(x=b'1' * 4_300).x == int('1' * 4_300)
    assert Model(x=bytearray(b'1' * 4_300)).x == int('1' * 4_300)

    too_long = '1' * 4_301
    with pytest.raises(ValidationError) as exc_info:
        Model(x=too_long)

    assert exc_info.value.errors() == [
        {
            'loc': ('x',),
            'msg': 'value is not a valid integer',
            'type': 'type_error.integer',
        },
    ]

    too_long_b = too_long.encode('utf-8')
    with pytest.raises(ValidationError):
        Model(x=too_long_b)
    with pytest.raises(ValidationError):
        Model(x=bytearray(too_long_b))

    # this used to hang indefinitely
    with pytest.raises(ValidationError):
        Model(x='1' * (10**7))