Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: Unexpected behaviour: parse_numbers() doesn't handle space as a grouping symbol, however, parse_decimal() does #1061 #1062

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
61 changes: 60 additions & 1 deletion babel/numbers.py
Original file line number Diff line number Diff line change
Expand Up @@ -1001,6 +1001,7 @@ def __init__(self, message: str, suggestions: list[str] | None = None) -> None:
def parse_number(
string: str,
locale: Locale | str | None = LC_NUMERIC,
strict: bool = False,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add the new argument into the docstring. strict mode should probably be the default here, and the new behavior (which allows more loose strings to be parsed) should be opt-in.

*,
numbering_system: Literal["default"] | str = "latn",
) -> int:
Expand All @@ -1010,6 +1011,8 @@ def parse_number(
1099
>>> parse_number('1.099', locale='de_DE')
1099
>>> parse_number('1 099', locale='ru')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the intent is to test a string that has a non-breaking space character, make it obvious:

Suggested change
>>> parse_number('1 099', locale='ru')
>>> parse_number('1\xa0099', locale='ru')

1099

When the given string cannot be parsed, an exception is raised:

Expand All @@ -1027,11 +1030,67 @@ def parse_number(
:raise `UnsupportedNumberingSystemError`: if the numbering system is not supported by the locale.
"""
try:
return int(string.replace(get_group_symbol(locale, numbering_system=numbering_system), ''))
# Get the group symbol from the locale
group_symbol = get_group_symbol(locale, numbering_system=numbering_system)

# Replace non-breakable spaces with the group symbol in non-strict mode
if not strict and group_symbol == '\xa0' and '\xa0' not in string and ' ' in string:
string = string.replace(' ', group_symbol)

# If strict mode is enabled, handle irregular formatting
if strict and group_symbol in string:
# Your additional logic for strict mode here
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What a strange comment... 😁

locale = Locale.parse(locale)
decimal_symbol = get_decimal_symbol(locale, numbering_system=numbering_system)

try:
parsed = decimal.Decimal(string.replace(group_symbol, '')
.replace(decimal_symbol, '.'))
except decimal.InvalidOperation as exc:
raise NumberFormatError(f"{string!r} is not a valid number") from exc

proper = format_decimal(parsed, locale=locale, decimal_quantization=False, numbering_system=numbering_system)

if string != proper and proper != _remove_trailing_zeros_after_decimal(string, decimal_symbol):
try:
parsed_alt = decimal.Decimal(string.replace(decimal_symbol, '')
.replace(group_symbol, '.'))
except decimal.InvalidOperation as exc:
raise NumberFormatError(
f"{string!r} is not a properly formatted decimal number. "
f"Did you mean {proper!r}?",
suggestions=[proper],
) from exc
else:
proper_alt = format_decimal(
parsed_alt,
locale=locale,
decimal_quantization=False,
numbering_system=numbering_system,
)
if proper_alt == proper:
raise NumberFormatError(
f"{string!r} is not a properly formatted decimal number. "
f"Did you mean {proper!r}?",
suggestions=[proper],
)
else:
raise NumberFormatError(
f"{string!r} is not a properly formatted decimal number. "
f"Did you mean {proper!r}? Or maybe {proper_alt!r}?",
suggestions=[proper, proper_alt],
)

# Remove other spaces and replace the group symbol with an empty string
cleaned_string = string.replace(' ', '').replace(group_symbol, '')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

format_number is not meant to parse decimal numbers, only ever integers, anything decimal-related shouldn't be here at all.

If anything, we should check for the existence of a decimal separator in the input in strict mode, and fail if it is there.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

made the changes in the latest commit


return int(cleaned_string)
except ValueError as ve:
raise NumberFormatError(f"{string!r} is not a valid number") from ve




def parse_decimal(
string: str,
locale: Locale | str | None = LC_NUMERIC,
Expand Down
1 change: 1 addition & 0 deletions tests/test_numbers.py
Original file line number Diff line number Diff line change
Expand Up @@ -741,6 +741,7 @@ def test_parse_number():
assert numbers.parse_number('1,099', locale='en_US') == 1099
assert numbers.parse_number('1.099', locale='de_DE') == 1099
assert numbers.parse_number('1٬099', locale='ar_EG', numbering_system="default") == 1099
assert numbers.parse_number('1\xa0099', locale='en_US') == 1099

with pytest.raises(numbers.NumberFormatError) as excinfo:
numbers.parse_number('1.099,98', locale='de')
Expand Down