Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix: Unexpected behaviour: parse_numbers() doesn't handle space as a grouping symbol, however, parse_decimal() does #1061 #1062

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

sebas-inf
Copy link

Cleaned the string to ensure that it still works if there is a non-breakable space, \xa0, in the number and separed the logic so that it is easeir to understand.

Copy link
Member

@akx akx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You've committed and pushed your entire .venv – please clean that up first.

@sebas-inf
Copy link
Author

You've committed and pushed your entire .venv – please clean that up first.

Should I just push the file I changed or everything except the .venv?

@akx
Copy link
Member

akx commented Jan 31, 2024

Should I just push the file I changed or everything except the .venv?

Your commit(s) should only have relevant changes.

@sebas-inf
Copy link
Author

Should I just push the file I changed or everything except the .venv?

Your commit(s) should only have relevant changes.

Ok, I will push the file with the fix only then.

@akx
Copy link
Member

akx commented Jan 31, 2024

You still have 755 files in this pull request. You'll probably need to git reset 40e60a1f6cf178d9f57fcc14f157ea1b2ab77361 (the current master), then carefully git add only the changes you want, git commit them, and git push --force-with-lease them to your repository.

@sebas-inf
Copy link
Author

You still have 755 files in this pull request. You'll probably need to git reset 40e60a1f6cf178d9f57fcc14f157ea1b2ab77361 (the current master), then carefully git add only the changes you want, git commit them, and git push --force-with-lease them to your repository.

should i do all of this within the terminal in vscode where i was working?

@akx
Copy link
Member

akx commented Jan 31, 2024

should i do all of this within the terminal in vscode where i was working?

Maybe. I don't use Visual Studio Code much. Visual Studio Code might have other tools for backing out commits.

@sebas-inf
Copy link
Author

sebas-inf commented Jan 31, 2024

should i do all of this within the terminal in vscode where i was working?

Maybe. I don't use Visual Studio Code much. Visual Studio Code might have other tools for backing out commits.

It worked, I made it so i only pushed the numbers.py file.

Copy link
Member

@akx akx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would like to err on the side of caution here, if there is the remotest chance this could cause mis-parsing of numbers (e.g. something that couldn't be parsed before being now accidentally parsed as something it's not intended to be parsed as).

To that end, maybe this function should also have a strict=True parameter like parse_decimal does, and then maybe the grouping character logic could be shared between the two in non-strict mode?

babel/numbers.py Outdated
group_symbol = get_group_symbol(locale, numbering_system=numbering_system)

# Replace non-breakable spaces with the group symbol
cleaned_string = string.replace('\xa0', group_symbol)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a test case (either a doctest or a real test in tests/) that uses an \xa0 character?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would this be a valid test case for that: >>> parse_number('1 099', locale='ru')

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My human eyes can't tell if that space is a \xA0. You could spell it out there, if it really is one.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

made some changes applying some of the logic from parse_decimal and it says its passing the test cases on my end

@@ -1001,6 +1001,7 @@ def __init__(self, message: str, suggestions: list[str] | None = None) -> None:
def parse_number(
string: str,
locale: Locale | str | None = LC_NUMERIC,
strict: bool = False,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add the new argument into the docstring. strict mode should probably be the default here, and the new behavior (which allows more loose strings to be parsed) should be opt-in.

babel/numbers.py Outdated
@@ -1010,6 +1011,8 @@ def parse_number(
1099
>>> parse_number('1.099', locale='de_DE')
1099
>>> parse_number('1 099', locale='ru')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the intent is to test a string that has a non-breaking space character, make it obvious:

Suggested change
>>> parse_number('1 099', locale='ru')
>>> parse_number('1\xa0099', locale='ru')

babel/numbers.py Outdated
Comment on lines 1044 to 1085
decimal_symbol = get_decimal_symbol(locale, numbering_system=numbering_system)

try:
parsed = decimal.Decimal(string.replace(group_symbol, '')
.replace(decimal_symbol, '.'))
except decimal.InvalidOperation as exc:
raise NumberFormatError(f"{string!r} is not a valid number") from exc

proper = format_decimal(parsed, locale=locale, decimal_quantization=False, numbering_system=numbering_system)

if string != proper and proper != _remove_trailing_zeros_after_decimal(string, decimal_symbol):
try:
parsed_alt = decimal.Decimal(string.replace(decimal_symbol, '')
.replace(group_symbol, '.'))
except decimal.InvalidOperation as exc:
raise NumberFormatError(
f"{string!r} is not a properly formatted decimal number. "
f"Did you mean {proper!r}?",
suggestions=[proper],
) from exc
else:
proper_alt = format_decimal(
parsed_alt,
locale=locale,
decimal_quantization=False,
numbering_system=numbering_system,
)
if proper_alt == proper:
raise NumberFormatError(
f"{string!r} is not a properly formatted decimal number. "
f"Did you mean {proper!r}?",
suggestions=[proper],
)
else:
raise NumberFormatError(
f"{string!r} is not a properly formatted decimal number. "
f"Did you mean {proper!r}? Or maybe {proper_alt!r}?",
suggestions=[proper, proper_alt],
)

# Remove other spaces and replace the group symbol with an empty string
cleaned_string = string.replace(' ', '').replace(group_symbol, '')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

format_number is not meant to parse decimal numbers, only ever integers, anything decimal-related shouldn't be here at all.

If anything, we should check for the existence of a decimal separator in the input in strict mode, and fail if it is there.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

made the changes in the latest commit

babel/numbers.py Outdated

# If strict mode is enabled, handle irregular formatting
if strict and group_symbol in string:
# Your additional logic for strict mode here
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What a strange comment... 😁

@sebas-inf sebas-inf requested a review from akx February 6, 2024 02:41
@akx
Copy link
Member

akx commented Apr 24, 2024

This will require rebasing, and there are still unresolved comments.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants