Test failures on babel version 2.14.0 #1059

Closed
nileshpatra opened this issue Jan 21, 2024 · 10 comments

Comments

@nileshpatra

Overview Description

While upgrading the Debian package to the latest version, I am observing a number of test failures in some locales due to minor changes. I'm not sure whether the expected output should be changed for these assertions, or whether it makes sense to simply skip or patch them.

Steps to Reproduce

Run the test suite with: LC_ALL=C py.test-3

Actual Results

=================================== FAILURES ===================================
______________ [doctest] babel.dates.DateTimeFormat.format_period ______________
1505         u'iltapäivä'
1506         >>> format.format_period('B', 4)
1507         u'iltapäivällä'
1508         >>> format.format_period('B', 5)
1509         u'ip.'
1510 
1511         >>> format = DateTimeFormat(datetime(2022, 4, 28, 6, 27), 'zh_Hant')
1512         >>> format.format_period('a', 1)
1513         u'上午'
1514         >>> format.format_period('b', 1)
UNEXPECTED EXCEPTION: ValueError('Could not format period morning1 in zh_Hant')
Traceback (most recent call last):
  File "/usr/lib/python3.11/doctest.py", line 1353, in __run
    exec(compile(example.source, filename, "single",
  File "<doctest babel.dates.DateTimeFormat.format_period[9]>", line 1, in <module>
  File "/<<PKGBUILDDIR>>/babel/dates.py", line 1535, in format_period
    raise ValueError(f"Could not format period {period} in {self.locale}")
ValueError: Could not format period morning1 in zh_Hant
/<<PKGBUILDDIR>>/babel/dates.py:1514: UnexpectedException
__________________ [doctest] babel.numbers.format_scientific ___________________
954 Return value formatted in scientific notation for a specific locale.
955 
956     >>> format_scientific(10000, locale='en_US')
957     u'1E4'
958     >>> format_scientific(10000, locale='ar_EG', numbering_system='default')
Expected:
    u'1اس4'
Got:
    '1أس4'

/<<PKGBUILDDIR>>/babel/numbers.py:958: DocTestFailure
________________ [doctest] babel.numbers.get_exponential_symbol ________________
416 Return the symbol used by the locale to separate mantissa and exponent.
417 
418     >>> get_exponential_symbol('en_US')
419     u'E'
420     >>> get_exponential_symbol('ar_EG', numbering_system='default')
Expected:
    u'اس'
Got:
    'أس'

/<<PKGBUILDDIR>>/babel/numbers.py:420: DocTestFailure
__________________ [doctest] babel.units.format_compound_unit __________________
242     >>> format_compound_unit(1234.5, "ton", 15, denominator_unit="hour", locale="ar_EG", numbering_system="arab")
243     '1٬234٫5 طن لكل 15 ساعة'
244 
245     >>> format_compound_unit(160, denominator_unit="square-meter", locale="fr")
246     '160 par m\xe8tre carr\xe9'
247 
248     >>> format_compound_unit(4, "meter", "ratakisko", length="short", locale="fi")
249     '4 m/ratakisko'
250 
251     >>> format_compound_unit(35, "minute", denominator_unit="fathom", locale="sv")
Expected:
    '35 minuter per famn'
Got:
    '35 minuter per length-fathom'

/<<PKGBUILDDIR>>/babel/units.py:251: DocTestFailure
______________________ FormatDecimalTestCase.test_compact ______________________

self = <tests.test_numbers.FormatDecimalTestCase testMethod=test_compact>

    def test_compact(self):
        assert numbers.format_compact_decimal(1, locale='en_US', format_type="short") == '1'
        assert numbers.format_compact_decimal(999, locale='en_US', format_type="short") == '999'
        assert numbers.format_compact_decimal(1000, locale='en_US', format_type="short") == '1K'
        assert numbers.format_compact_decimal(9000, locale='en_US', format_type="short") == '9K'
        assert numbers.format_compact_decimal(9123, locale='en_US', format_type="short", fraction_digits=2) == '9.12K'
        assert numbers.format_compact_decimal(10000, locale='en_US', format_type="short") == '10K'
        assert numbers.format_compact_decimal(10000, locale='en_US', format_type="short", fraction_digits=2) == '10K'
        assert numbers.format_compact_decimal(1000000, locale='en_US', format_type="short") == '1M'
        assert numbers.format_compact_decimal(9000999, locale='en_US', format_type="short") == '9M'
        assert numbers.format_compact_decimal(9000900099, locale='en_US', format_type="short", fraction_digits=5) == '9.0009B'
        assert numbers.format_compact_decimal(1, locale='en_US', format_type="long") == '1'
        assert numbers.format_compact_decimal(999, locale='en_US', format_type="long") == '999'
        assert numbers.format_compact_decimal(1000, locale='en_US', format_type="long") == '1 thousand'
        assert numbers.format_compact_decimal(9000, locale='en_US', format_type="long") == '9 thousand'
        assert numbers.format_compact_decimal(9000, locale='en_US', format_type="long", fraction_digits=2) == '9 thousand'
        assert numbers.format_compact_decimal(10000, locale='en_US', format_type="long") == '10 thousand'
        assert numbers.format_compact_decimal(10000, locale='en_US', format_type="long", fraction_digits=2) == '10 thousand'
        assert numbers.format_compact_decimal(1000000, locale='en_US', format_type="long") == '1 million'
        assert numbers.format_compact_decimal(9999999, locale='en_US', format_type="long") == '10 million'
        assert numbers.format_compact_decimal(9999999999, locale='en_US', format_type="long", fraction_digits=5) == '10 billion'
        assert numbers.format_compact_decimal(1, locale='ja_JP', format_type="short") == '1'
        assert numbers.format_compact_decimal(999, locale='ja_JP', format_type="short") == '999'
        assert numbers.format_compact_decimal(1000, locale='ja_JP', format_type="short") == '1000'
        assert numbers.format_compact_decimal(9123, locale='ja_JP', format_type="short") == '9123'
        assert numbers.format_compact_decimal(10000, locale='ja_JP', format_type="short") == '1万'
>       assert numbers.format_compact_decimal(1234567, locale='ja_JP', format_type="long") == '123万'

tests/test_numbers.py:167: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
babel/numbers.py:616: in format_compact_decimal
    compact_format = locale.compact_decimal_formats[format_type]
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <babel.localedata.LocaleDataDict object at 0x7f4026072950>, key = 'long'

    def __getitem__(self, key: str | int | None) -> Any:
>       orig = val = self._data[key]
E       KeyError: 'long'

babel/localedata.py:234: KeyError
_________________________ test_get_exponential_symbol __________________________

    def test_get_exponential_symbol():
        assert numbers.get_exponential_symbol('en_US') == 'E'
        assert numbers.get_exponential_symbol('en_US', numbering_system="latn") == 'E'
        assert numbers.get_exponential_symbol('en_US', numbering_system="default") == 'E'
        assert numbers.get_exponential_symbol('ja_JP') == 'E'
        assert numbers.get_exponential_symbol('ar_EG') == 'E'
>       assert numbers.get_exponential_symbol('ar_EG', numbering_system="default") == 'اس'
E       AssertionError: assert 'أس' == 'اس'
E         - اس
E         + أس

tests/test_numbers.py:376: AssertionError
____________________ test_format_currency_long_display_name ____________________

    def test_format_currency_long_display_name():
        assert (numbers.format_currency(1099.98, 'USD', locale='en_US', format_type='name')
                == '1,099.98 US dollars')
        assert (numbers.format_currency(1099.98, 'USD', locale='en_US', format_type='name', numbering_system="default")
                == '1,099.98 US dollars')
        assert (numbers.format_currency(1099.98, 'USD', locale='ar_EG', format_type='name', numbering_system="default")
                == '1٬099٫98 دولار أمريكي')
        assert (numbers.format_currency(1.00, 'USD', locale='en_US', format_type='name')
                == '1.00 US dollar')
        assert (numbers.format_currency(1.00, 'EUR', locale='en_US', format_type='name')
                == '1.00 euro')
        assert (numbers.format_currency(2, 'EUR', locale='en_US', format_type='name')
                == '2.00 euros')
        # This tests that '{1} {0}' unitPatterns are found:
>       assert (numbers.format_currency(1, 'USD', locale='sw', format_type='name')
                == 'dola ya Marekani 1.00')
E       AssertionError: assert '1.00 dola ya Marekani' == 'dola ya Marekani 1.00'
E         - dola ya Marekani 1.00
E         ?                 -----
E         + 1.00 dola ya Marekani
E         ? +++++

tests/test_numbers.py:596: AssertionError
____________________________ test_format_scientific ____________________________

    def test_format_scientific():
        assert numbers.format_scientific(10000, locale='en_US') == '1E4'
        assert numbers.format_scientific(10000, locale='en_US', numbering_system="default") == '1E4'
        assert numbers.format_scientific(4234567, '#.#E0', locale='en_US') == '4.2E6'
        assert numbers.format_scientific(4234567, '0E0000', locale='en_US') == '4.234567E0006'
        assert numbers.format_scientific(4234567, '##0E00', locale='en_US') == '4.234567E06'
        assert numbers.format_scientific(4234567, '##00E00', locale='en_US') == '42.34567E05'
        assert numbers.format_scientific(4234567, '0,000E00', locale='en_US') == '4,234.567E03'
        assert numbers.format_scientific(4234567, '##0.#####E00', locale='en_US') == '4.23457E06'
        assert numbers.format_scientific(4234567, '##0.##E00', locale='en_US') == '4.23E06'
        assert numbers.format_scientific(42, '00000.000000E0000', locale='en_US') == '42000.000000E-0003'
>       assert numbers.format_scientific(0.2, locale="ar_EG", numbering_system="default") == '2اس\u061c-1'
E       AssertionError: assert '2أس\u061c-1' == '2اس\u061c-1'
E         - 2اس؜-1
E         ?  ^
E         + 2أس؜-1
E         ?  ^

tests/test_numbers.py:692: AssertionError
______________________ TestFormat.test_format_scientific _______________________

self = <tests.test_support.TestFormat object at 0x7f4027595650>

    def test_format_scientific(self):
        assert support.Format('en_US').scientific(10000) == '1E4'
        assert support.Format('en_US').scientific(Decimal("10000")) == '1E4'
>       assert support.Format('ar_EG', numbering_system="default").scientific(10000) == '1اس4'
E       AssertionError: assert '1أس4' == '1اس4'
E         - 1اس4
E         ?  ^
E         + 1أس4
E         ?  ^

tests/test_support.py:348: AssertionError

Expected Results

All tests should pass

Additional Information

Version info:

python3: 3.12.1
pytest: 7.4.4
tz: 2023.3.post1-2
freezegun: 1.2.1
unicode-cldr-core: 44-0.1
tzdata: 2023d-1

@nileshpatra changed the title from "Test failures on bable version 2.14.0" to "Test failures on babel version 2.14.0" on Jan 21, 2024
@Alex-ley-scrub

I noticed some similar issues with ssf: snoopyjc/ssf#17

stemming from: https://github.com/python-babel/babel/releases/tag/v2.14.0

Locale.number_symbols will now have first-level keys for each numbering system. Since the implicit default numbering system still is "latn", what had previously been e.g. Locale.number_symbols['decimal'] is now Locale.number_symbols['latn']['decimal'].
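
For code that has to run against both layouts, the change described above can be handled with a small lookup helper. This is only a sketch: get_number_symbol is a hypothetical name, not part of Babel, and it assumes the symbols object behaves like a mapping. It is exercised here with plain dicts standing in for Locale.number_symbols.

```python
from collections.abc import Mapping


def get_number_symbol(number_symbols, name, numbering_system="latn"):
    """Return a symbol from either the flat (older) or nested (2.14+) layout.

    Hypothetical helper for illustration; not a Babel API.
    """
    per_system = number_symbols.get(numbering_system)
    if isinstance(per_system, Mapping):
        # 2.14+ layout: symbols are grouped under the numbering system key.
        return per_system[name]
    # Older layout: symbols sit directly at the top level.
    return number_symbols[name]


# Plain dicts standing in for the two shapes of Locale.number_symbols.
old_layout = {"decimal": ".", "group": ","}
new_layout = {"latn": {"decimal": ".", "group": ","}}

print(get_number_symbol(old_layout, "decimal"))
print(get_number_symbol(new_layout, "decimal"))
```

With Babel installed, the same helper would presumably be called as get_number_symbol(Locale.parse("en_US").number_symbols, "decimal"), though that call is untested here.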

@akx
Member

akx commented Mar 4, 2024

@nileshpatra Considering all of our tests are green on master here, sounds like the Debian build is doing something differently. Can you share some verbose logs or such?

@Alex-ley-scrub That's unrelated – but as mentioned in the changelog, the format of .number_symbols has changed from 2.13 to 2.14 to allow for other numbering systems than Latin. number_symbols's documentation has had an admonition that the format may change between Babel versions since 2016, and now it did 😄

@nileshpatra
Author

Hi @akx

> Considering all of our tests are green on master here, sounds like the Debian build is doing something differently.

I suspect this has something to do with the tzdata version and the changes in it. Is it possible to know which version of tzdata the CI pulls in?
In Debian it is 2023d-1 for the log that I linked to.

> Can you share some verbose logs or such?

Will py.test-3 --verbose help you here?

@akx
Member

akx commented Mar 5, 2024

As far as I can see, none of the errors above should be related to tzdata, but the CLDR data. Are you sure you're pulling and converting the correct CLDR data (make import-cldr)?
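
The Arabic exponent-symbol mismatches in the log come down to a single differing letter, which is easier to see by code point than by eye. A minimal sketch using only the standard library follows. The two strings are written as escapes on the assumption that the log shows precomposed characters, and diff_code_points is a hypothetical helper name.

```python
import unicodedata


def describe(ch):
    """Format one character as 'U+XXXX NAME'."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}"


def diff_code_points(expected, got):
    """List the positions where two equal-length strings differ."""
    return [
        (i, describe(e), describe(g))
        for i, (e, g) in enumerate(zip(expected, got))
        if e != g
    ]


# Assumed code points for the strings shown in the failing assertions.
expected = "\u0627\u0633"  # value in the test suite
got = "\u0623\u0633"       # value produced with the newer data

for index, exp_desc, got_desc in diff_code_points(expected, got):
    print(f"position {index}: expected {exp_desc}, got {got_desc}")
```

Under that assumption, the only difference is a plain alef versus an alef with hamza above, which looks like a change in the locale data rather than in the formatting logic.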

@nileshpatra
Author

I think so. We are using Babel's tarball directly from GitHub releases, which has the .dat files processed already. We don't have to pull and convert on our end, do we?

@akx
Member

akx commented Mar 5, 2024

@nileshpatra Um... what tarball is that? The GitHub release for 2.14.0 has no sdist TAR.

@nileshpatra
Author

@akx oops, seems like I gave an incorrect response w/o properly checking - sorry for that! You're right indeed, there's no sdist.

In Debian, we generate the .dat files via: python3 scripts/import_cldr.py /usr/share/unicode/cldr/common and the version of unicode-cldr-core in Debian is 44.0, while Babel pulls 43.0 as per

https://github.com/python-babel/babel/blob/master/scripts/download_import_cldr.py#L12

I suppose this is the difference. Do you think Babel can be adapted to the latest CLDR data?

@akx
Member

akx commented Mar 6, 2024

@nileshpatra Sure, the work can be done to have Babel use CLDR 44, but that would be for Babel 2.15. Babel 2.14 uses CLDR 43 (#1043).
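
For packagers, a mismatch like this could be caught before the test suite runs. The sketch below is an assumption-laden illustration, not something Babel ships: the BABEL_CLDR table holds only the pairings mentioned in this thread, and all function names are hypothetical. It extracts the CLDR major version from a Debian package version such as 44-0.1 and compares it with what a given Babel series expects.

```python
import re

# Babel series -> CLDR major version, limited to what this thread states.
BABEL_CLDR = {"2.14": 43, "2.15": 44}


def debian_upstream_version(version):
    """Strip the optional epoch and Debian revision from a package version."""
    if ":" in version:
        # The epoch is everything up to the first colon.
        version = version.split(":", 1)[1]
    if "-" in version:
        # The Debian revision is everything after the last hyphen.
        version = version.rsplit("-", 1)[0]
    return version


def cldr_major(package_version):
    """Return the leading integer of the upstream version, e.g. 44."""
    match = re.match(r"\d+", debian_upstream_version(package_version))
    if match is None:
        raise ValueError(f"no numeric version in {package_version!r}")
    return int(match.group())


def cldr_matches(babel_version, packaged_cldr):
    """Check whether the packaged CLDR is the one this Babel series expects."""
    series = ".".join(babel_version.split(".")[:2])
    return BABEL_CLDR[series] == cldr_major(packaged_cldr)


print(cldr_matches("2.14.0", "44-0.1"))
print(cldr_matches("2.15.0", "44-0.1"))
```

A check of this kind would have flagged the combination reported above (Babel 2.14.0 with unicode-cldr-core 44-0.1) as a data mismatch rather than a set of unrelated test failures.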

@nileshpatra
Author

Ack, I will wait for a new release then

@akx
Member

akx commented May 5, 2024

The freshly released Babel 2.15.0 uses CLDR 44. 🎉

The next version will use CLDR 45 when #1077 gets merged.

@akx closed this as completed on May 5, 2024