naturaltime: wrong translation with delta >= 2 years #21

Closed
Mathieu-Ghaleb opened this issue Jun 16, 2022 · 6 comments · Fixed by #23
Labels
bug Something isn't working

Comments

@Mathieu-Ghaleb

Mathieu-Ghaleb commented Jun 16, 2022

What did you do?

Ran the following code:

import humanize
import datetime as dt

humanize.i18n.activate("fr_FR")
output = humanize.naturaltime(dt.datetime(year=2010, month=1, day=1))
print(output)

What did you expect to happen?

Output to be:

il y a 12 ans

What actually happened?

Output was:

il y a 12 years

What versions are you using?

  • OS: Mac OS
  • Python: 3.8.6
  • Humanize: 4.0.1
@hugovk hugovk added the bug Something isn't working label Jun 17, 2022
@hugovk
Member

hugovk commented Jun 17, 2022

Here's a more direct reproducer. The first three asserts pass, but the last fails because it gets "12 years", not "12 ans":

import humanize

output = humanize.naturaldelta(1 * 365 * 24 * 60 * 60)
print(output)
assert output == "a year"

output = humanize.naturaldelta(12 * 365 * 24 * 60 * 60)
print(output)
assert output == "12 years"


humanize.i18n.activate("fr_FR")

output = humanize.naturaldelta(1 * 365 * 24 * 60 * 60)
print(output)
assert output == "un an"

output = humanize.naturaldelta(12 * 365 * 24 * 60 * 60)
print(output)  # "12 years"
assert output == "12 ans"

This was introduced in d1faf1c from PR jmoiron/humanize#246:

Instead of passing an integer to the _ngettext localisation function we're passing in the results of intcomma(years) which is a string:

-        return _ngettext("%d year", "%d years", years) % years
+        return _ngettext("%s year", "%s years", years) % intcomma(years)

And also I think it can't find "%s year", "%s years" values because they use %d in the translation files:

#: src/humanize/time.py:187
#, python-format
msgid "%d year"
msgid_plural "%d years"
msgstr[0] "%d an"
msgstr[1] "%d ans"
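
To illustrate why the lookup would miss: gettext catalogs are keyed on the exact msgid string, so "%s year" and "%d year" are different keys. A minimal sketch, assuming a hypothetical in-memory catalog (DictTranslations is made up for illustration; humanize itself loads compiled .mo files):

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """Hypothetical in-memory catalog keyed on (msgid, plural_index)."""

    def __init__(self, entries):
        super().__init__()
        self._entries = entries

    def ngettext(self, msgid1, msgid2, n):
        # French plural rule: the plural form is used when n > 1.
        index = int(n > 1)
        try:
            return self._entries[(msgid1, index)]
        except KeyError:
            # Unknown msgid: fall back to the untranslated English string.
            return super().ngettext(msgid1, msgid2, n)


# Mirrors the fr_FR entry quoted above.
fr = DictTranslations({("%d year", 0): "%d an", ("%d year", 1): "%d ans"})

found = fr.ngettext("%d year", "%d years", 12) % 12
missed = fr.ngettext("%s year", "%s years", 12) % "12"
print(found)
print(missed)
```

With the "%d" msgid the French string is found; with the "%s" msgid the call silently falls back to English, which matches the "12 years" seen in the report.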

Ping @carterbox, please could you have a look at this?

@carterbox
Contributor

I checked out version 4.2.0, and I cannot run this reproducer.

Traceback (most recent call last):
  File "/home/dching/Documents/humanize/comma.py", line 15, in <module>
    humanize.i18n.activate("fr_FR")
  File "/home/dching/Documents/humanize/src/humanize/i18n.py", line 62, in activate
    translation = gettext_module.translation("humanize", path, [locale])
  File "/home/dching/miniconda3/envs/humanize/lib/python3.10/gettext.py", line 592, in translation
    raise FileNotFoundError(ENOENT,
FileNotFoundError: [Errno 2] No translation file found for domain: 'humanize'

What am I doing wrong?

@carterbox
Contributor

carterbox commented Jun 20, 2022

The docs say that the "gettext.translation" function only looks for ".mo" files, but the source tree only contains ".po" files. I assume some automatic conversion is supposed to happen.

@carterbox
Contributor

OK. I grabbed the compiled translation files from the conda tar.

@carterbox
Contributor

carterbox commented Jun 20, 2022

The general translation workflow seems to be:

  1. generate possible format strings
  2. translate the unevaluated format strings
  3. evaluate the format strings.

When I added the intcomma feature for years in the time module, I didn't properly test step 2 (translate the format string). Since the translation works directly on the format string and the translation phrases include the formatters (probably because in some locales the order is changed), switching from %d to %s broke the translation step. This means the translation is broken for all translated languages.

Notably, this translation error does not occur in precisedelta() because the translation and format evaluation occur on multiple lines, so the %d are swapped for %s after translation.

elif unit == YEARS:
    fmt_txt = fmt_txt.replace("%d", "%s")
    texts.append(fmt_txt % intcomma(fmt_value))
    continue

So the solution would be to match naturaldelta() with precisedelta() and swap the formatter after the translation.
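
A rough sketch of that ordering, not the actual patch. It assumes stand-ins for the real helpers: intcomma here is a simplified substitute for humanize.intcomma, and ngettext_fr and FR_CATALOG are hypothetical replacements for _ngettext with the French catalog active:

```python
def intcomma(value):
    """Simplified stand-in for humanize.intcomma: add thousands separators."""
    return f"{value:,}"


# Hypothetical catalog mirroring the fr_FR entries for years.
FR_CATALOG = {"%d year": "%d an", "%d years": "%d ans"}


def ngettext_fr(singular, plural, n):
    """Hypothetical stand-in for _ngettext with the French catalog active."""
    msgid = singular if n <= 1 else plural
    # Unknown msgids fall back to the English string, as gettext does.
    return FR_CATALOG.get(msgid, msgid)


def format_years(years):
    # 1. Translate with the original "%d" msgids so the catalog lookup hits.
    fmt = ngettext_fr("%d year", "%d years", years)
    # 2. Only then swap the formatter so a comma-separated string fits.
    fmt = fmt.replace("%d", "%s")
    # 3. Evaluate the format string.
    return fmt % intcomma(years)


print(format_years(12))
print(format_years(1200))
```

Because the swap happens on the already translated string, the translation files can keep their "%d" placeholders untouched.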

I don't want to change the translation files or swap the formatters in other places, because other code may rely on %d rather than %s to round or truncate floating-point numbers.
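
For reference, a small illustration of that difference in plain Python string formatting (the value is arbitrary, chosen only for the example):

```python
value = 2.7

as_int = "%d years" % value  # %d converts to an integer, dropping the fraction
as_str = "%s years" % value  # %s uses str(), keeping the fraction

print(as_int)
print(as_str)
```

So a blanket switch from %d to %s could change the output anywhere a float is passed in.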

carterbox added a commit to carterbox/humanize that referenced this issue Jun 20, 2022
This patch fixes a bug introduced in 3.14.0, where the format
string was changed from %d to %s to add separators to the year.
However, this needs to happen after translation because the
translator uses the format strings as part of the translation.

Closes python-humanize#21
@hugovk
Member

hugovk commented Jun 20, 2022

For future reference, there's a script to generate .mo files. Mentioned in the release checklist:

# Generate translation binaries
scripts/generate-translation-binaries.sh
