Can not initialize a catalog from a template with an empty string value for "Language" header #665

Open
sinoroc opened this issue Oct 7, 2019 · 2 comments · May be fixed by #666

sinoroc commented Oct 7, 2019

GNU gettext's xgettext and lingua's pot-create both create PO-template files containing a Language header with an empty string as its value ("Language: \n").

Babel does not seem to be able to handle this empty header and produces the error and stacktrace below.

Removing the Language header from the template file allows Babel to initialize the catalogs without issue.
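The workaround just described (dropping the empty Language header before Babel sees the template) could be scripted roughly as follows. This is only a sketch using the standard library: the helper name `strip_empty_language_header` and the sample header values are made up for illustration, and it assumes the header line appears exactly as `"Language: \n"` in the file.

```python
def strip_empty_language_header(pot_text):
    """Return pot_text without a 'Language' header whose value is empty.

    Hypothetical helper, not part of Babel. Only the exact empty header
    line is dropped; a header with a real value, such as "Language: de\\n",
    is kept untouched.
    """
    kept = []
    for line in pot_text.splitlines(keepends=True):
        # In the file, the header entry is a quoted string that ends with a
        # literal backslash-n, so the raw line reads: "Language: \n"
        if line.strip() == r'"Language: \n"':
            continue
        kept.append(line)
    return "".join(kept)


# Made-up template header, shaped like the ones xgettext and pot-create emit.
sample = (
    'msgid ""\n'
    'msgstr ""\n'
    '"Project-Id-Version: project 1.0\\n"\n'
    '"Language: \\n"\n'
    '"Content-Type: text/plain; charset=UTF-8\\n"\n'
)
cleaned = strip_empty_language_header(sample)
print(cleaned)
```

The cleaned text can then be written back to the .pot file before running init_catalog. This only sidesteps the problem; it does not change how Babel treats the header.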

  • Babel 2.7.0
  • lingua 4.13
  • xgettext 0.19.8.1
  • Python 3.6.8

python3 setup.py init_catalog --domain project --input src/project/locale/project.pot --output-file src/project/locale/en/LC_MESSAGES/project.po --locale en
running init_catalog
creating catalog src/project/locale/en/LC_MESSAGES/project.po based on src/project/locale/project.pot
Traceback (most recent call last):
  File "setup.py", line 35, in <module>
    _do_setup()
  File "setup.py", line 30, in _do_setup
    version=version,
  File "/home/sinoroc/workspace/project/.tox/develop/lib/python3.6/site-packages/setuptools/__init__.py", line 145, in setup
    return distutils.core.setup(**attrs)
  File "/usr/lib/python3.6/distutils/core.py", line 148, in setup
    dist.run_commands()
  File "/usr/lib/python3.6/distutils/dist.py", line 955, in run_commands
    self.run_command(cmd)
  File "/usr/lib/python3.6/distutils/dist.py", line 974, in run_command
    cmd_obj.run()
  File "/home/sinoroc/workspace/project/.tox/develop/lib/python3.6/site-packages/babel/messages/frontend.py", line 622, in run
    catalog = read_po(infile, locale=self.locale)
  File "/home/sinoroc/workspace/project/.tox/develop/lib/python3.6/site-packages/babel/messages/pofile.py", line 377, in read_po
    parser.parse(fileobj)
  File "/home/sinoroc/workspace/project/.tox/develop/lib/python3.6/site-packages/babel/messages/pofile.py", line 308, in parse
    self._process_comment(line)
  File "/home/sinoroc/workspace/project/.tox/develop/lib/python3.6/site-packages/babel/messages/pofile.py", line 267, in _process_comment
    self._finish_current_message()
  File "/home/sinoroc/workspace/project/.tox/develop/lib/python3.6/site-packages/babel/messages/pofile.py", line 204, in _finish_current_message
    self._add_message()
  File "/home/sinoroc/workspace/project/.tox/develop/lib/python3.6/site-packages/babel/messages/pofile.py", line 198, in _add_message
    self.catalog[msgid] = message
  File "/home/sinoroc/workspace/project/.tox/develop/lib/python3.6/site-packages/babel/messages/catalog.py", line 629, in __setitem__
    self.mime_headers = _parse_header(message.string).items()
  File "/home/sinoroc/workspace/project/.tox/develop/lib/python3.6/site-packages/babel/messages/catalog.py", line 430, in _set_mime_headers
    self._set_locale(value)
  File "/home/sinoroc/workspace/project/.tox/develop/lib/python3.6/site-packages/babel/messages/catalog.py", line 318, in _set_locale
    self._locale = Locale.parse(locale)
  File "/home/sinoroc/workspace/project/.tox/develop/lib/python3.6/site-packages/babel/core.py", line 268, in parse
    parts = parse_locale(identifier, sep=sep)
  File "/home/sinoroc/workspace/project/.tox/develop/lib/python3.6/site-packages/babel/core.py", line 1094, in parse_locale
    raise ValueError('expected only letters, got %r' % lang)
ValueError: expected only letters, got ''
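
To make the last frame of the traceback easier to follow, here is a simplified stand-in for the kind of check that rejects the empty value. It is not Babel's actual `parse_locale` code: the function name `check_language` and its body are assumptions, and only the error message is taken from the traceback above.

```python
def check_language(identifier, sep="_"):
    """Hypothetical, simplified stand-in for the language check in the traceback."""
    # The first component of a locale identifier is the language code.
    lang = identifier.split(sep)[0].lower()
    # str.isalpha() returns False for an empty string, so '' is rejected.
    if not lang.isalpha():
        raise ValueError("expected only letters, got %r" % lang)
    return lang


for candidate in ("en", "pt_BR", ""):
    try:
        print(repr(candidate), "->", check_language(candidate))
    except ValueError as exc:
        print(repr(candidate), "->", "ValueError:", exc)
```

The point is that an empty Language value is not treated as "no locale"; it is handed to the locale parser as if it were an identifier, and an empty string can never pass a letters-only check.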

sinoroc commented Oct 7, 2019

It looks like the Catalog object used to represent the PO-template has its locale overwritten as soon as the parser reaches the Language header. An easy fix would be to ignore this header whenever a template is being processed.
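
A toy model of the behaviour described in this comment, and of the proposed fix, might look like this. `ToyCatalog`, `apply_headers` and the `ignore_language` flag are all hypothetical names invented for the sketch; none of this is Babel's real `Catalog` API or the content of the linked pull request.

```python
class ToyCatalog:
    """Hypothetical stand-in for a catalog; not Babel's Catalog class."""

    def __init__(self, locale=None):
        self.locale = locale

    def apply_headers(self, headers, ignore_language=False):
        for name, value in headers:
            if name.lower() == "language":
                if ignore_language:
                    # Proposed behaviour: keep the locale passed by the caller.
                    continue
                # Behaviour described above: the header value replaces the locale.
                self.locale = value
        return self


# Made-up headers, shaped like those of a template with an empty Language value.
template_headers = [("Project-Id-Version", "project 1.0"), ("Language", "")]

overwritten = ToyCatalog(locale="en").apply_headers(template_headers)
preserved = ToyCatalog(locale="en").apply_headers(template_headers, ignore_language=True)
print(repr(overwritten.locale))
print(repr(preserved.locale))
```

In the first case the locale requested by the caller is lost to the empty header value; in the second it survives because the header is skipped.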


sinoroc commented Oct 7, 2019

This is a bit puzzling...

On the one hand:

    with open(self.input_file, 'rb') as infile:
        # Although reading from the catalog template, read_po must be fed
        # the locale in order to correctly calculate plurals
        catalog = read_po(infile, locale=self.locale)

On the other:

    def read_po(fileobj, locale=None, domain=None, ignore_obsolete=False, charset=None, abort_invalid=False):
        """Read messages from a ``gettext`` PO (portable object) file from the given
        file-like object and return a `Catalog`.

        >>> from datetime import datetime
        >>> from babel._compat import StringIO
        >>> buf = StringIO('''
        ... #: main.py:1
        ... #, fuzzy, python-format
        ... msgid "foo %(name)s"
        ... msgstr "quux %(name)s"
        ...
        ... # A user comment
        ... #. An auto comment
        ... #: main.py:3
        ... msgid "bar"
        ... msgid_plural "baz"
        ... msgstr[0] "bar"
        ... msgstr[1] "baaz"
        ... ''')
        >>> catalog = read_po(buf)
        >>> catalog.revision_date = datetime(2007, 4, 1)

        >>> for message in catalog:
        ...     if message.id:
        ...         print((message.id, message.string))
        ...         print(' ', (message.locations, sorted(list(message.flags))))
        ...         print(' ', (message.user_comments, message.auto_comments))
        (u'foo %(name)s', u'quux %(name)s')
          ([(u'main.py', 1)], [u'fuzzy', u'python-format'])
          ([], [])
        ((u'bar', u'baz'), (u'bar', u'baaz'))
          ([(u'main.py', 3)], [])
          ([u'A user comment'], [u'An auto comment'])

        .. versionadded:: 1.0
           Added support for explicit charset argument.

        :param fileobj: the file-like object to read the PO file from
        :param locale: the locale identifier or `Locale` object, or `None`
                       if the catalog is not bound to a locale (which basically
                       means it's a template)

and

    class Catalog(object):
        """Representation of a message catalog."""

        def __init__(self, locale=None, domain=None, header_comment=DEFAULT_HEADER,
                     project=None, version=None, copyright_holder=None,
                     msgid_bugs_address=None, creation_date=None,
                     revision_date=None, last_translator=None, language_team=None,
                     charset=None, fuzzy=True):
            """Initialize the catalog object.

            :param locale: the locale identifier or `Locale` object, or `None`
                           if the catalog is not bound to a locale (which basically
                           means it's a template)

sinoroc added a commit to sinoroc/babel that referenced this issue Oct 7, 2019
The tools 'xgettext' from GNU gettext and 'pot-create' from lingua
both produce PO-template files containing a 'Language' header with
an empty string as value.

Babel has issues creating catalogs out of such templates.

It wants to parse the PO-template as a normal PO-catalog using the
locale of the target catalog, because it helps producing the correct
plurals. But it overwrites this locale when parsing the 'Language'
header and then it can not produce a valid catalog.

Ignoring the 'Language' header when it is known that the file being
processed is a template avoids this issue.

GitHub: python-babel#665
sinoroc linked a pull request Oct 7, 2019 that will close this issue
sinoroc added a commit to sinoroc/babel that referenced this issue Mar 26, 2020
sinoroc added a commit to sinoroc/babel that referenced this issue Mar 20, 2023
sinoroc added a commit to sinoroc/babel that referenced this issue Mar 29, 2023
GitHub: fixes python-babel#665