UnicodeDecodeError can happen when passing objects as msg on python2 #107

perrinjerome opened this issue Nov 7, 2021 · 0 comments

First of all, this problem only affects Python 2.7, so if you think it is no longer relevant, don't hesitate to close this issue.

I thought you might still be interested because this package still claims to support Python 2 in the classifiers of setup.py, and this scenario works fine with the standard logging module but breaks when coloredlogs is used. On Python 3 the problem does not occur.

The common logging API, described at https://docs.python.org/2/library/logging.html#logging.debug, is:

logging.debug(msg[, *args[, **kwargs]])

The most common usage is to pass a string as msg, but as described in https://docs.python.org/3/howto/logging.html#using-arbitrary-objects-as-messages, passing arbitrary objects as msg is also supported; their __str__ method is used to convert them to strings.

With the standard Python logging module this works fine, even when the string contains non-ASCII characters. For example:

# coding: utf-8
class O:
  def __str__(self):
    return "💥"

import logging
logging.basicConfig()
logging.getLogger().critical(O())

correctly outputs:

CRITICAL:root:💥

But when coloredlogs is used, as in this example:

import coloredlogs
coloredlogs.install()
logging.getLogger().critical(O())

a UnicodeDecodeError is raised:

Traceback (most recent call last):
  File "../lib/python2.7/logging/__init__.py", line 868, in emit
    msg = self.format(record)
  File "../lib/python2.7/logging/__init__.py", line 741, in format
    return fmt.format(record)
  File "../coloredlogs/__init__.py", line 1137, in format
    copy.msg = ansi_wrap(coerce_string(record.msg), **style)
  File "../lib/python2.7/site-packages/humanfriendly/compat.py", line 119, in coerce_string
    return value if is_string(value) else unicode(value)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xf0 in position 0: ordinal not in range(128)
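
The byte 0xf0 in the message is the first byte of the UTF-8 encoding of 💥. On Python 2, `unicode(value)` on an object without `__unicode__` takes the bytes returned by `__str__` and decodes them with the default codec, which is ASCII unless the interpreter was reconfigured. The sketch below reproduces just that decoding step with explicit bytes, so it also runs on Python 3; the variable names are illustrative only and not taken from coloredlogs or humanfriendly.

```python
# The bytes a Python 2 __str__ would return for the emoji in the example.
raw = u"\U0001F4A5".encode("utf-8")

# Decoding as ASCII fails on the very first byte (0xf0), as in the traceback.
try:
    raw.decode("ascii")
    error = None
except UnicodeDecodeError as exc:
    error = exc

# Decoding as UTF-8, replacing any invalid sequences, recovers the text.
fallback = raw.decode("utf-8", "replace")

print(error)
print(fallback == u"\U0001F4A5")
```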

This happened with humanfriendly==10.0, where coerce_string is defined at https://github.com/xolox/python-humanfriendly/blob/6758ac61f906cd8528682003070a57febe4ad3cf/humanfriendly/compat.py#L101-L108

Maybe this can be addressed in humanfriendly by making coerce_string try harder to decode the string, since most strings are UTF-8 anyway. Something like this:

import sys

def coerce_string(value):
    # Assumes `is_string` and `unicode` are in scope, as in humanfriendly/compat.py.
    if sys.version_info < (3,):
        # If value defines `__unicode__`, unicode() uses it directly. If it
        # does not and `__str__` returns bytes that cannot be decoded as
        # ASCII, fall back to decoding the `__str__` result as UTF-8.
        try:
            value = unicode(value)
        except UnicodeDecodeError:
            value = unicode(str(value), 'utf-8', 'replace')
    return value if is_string(value) else unicode(value)
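
Until something like this lands upstream, one possible user-side workaround is to convert message objects to text before the formatter sees them. The following is only a sketch under that assumption: `to_text` and `TextMessageFilter` are hypothetical names, not part of coloredlogs, humanfriendly, or the standard library, and the UTF-8 fallback branch can only be reached on Python 2.

```python
import logging
import sys

PY2 = sys.version_info < (3,)
# `unicode` is only looked up on Python 2; on Python 3 these use `str`.
text_type = unicode if PY2 else str  # noqa: F821
string_types = (str, unicode) if PY2 else (str,)  # noqa: F821


def to_text(value):
    """Convert a message object to text, leaving existing strings untouched."""
    if isinstance(value, string_types):
        return value
    try:
        return text_type(value)
    except UnicodeDecodeError:
        # Python 2 only: __str__ returned non-ASCII bytes; assume UTF-8.
        return str(value).decode("utf-8", "replace")


class TextMessageFilter(logging.Filter):
    """Hypothetical filter that replaces record.msg with its text form."""

    def filter(self, record):
        record.msg = to_text(record.msg)
        return True  # never drop the record
```

Attaching the filter to each handler, for example `handler.addFilter(TextMessageFilter())` for every handler on the root logger after `coloredlogs.install()`, should make the formatter receive a string rather than the object. Filters added to a logger instead of a handler only see records logged directly on that logger, which is why the handler is the safer place.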
NexediGitlab pushed a commit to SlapOS/slapos.core that referenced this issue Nov 7, 2021
As reported on https://github.com/xolox/python-coloredlogs/issues/107
logging objects with a __str__ method returning non-ascii characters
raises UnicodeDecodeError.

We have vendored coloredlogs version 0.5 long time ago, so just
apply the suggested fix here for now.
NexediGitlab pushed a commit to SlapOS/slapos.core that referenced this issue Nov 8, 2021
As reported on xolox/python-coloredlogs#107
logging objects with a __str__ method returning non-ascii characters
raises UnicodeDecodeError.

We have vendored coloredlogs version 0.5 long time ago, so just
apply the suggested fix here for now.