UnicodeEncodeError: 'charmap' codec can't encode character '\u258f' in position 6: character maps to <undefined> #1339

Open
5 of 6 tasks
BramVanroy opened this issue Jun 30, 2022 · 2 comments

I'm running into an issue that I fail to understand (then again, encoding issues often confuse me). I'm on Windows, using the "new" Terminal, and I also tried PowerShell 7. I sometimes get an error through tqdm (cf. related), particularly when a narrow progress bar is shown and this character needs to be displayed: ▏ (\u258f).

This is the error trace that I get:

  File "C:\Users\bramv\.virtualenvs\transformers-finetuner-ah-81wJc\lib\site-packages\tqdm\std.py", line 1256, in update
    self.refresh(lock_args=self.lock_args)
  File "C:\Users\bramv\.virtualenvs\transformers-finetuner-ah-81wJc\lib\site-packages\tqdm\std.py", line 1361, in refresh
    self.display()
  File "C:\Users\bramv\.virtualenvs\transformers-finetuner-ah-81wJc\lib\site-packages\tqdm\std.py", line 1509, in display
    self.sp(self.__str__() if msg is None else msg)
  File "C:\Users\bramv\.virtualenvs\transformers-finetuner-ah-81wJc\lib\site-packages\tqdm\std.py", line 350, in print_status
    fp_write('\r' + s + (' ' * max(last_len[0] - len_s, 0)))
  File "C:\Users\bramv\.virtualenvs\transformers-finetuner-ah-81wJc\lib\site-packages\tqdm\std.py", line 343, in fp_write
    fp.write(_unicode(s))
  File "C:\Users\bramv\.virtualenvs\transformers-finetuner-ah-81wJc\lib\site-packages\tqdm\utils.py", line 145, in inner
    return func(*args, **kwargs)
  File "C:\Users\bramv\.virtualenvs\transformers-finetuner-ah-81wJc\lib\site-packages\ray\tune\utils\util.py", line 228, in write
    self.stream2.write(*args, **kwargs)
  File "C:\Users\bramv\AppData\Local\Programs\Python\Python38\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u258f' in position 6: character maps to <undefined>

I find this odd because I thought these issues were fixed by PEP 528, which changed the Windows console encoding to UTF-8. I also see this in my interpreter:

>>> "\u258f"
'▏'

so that should just work. I am not sure why I get the error while the script is running.
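
One possible reading of the traceback: the failing write does not go to the console stream that the interpreter check exercises, but through `ray\tune\utils\util.py` into a stream whose encoder is `cp1252`. The sketch below only reproduces that difference with in-memory streams. The helper names `can_encode` and `write_through` are made up for illustration and are not part of tqdm or ray.

```python
import io

# U+258F LEFT ONE EIGHTH BLOCK, the character named in the error message.
BAR_CHAR = "\u258f"


def can_encode(text: str, encoding: str) -> bool:
    """Return True if `text` can be encoded with `encoding` (illustrative helper)."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def write_through(encoding: str, text: str) -> bytes:
    """Write `text` to an in-memory text stream that uses `encoding`.

    This stands in for a file-like stream such as the `stream2` seen in the
    traceback; it raises UnicodeEncodeError if the codec cannot represent
    the text.
    """
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding)
    stream.write(text)
    stream.flush()
    return raw.getvalue()


print("cp1252 can encode bar char:", can_encode(BAR_CHAR, "cp1252"))
print("utf-8 can encode bar char:", can_encode(BAR_CHAR, "utf-8"))

try:
    write_through("cp1252", BAR_CHAR)
except UnicodeEncodeError as exc:
    print("cp1252 stream failed:", exc)
```

Under this reading, the REPL result and the crash are not in conflict: they involve two different streams with two different encodings.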

  • I have marked all applicable categories:
    • exception-raising bug
    • visual output bug
  • I have visited the [source website], and in particular
    read the [known issues]
  • I have searched through the [issue tracker] for duplicates
  • I have mentioned version numbers, operating system and
    environment, where applicable: tqdm 4.64.0, Python 3.8.8 (tags/v3.8.8:024d805, Feb 19 2021, 13:18:16) [MSC v.1928 64 bit (AMD64)], win32

tlp19 commented Aug 8, 2022

I have a similar issue when running a GitHub workflow on a Windows Server machine (windows-latest).

It didn't happen before, but recently I started getting a similar error when I print a tqdm progress bar to the console.

Are there any updates/feedback on this?

@Prettyyyyyyyyy

This is not a tqdm issue. I encountered the same problem using transformers. The solution is to set the `PYTHONUTF8=1` environment variable, which forces UTF-8 encoding. See details here: https://peps.python.org/pep-0540/
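
UTF-8 Mode has to be switched on before the interpreter starts, so a running script can only detect it, not enable it. The sketch below shows that check, plus a fallback for cases where the environment cannot be changed: switching a stream's error handler so unencodable characters are replaced instead of raising. `ensure_encodable` is a hypothetical helper, not a tqdm or transformers API, and it assumes the stream is an `io.TextIOWrapper` (which has `reconfigure()` since Python 3.7).

```python
import io
import sys


def utf8_mode_enabled() -> bool:
    """True when Python was started with PYTHONUTF8=1 or -X utf8 (PEP 540)."""
    return bool(sys.flags.utf8_mode)


def ensure_encodable(stream, sample="\u258f"):
    """Hypothetical helper: make `stream` tolerate characters it cannot encode.

    If the stream's encoding cannot represent `sample`, the error handler is
    switched to 'replace', so such characters are written as '?'.
    """
    encoding = getattr(stream, "encoding", None) or "utf-8"
    try:
        sample.encode(encoding)
    except UnicodeEncodeError:
        if hasattr(stream, "reconfigure"):
            stream.reconfigure(errors="replace")
    return stream


print("UTF-8 Mode active:", utf8_mode_enabled())

# Demonstration on an in-memory stream that mimics a cp1252-encoded target.
raw = io.BytesIO()
legacy = ensure_encodable(io.TextIOWrapper(raw, encoding="cp1252"))
legacy.write("50%|\u258f|")
legacy.flush()
print(raw.getvalue())
```

The fallback trades a crash for a degraded progress bar. Starting Python in UTF-8 Mode, as suggested above, keeps the original characters instead.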
