img formatter renders text at wrong position (on Windows?) #1213

Closed
Anteru opened this issue Aug 31, 2019 · 5 comments · Fixed by #1611
Labels
A-formatting area: changes to formatters S-minor severity: minor T-bug type: a bug X-imported imported from Bitbucket
Milestone

Comments

Collaborator

Anteru commented Aug 31, 2019

(Original issue 1509 created by hhsprings on 2019-04-26T01:11:55.667733+00:00)

Note that I have reproduced this problem only on Windows (Japanese edition).

When the lexer is anything other than "text", every token after the first one on a line is drawn at the wrong position.

If you change the text passed to getsize in the "get_char_size" method of class FontManager from "M" to "a", tokens are drawn at the correct position, but I can't explain why. Apart from this mysterious fix, I think the current approach cannot work in the first place when the text contains kanji characters.

I think the conclusion is:

  • Don't compute the x position by multiplying a fixed character width by the character position.
  • Use the measured width of all text that precedes the token on the line.
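
The difference between the two bullet points can be illustrated with a minimal sketch. It does not use Pillow or the pygments source; `CELL`, `char_width`, `x_by_index` and `x_by_prefix` are hypothetical names, and the width model (East Asian wide characters take two cells) is only an assumed stand-in for a real font measurement.

```python
import unicodedata

# Assumed pixel width of one narrow character (stand-in for the width of "M").
CELL = 10


def char_width(ch):
    """Hypothetical width model: wide/fullwidth characters take two cells."""
    wide = unicodedata.east_asian_width(ch) in ("W", "F")
    return CELL * (2 if wide else 1)


def x_by_index(line, index):
    """Approach described in the report: character position times one fixed width."""
    return index * CELL


def x_by_prefix(line, index):
    """Proposed approach: summed width of all text before the token."""
    return sum(char_width(ch) for ch in line[:index])


line = 'print("こんにちは")'
closing_quote = line.rindex('"')  # start of the token that follows the kanji/kana

print("fixed width :", x_by_index(line, closing_quote))
print("prefix width:", x_by_prefix(line, closing_quote))
```

For a line of narrow characters only, both functions agree; they diverge as soon as a wide character precedes the token.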

used versions:

  • python 2.7, 3.5
  • pygments 2.3.1
  • pillow 4.2.1
  • Windows 7
@Anteru Anteru added T-bug type: a bug X-imported imported from Bitbucket S-minor severity: minor labels Aug 31, 2019
Collaborator Author

Anteru commented Aug 31, 2019

(Original comment by hhsprings on 2019-04-26T06:06:48.733712+00:00)

Sorry for attaching the modified source code directly. I couldn't work out how to open a PR with BitBucket + Mercurial.

@birkenfeld birkenfeld added the A-formatting area: changes to formatters label Nov 25, 2019
Contributor

15b3 commented Nov 26, 2020

I faced a similar problem.

I generated an image with ImageFormatter from this sample code containing Japanese characters:

# sample.py

print("Hello, World!")
print("こんにちは、世界!")

Output image:
sample_before
The position of the ") is wrong, and the image is too small.

As hhsprings suggests, this is probably because the start position of each token is calculated as the number of preceding characters multiplied by the width of the letter "M", which breaks down when character widths differ.
Here the text is Japanese, but I believe the same can happen with Chinese text as well (#1558).

This can be solved by measuring text with the getsize method of the PIL.ImageFont.ImageFont class and drawing each token at the position calculated from those measurements.

How about a fix like this one? ( 15b3/pygments@2bdcf50 )
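
As a rough illustration of the idea (not the code from the linked commit), the sketch below walks the tokens of each line, accumulates measured widths to get each token's start position, and derives the image width from the widest line. `layout_line`, `image_width` and `fake_measure` are hypothetical names; `fake_measure` is an assumed stand-in for a real font measurement so the example runs without Pillow.

```python
def layout_line(tokens, measure):
    """Return (start x of each token, total line width) for one line."""
    positions = []
    x = 0
    for text in tokens:
        positions.append(x)   # token starts where the previous text ended
        x += measure(text)    # advance by the measured width, not len(text)
    return positions, x


def image_width(lines, measure):
    """Width needed for the widest line, so wide text is not cut off."""
    return max((layout_line(tokens, measure)[1] for tokens in lines), default=0)


def fake_measure(text):
    """Assumed width model: CJK-range characters 20 px, everything else 10 px."""
    return sum(20 if ord(ch) > 0x2E7F else 10 for ch in text)


lines = [
    ["print", "(", '"Hello"', ")"],
    ["print", "(", '"こんにちは"', ")"],
]

for tokens in lines:
    print(layout_line(tokens, fake_measure))
print("image width:", image_width(lines, fake_measure))
```

With Pillow, a callable that measures text with the loaded font would be passed as `measure` instead of `fake_measure`; which measuring method is available depends on the Pillow version in use.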

the result image (fixed):
sample_after

Collaborator Author

Anteru commented Nov 26, 2020

That does sound reasonable -- can you please file a PR for it? Does this have significant drawbacks in terms of performance, as it needs one call per line now?

Contributor

15b3 commented Nov 26, 2020

Sure, I'll open a PR.

The calculation cost will be higher.
To be honest, I'm not sure by how much. (I'm not familiar enough with Pillow's internals.)

So I measured the time it took to convert 1000 lines of code on my iMac.

import time

from pygments import highlight
from pygments.lexers import PythonLexer
from pygments.formatters import ImageFormatter


if __name__ == "__main__":
    # large code preparation
    max_lines = 1000
    line_sample = 'print("あああ", "いいい", "ううう", "えええ", "おおお", "かかか", "ききき", "くくく")'

    code = "\n".join([line_sample for i in range(max_lines)])

    # start
    start = time.time()

    highlight(code, PythonLexer(), ImageFormatter(font_name="ipag"))

    # end print
    print(time.time() - start, "[sec]")

Result (twice each):

# on master branch
$ python3 time_large_sample.py 
4.8635358810424805 [sec]
$ python3 time_large_sample.py
4.772355079650879 [sec]

$ git checkout linelength 
Switched to branch 'linelength'
$ python3 time_large_sample.py
5.931611061096191 [sec]
$ python3 time_large_sample.py
5.878076791763306 [sec]

Collaborator Author

Anteru commented Nov 28, 2020

Thanks a lot! That's a 20% slowdown, but given I don't see a better option for handling this, and it's not a quick process to start with, let's go with that for now.
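
As a quick check of that figure, the relative slowdown can be computed from the four timings reported above. This is only arithmetic on the posted numbers; the variable names are illustrative.

```python
from statistics import mean

# Timings in seconds, copied from the two runs per branch reported above.
master = [4.8635358810424805, 4.772355079650879]
linelength = [5.931611061096191, 5.878076791763306]

# Relative slowdown of the linelength branch compared with master.
slowdown = mean(linelength) / mean(master) - 1
print(f"relative slowdown: {slowdown:.1%}")
```

The result lands slightly above the rounded 20%, which does not change the conclusion.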

@Anteru Anteru added the changelog-update Items which need to get mentioned in the changelog label Nov 28, 2020
@Anteru Anteru added this to the 2.7.3 milestone Nov 28, 2020
@Anteru Anteru removed the changelog-update Items which need to get mentioned in the changelog label Dec 5, 2020