Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix indices of colors read from GIMP Palette file #6640

Open
wants to merge 17 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 8 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
18 changes: 9 additions & 9 deletions Tests/images/custom_gimp_palette.gpl
@@ -1,12 +1,12 @@
GIMP Palette
Name: custompalette
Columns: 4
#
0 0 0 Index 3
65 38 30 Index 4
103 62 49 Index 6
79 73 72 Index 7
114 101 97 Index 8
208 127 100 Index 9
151 144 142 Index 10
221 207 199 Index 11
# Original written by David Wetz in https://stackoverflow.com/questions/815836/im-creating-a-program-that-generates-a-palette-from-a-true-color-image-need-hel/815855#815855
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just an explanation of what is happening here - this isn't a random attribution. You're creating a comment that is over 100 characters.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comments with >100 char lines long should not "break" the file. Even if we are reformatting here, the URL alone is wider than 100 chars in this case, so even breaking this in a functional way would still have lines wider than that.

0 0 0 Index 0
65 38 30 Index 1
103 62 49 Index 2
79 73 72 Index 3
114 101 97 Index 4
208 127 100 Index 5
151 144 142 Index 6
221 207 199 Index 7
39 changes: 39 additions & 0 deletions Tests/test_file_gimppalette.py
Expand Up @@ -20,6 +20,22 @@ def test_sanity():
GimpPaletteFile(fp)


def test_large_file_is_truncated():
import warnings

try:
original_value = GimpPaletteFile._max_file_size
GimpPaletteFile._max_file_size = 100
with warnings.catch_warnings():
warnings.simplefilter("error")
with pytest.raises(UserWarning):
with open("Tests/images/custom_gimp_palette.gpl", "rb") as fp:
GimpPaletteFile(fp)

finally:
GimpPaletteFile._max_file_size = original_value


def test_get_palette():
# Arrange
with open("Tests/images/custom_gimp_palette.gpl", "rb") as fp:
Expand All @@ -29,4 +45,27 @@ def test_get_palette():
palette, mode = palette_file.getpalette()

# Assert
expected_palette = b""
for color in (
(0, 0, 0),
(65, 38, 30),
(103, 62, 49),
(79, 73, 72),
(114, 101, 97),
(208, 127, 100),
(151, 144, 142),
(221, 207, 199),
):
expected_palette += bytes(color)
assert palette == expected_palette
assert mode == "RGB"


def test_n_colors():
# Arrange
with open("Tests/images/custom_gimp_palette.gpl", "rb") as fp:
palette_file = GimpPaletteFile(fp)

palette, _ = palette_file.getpalette()
assert len(palette) == 24
jsbueno marked this conversation as resolved.
Show resolved Hide resolved
assert palette_file.n_colors == 8
39 changes: 28 additions & 11 deletions src/PIL/GimpPaletteFile.py
Expand Up @@ -15,6 +15,7 @@
#

import re
import warnings

from ._binary import o8

Expand All @@ -24,33 +25,49 @@ class GimpPaletteFile:

rawmode = "RGB"

def __init__(self, fp):
#: override if reading larger palettes is needed
max_colors = 256
_max_line_size = 100
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're reading RGB data from each line. That's three integers. Under what circumstances would a user need to read more than 100 characters?

Copy link
Contributor Author

@jsbueno jsbueno Oct 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 4th Column contains the color name, though that is currently unused. And there might be a name with more than 100 characters - moving the number to an attribute just offers a chance of one working around an error imposed by an arbitrary, though reasonable, limit on an otherwise good file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(currently unused in PIL, that is: GIMP uses that name)

_max_file_size = 2**20

self.palette = [o8(i) * 3 for i in range(256)]
def __init__(self, fp):

if fp.readline()[:12] != b"GIMP Palette":
raise SyntaxError("not a GIMP palette file")

for i in range(256):
read = 0

s = fp.readline()
self.palette = b""
while len(self.palette) < 3 * self.max_colors:

s = fp.readline(self._max_file_size)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
s = fp.readline(self._max_file_size)
s = fp.readline(self._max_line_size)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a mistake, where you've accidentally used the file size limit for the line size limit for the size argument.

Copy link
Contributor Author

@jsbueno jsbueno Oct 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no - "file" size is deliberate here - to avoid DoS only.
line size checks are made just below, and as far as I know it is something just PIL has - there is no spec for this kind of file, but for GIMP source code, and there is no line size limit there.

Changing here for reading linesize would make the file break with long comments (as it did), and that makes no sense:

if one has a python project in which Palettes with this format are used, there is no reason to impose an arbitrary limit on comments on that source code. Resilience against malicious files when those are input by 3rd parties are mitigated through the file-size check.

if not s:
break

read += len(s)
if read >= self._max_file_size:
warnings.warn(
f"Palette file truncated at {self._max_file_size - len(s)} bytes"
)
break

# skip fields and comment lines
if re.match(rb"\w+:|#", s):
continue
if len(s) > 100:
if len(s) > self._max_line_size:
raise SyntaxError("bad palette file")

v = tuple(map(int, s.split()[:3]))
if len(v) != 3:
# 4th column is color name and may contain spaces.
v = s.split(None, 4)
jsbueno marked this conversation as resolved.
Show resolved Hide resolved
if len(v) < 3:
raise ValueError("bad palette entry")

self.palette[i] = o8(v[0]) + o8(v[1]) + o8(v[2])

self.palette = b"".join(self.palette)
for i in range(3):
self.palette += o8(int(v[i]))

def getpalette(self):

return self.palette, self.rawmode

@property
def n_colors(self):
return len(self.palette) / 3
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I've said, i don't think that n_colors is necessary. If the user is going to use the palette string, then they have enough of an understanding to be able to determine how many colors there are. This is a simple operation without an obvious need, so I'd like to avoid adding to the API for a trivial case.

But if you disagree, you can leave it here and see what others think.

Copy link
Contributor Author

@jsbueno jsbueno Oct 7, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will remove it. However, I definitely like Python things to be "pythonic". That include bring able to know about object states through properties, usingagic.methods so users can comfortably use operators and common interfaces to deal with their data.

In my experience these things make libraries more joyful to use and easier to maintain.

Pillow Is a project that tends to the opposite: a lot of functional things and magic global states. I think part of that is due to historic reasons than a philosophy of the project, and hope to offer more pythonic interfaces like I had indicated in the chat about animated images.

That said, I'd say these comfortable properties could be left for a revamp of the ImagePalette class at a future time, and there is no need for the new property here.

This will also give time for all involved to understand and have an opinion if this "first class classes" approach is desirable at all, and draw, projectwise, a more solid roadmap.

(As far as I am concerned wanting to know the number of colors of a palette is an operation common enough and with nothing of unexpected - akim to getting a strong length - and requiring one to deal with bytes when their problem is dealing with colors is a quasi definition of "unpythonic". However, if this class is to be instrumented,.maybe this could be offered as it's __len__ for once, instead of creating an arbitrarily named property.)