Added `image.entropy()` method #3530

fish2000 · 2019-01-02T22:11:51Z

This calculates the entropy for the image, based on the histogram.

Because this uses image histogram data directly, the existing C function underpinning the image.histogram() method was abstracted into a macro, and a new C function was added that uses this macro.

The new image.entropy() method is based on image.histogram(), and will accept the same arguments to calculate the histogram data it will use to assess the entropy of the image.

The algorithm and methodology is based on existing Python code:

https://github.com/fish2000/instakit/blob/44893861972b16e279ad0fe3f9dc1f84b39004b4/instakit/processors/squarecrop.py#L11-L25

... A test case in the Tests/ directory, and doctest lines in selftest.py, have both been added and checked.

Changes proposed in this pull request:

The addition of an image.entropy() method,
The abstraction of the prologue of the C function _histogram into a macro, and
The use of that macro in both _histogram and _entropy.
Minor documentation addenda in the docstrings for both image.entropy() and image.histogram()

src/PIL/Image.py

fish2000 · 2019-01-02T22:18:23Z

The tests are failing, due to float-comparison issues… I will rewrite the test code to deal with that and re-push.

fish2000 · 2019-01-02T22:38:01Z

I’m on Mac OS X, and I have no idea how to fix the Appveyor build scripts – can someone who knows windows help me out, or point me in the right direction?

Evidently it needs to link the C math library when linking the result of compiling _imaging.c in order to find the symbol for the _log2 function call.

This calculates the entropy for the image, based on the histogram. Because this uses image histogram data directly, the existing C function underpinning the `image.histogram()` method was abstracted into a macro, and a new C function was added that uses this macro. The new `image.entropy()` method is based on `image.histogram()`, and will accept the same arguments to calculate the histogram data it will use to assess the entropy of the image. The algorithm and methodology is based on existing Python code: https://git.io/fhmIU ... A test case in the `Tests/` directory, and doctest lines in `selftest.py`, have both been added and checked. Subsequent commits: * Using assertAlmostEqual() in entropy tests * Added description of `extrema` arguments. * Only test seven digits of float returned by im.entropy()

radarhere · 2019-01-02T23:42:31Z

I have created fish2000#1

Removed log2

homm · 2019-01-03T14:47:06Z

Any reason to implement it in C rather than in Python? I'm comparing with this implementation:

def entr(hist):
    from math import log
    fs = 1.0 / sum(hist)
    fentropy = 0.0
    for h in hist:
        if h != 0:
            h = h * fs
            fentropy += h * log(h, 2)
    return -fentropy

For 1024 × 640 image I get:

In: %timeit im.histogram()
100 loops, best of 3: 2.04 ms per loop

In: %timeit im.entropy()
100 loops, best of 3: 2.07 ms per loop

In: %timeit entr(im.histogram())
100 loops, best of 3: 2.31 ms per loop

The bigger image, the smaller the difference between Python and C implementation. Does it worth it?

src/_imaging.c

fish2000 · 2019-01-03T16:14:05Z

The bigger image, the smaller the difference between Python and C implementation. Does it worth it?

Short answer: yes, I believe so.

Many histogram-entropy functions are implemented in Python, using the PIL/Pillow histogram() method data – e.g.

The ever-popular easy-thumbnails Django module;
My own version, in my instakit package;
A popular snippet, freely available;
The reference implementation from SciPy;
An attempt to squeeze more speed from the same methodology, using the NumPy C-API;

One of PIL/Pillow’s primary uses is in web applications – processing images uploaded by users, avatar images, favicon.ico source material, minute changes to a minuscule images’ scale calculated on-the-fly, and so forth. In-memory, ad-hoc performance with smaller images is important in these cases.

Here’s the long answer: the case I would make for moving this algorithm into Pillow’s _imaging.c, where it can take full advantage of Pillow’s internal structures, is that the version I have submitted heap-allocates only one PyObject per invocation. Comparable Python implementations need to create at least one Python object for each entry in the histogram – 768 for an RGB-mode image – and add at least one frame to the interpreter stack, in the best possible scenario.

Using a C function, utilizing the Pillow internal structures (e.g. ImagingHistogramInstance) in an application where the function is iteratively called across 1-dimensional image slices, makes a big difference in the context of a web application, with their attendant expectations of efficient concurrency. That’s my take on it, at any rate, and my original motivation, basically. Yes!

fish2000 · 2019-01-25T17:49:43Z

This PR has been superseded by #3608, which uses @homm's static extrema-parsing function code instead of a prologue macro.

radarhere reviewed Jan 2, 2019

View reviewed changes

src/PIL/Image.py Show resolved Hide resolved

radarhere added the Enhancement label Jan 2, 2019

fish2000 force-pushed the image-entropy-method branch 2 times, most recently from 5300972 to a3b6bd4 Compare January 2, 2019 22:32

fish2000 force-pushed the image-entropy-method branch from a3b6bd4 to 4ce620c Compare January 2, 2019 23:11

Removed log2

2ec7484

radarhere mentioned this pull request Jan 2, 2019

Removed log2 fish2000/Pillow#1

Merged

Merge pull request #1 from radarhere/image-entropy-method

6cc7a99

Removed log2

homm reviewed Jan 3, 2019

View reviewed changes

src/_imaging.c Show resolved Hide resolved

fish2000 mentioned this pull request Jan 25, 2019

Added an image.entropy() method (second revision) #3608

Merged

fish2000 closed this Jan 25, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Added `image.entropy()` method #3530

Added `image.entropy()` method #3530

fish2000 commented Jan 2, 2019 •

edited by radarhere

fish2000 commented Jan 2, 2019

fish2000 commented Jan 2, 2019

radarhere commented Jan 2, 2019

homm commented Jan 3, 2019

fish2000 commented Jan 3, 2019 •

edited

fish2000 commented Jan 25, 2019

Added image.entropy() method #3530

Added image.entropy() method #3530

Conversation

fish2000 commented Jan 2, 2019 • edited by radarhere

fish2000 commented Jan 2, 2019

fish2000 commented Jan 2, 2019

radarhere commented Jan 2, 2019

homm commented Jan 3, 2019

fish2000 commented Jan 3, 2019 • edited

fish2000 commented Jan 25, 2019

Added `image.entropy()` method #3530

Added `image.entropy()` method #3530

fish2000 commented Jan 2, 2019 •

edited by radarhere

fish2000 commented Jan 3, 2019 •

edited