Avoid unstable nature of qsort in Quant.c #5367

radarhere · 2021-03-30T22:17:32Z

Resolves #5263

https://en.wikipedia.org/wiki/Quicksort

Efficient implementations of Quicksort are not a stable sort, meaning that the relative order of equal sort items is not preserved.

In other words, if qsort is given two equal items to sort, it doesn't have to return them in the same order on one platform as on another. This has led to #5263.

#5264 solves this by adding a custom stable sort instead of qsort.

This PR is an alternative (and feel free to favour #5264 instead if you so choose). It instead keeps qsort but ensures that no two items are equal, by adding an index to the items, so that even if one dimension is equal, the second one will not be. It also reverts #5363 in the process.
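A minimal sketch of the tie-breaking idea described above, not the code in this PR. The struct layout (a plain `uint32_t` key plus an `index` field) and the names `compare_distance_with_index` and `sort_with_index` are assumptions made for illustration; the actual fields in Quant.c may differ.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical layout: a sort key plus the element's original position. */
typedef struct {
    uint32_t distance;
    uint32_t index;
} DistanceWithIndex;

/* Compare by distance first, then by index. Every index is unique, so
 * two distinct elements never compare equal and qsort() has no freedom
 * in how it orders them. */
static int
compare_distance_with_index(const void *a, const void *b) {
    const DistanceWithIndex *x = a;
    const DistanceWithIndex *y = b;
    if (x->distance != y->distance) {
        return x->distance < y->distance ? -1 : 1;
    }
    return (x->index > y->index) - (x->index < y->index);
}

/* Tag each key with its position, then sort the tagged copies. */
static void
sort_with_index(const uint32_t *keys, DistanceWithIndex *out, uint32_t n) {
    assert(keys != NULL && out != NULL);
    for (uint32_t i = 0; i < n; i++) {
        out[i].distance = keys[i];
        out[i].index = i;
    }
    qsort(out, n, sizeof(DistanceWithIndex), compare_distance_with_index);
}
```

With this comparator, equal keys always come out in ascending index order, whichever qsort implementation the platform provides.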

Both PRs have the same code added to the test suite.

raygard · 2021-03-31T15:31:45Z

I have looked over the code changes to Quant.c. Looks correct to me. I have made sure it builds in Windows. I assume it's been built in Linux and assume it has been tested.

A small quibble: the Shell sort in #5264 is not stable. The object was to get the same sort order results on any platform, and incorporating the sort in Quant.c accomplishes that (assuming a deterministic sort!). Making the sort stable also makes it consistent, so that also works.
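To illustrate the distinction between "stable" and "deterministic", here is a generic Shell sort. It is not the implementation from #5264; the `Item` type and the simple n/2, n/4, ..., 1 gap sequence are assumptions chosen to keep the example short. It can reorder equal keys, yet it gives the same result wherever it is compiled because the algorithm is fixed in the source.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int key;
    int tag; /* original position; never used for comparison */
} Item;

/* Shell sort with gaps n/2, n/4, ..., 1 (an illustrative gap sequence). */
static void
shell_sort(Item *a, size_t n) {
    assert(a != NULL || n == 0);
    for (size_t gap = n / 2; gap > 0; gap /= 2) {
        /* Gapped insertion sort: elements that are `gap` apart are ordered. */
        for (size_t i = gap; i < n; i++) {
            Item tmp = a[i];
            size_t j = i;
            while (j >= gap && a[j - gap].key > tmp.key) {
                a[j] = a[j - gap];
                j -= gap;
            }
            a[j] = tmp;
        }
    }
}
```

Sorting the keys 3, 3, 1, 4 (tags 0 to 3) leaves the two 3s in swapped order, tags 1 then 0: the gap-2 pass moves the first 3 past the second one. The outcome is unstable but identical on every platform.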

The Shell sort is only a little slower than Quicksort until the array gets fairly large (and can be faster, e.g. if the data are already nearly in order). Generally the compare function contributes heavily, so a more complex compare (as here) can slow things down. I have not tried to profile or time these two approaches so cannot say which will be faster in practice.

I'm the new guy here and @radarhere is a longtime member, so I feel I have to tread lightly. But I am curious as to the motivation. This seems to add more complexity than just using an embedded sort, and you've gone to rather a lot of effort to do it. Is there a concern about the performance, or maybe about the correctness of my Shell sort?

This approach is a sort of use of the decorate-sort-undecorate idiom. You need to allocate a new array of structs. I don't know how much memory this might take in the worst case, but based on the if (nPaletteEntries > UINT32_MAX / nPaletteEntries) { check on lines 1182 and 1389, it appears there can be up to 2**16 palette entries, so up to UINT32_MAX elements of DistanceWithIndex allocated. That's about 32GB! I don't know what other limits there are on nPaletteEntries so that may not be a problem in practice.
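The arithmetic behind the "about 32GB" estimate can be checked with a short sketch. It assumes a hypothetical 8-byte `DistanceWithIndex` (two `uint32_t` fields, no padding) and roughly 2**16 palette entries; the helper names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout; the real struct in Quant.c may differ. */
typedef struct {
    uint32_t distance;
    uint32_t index;
} DistanceWithIndex;

/* Bytes needed to decorate the whole n-by-n table at once. */
static uint64_t
bytes_for_full_table(uint64_t n) {
    /* The estimate relies on the struct being exactly 8 bytes. */
    assert(sizeof(DistanceWithIndex) == 8);
    return n * n * sizeof(DistanceWithIndex);
}

/* Bytes needed to decorate a single row of n entries. */
static uint64_t
bytes_for_one_row(uint64_t n) {
    return n * sizeof(DistanceWithIndex);
}
```

For n = 65536 the full table needs 2**35 bytes (32 GiB), while a single row needs only 512 KiB.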

But I see that there are only nEntries elements sorted at a time, and that you "undecorate" those as you sort each batch, so I suspect there is a way to only "decorate" nEntries at a time as well, just before each call to qsort(), so you'd need a lot less of dwi.

I also don't know if the added time to "decorate" and "undecorate" the DistanceWithIndex elements is of any significance. Probably minuscule compared with the rest of the work in quantization.

The name _sort_ulong_ptr_keys() is now no longer appropriate for the compare function.

Your construction of the DistanceWithIndex elements uses compound literals, which I think came in C99. Aside from // -style comments, are there other non-C90 features used in PIL? Does it matter?

You've not made quantize() and quantize2() static, as I had, but maybe that's best. They are not currently called outside Quant.c but maybe they could be in the future.

I'm sure you'll choose one or the other of these approaches, and I am fine with either, but if you go with keeping the system qsort(), I think it would be good to see if you can maybe build the sort keys inside the loop that contains the call to qsort() and need only nEntries elements of dwi.

raygard · 2021-03-31T18:31:12Z

I have definitely not thought this through, but would this suffice? Build avgDistSortKey as was done originally in build_distance_tables(), then:

    for (i = 0; i < nEntries; i++) {
        for (j = 0; j < nEntries; j++) {
            dwi[j] = (DistanceWithIndex){
                avgDistSortKey[i * nEntries + j],
                j
            };
        }    // ; -- removed superfluous semicolon! rdg 20210402
        qsort(
            dwi,
            nEntries,
            sizeof(DistanceWithIndex),
            _sort_ulong_ptr_keys);
        for (j = 0; j < nEntries; j++) {
            avgDistSortKey[i * nEntries + j] = dwi[j].distance;
        }
    }

Only need nEntries elements in dwi. This does build in Windows and seems to work in a quick test.
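For trying the idea outside Quant.c, a self-contained variant of the loop above might look like the following. It is a sketch under assumptions: the keys are plain `uint32_t` values, the struct layout and the names `compare_dwi` and `sort_rows` are hypothetical, and the sorted column order is written to a separate `order` array so that the effect of the tie-break is visible.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    uint32_t distance;
    uint32_t index;
} DistanceWithIndex;

static int
compare_dwi(const void *a, const void *b) {
    const DistanceWithIndex *x = a;
    const DistanceWithIndex *y = b;
    if (x->distance != y->distance) {
        return x->distance < y->distance ? -1 : 1;
    }
    return (x->index > y->index) - (x->index < y->index);
}

/* For each row of an n-by-n table, store the column indices in sorted
 * order. One n-element scratch buffer is reused for every row.
 * Returns 0 on success, -1 if the buffer cannot be allocated. */
static int
sort_rows(const uint32_t *table, uint32_t *order, uint32_t n) {
    assert(table != NULL && order != NULL);
    if (n == 0) {
        return 0;
    }
    DistanceWithIndex *dwi = malloc((size_t)n * sizeof(*dwi));
    if (dwi == NULL) {
        return -1;
    }
    for (uint32_t i = 0; i < n; i++) {
        const uint32_t *row = table + (size_t)i * n;
        for (uint32_t j = 0; j < n; j++) { /* decorate */
            dwi[j].distance = row[j];
            dwi[j].index = j;
        }
        qsort(dwi, n, sizeof(*dwi), compare_dwi);
        for (uint32_t j = 0; j < n; j++) { /* undecorate */
            order[(size_t)i * n + j] = dwi[j].index;
        }
    }
    free(dwi);
    return 0;
}
```

The scratch buffer grows linearly with the number of entries rather than quadratically, which is the memory saving suggested above.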

radarhere · 2021-04-01T03:44:09Z

> You've not made quantize() and quantize2() static, as I had, but maybe that's best. They are not currently called outside Quant.c but maybe they could be in the future.

I've created #5374, to separate out that change from this discussion.

This reverts commit a4a38b8.

radarhere · 2021-04-02T10:12:25Z

My hope with this PR is that it adds less complexity: the nuances of a sort function seem more complicated to me than temporarily replacing a single variable with a struct. If you count it up, fewer lines are modified. And if our compare function is leaving room for interpretation, then I'd rather clarify that to address what I think of as the root of the problem.

I can appreciate that your opinion is probably that inserting a custom sort function is simpler from a big picture perspective. I guess my other concern is that we're potentially re-inventing the wheel by doing so.

Don't feel you have to tread lightly (I mean, politeness is still good... you know what I mean). I don't have strong opinions here. If your PR is merged instead, then onto the next issue.

You make a good point about the memory allocation. I've pushed a commit for that and to rename the function.

raygard · 2021-04-02T18:13:44Z

@radarhere With your change to reduce memory (which reduces complexity as well), I have withdrawn my PR. Your newer version generates exactly the same results as my similar suggestion above; I tested. So I feel good about this version.

I do get your concerns about the sort function. My Shell sort was adapted for use as qsort() in the uClibc stdlib (https://git.uclibc.org/uClibc/tree/libc/stdlib/stdlib.c or https://cgit.uclibc-ng.org/cgi/cgit/uclibc-ng.git/tree/libc/stdlib/stdlib.c near line 700) and various other places, and has been used for over 30 years, so I'm pretty confident in it :). Use it wherever you need a compact qsort().

raygard · 2021-06-15T00:01:21Z

@radarhere, I would like to see this get merged in the upcoming release. Then I will not have to make my own Windows build anymore. Are there any blockers keeping this from getting merged?

radarhere · 2021-06-15T03:29:57Z

Just a case of waiting for someone else from the Pillow team to review this.

hugovk · 2021-06-20T09:38:54Z

Are there any discernible performance impacts from this, for example resizing large images?

radarhere · 2021-06-20T12:07:29Z

Using https://photojournal.jpl.nasa.gov/jpeg/PIA24472.jpg (large enough that it triggers a DecompressionBombWarning), and

    import timeit
    from PIL import Image

    def test():
        im = Image.open("PIA24472.jpg")
        im.resize((int(im.width/2), int(im.height/2)))
        im.close()

    print(timeit.timeit(test, number=100))

With this PR, I get 108.989, 108.575 and 108.075 seconds.
With master, I get 109.324, 109.241 and 107.471 seconds.

So no obvious differences in terms of speed.

hugovk · 2021-06-20T12:52:36Z

How about resizing to something small, say (10, 10)?

radarhere · 2021-06-20T13:10:14Z

On master, 62.347, 61.433 and 61.384 seconds.
With this PR, 62.190, 61.543 and 61.649 seconds.

So still no obvious difference.

hugovk · 2021-06-20T18:57:00Z

Thanks for checking!

radarhere mentioned this pull request Mar 30, 2021

Embed sort function in Quant.c to get same results in Windows and Linux #5264

Closed

radarhere mentioned this pull request Apr 1, 2021

Changed quantize and quantize2 to static #5374

Merged

radarhere added 2 commits April 2, 2021 04:07

Revert "Removed return value of build_distance_tables"

7387ec2

This reverts commit a4a38b8.

Added second attribute to avoid unstable nature of qsort

6541bd7

radarhere force-pushed the quant branch from 576b95a to 6541bd7 April 1, 2021 17:07

radarhere added 2 commits April 2, 2021 20:48

Reduced memory usage

6764650

Renamed function

a694300

radarhere added the Platform label Apr 9, 2021

Merge branch 'master' into quant

c5f8869

hugovk merged commit ec74f3b into python-pillow:master Jun 20, 2021

radarhere deleted the quant branch June 20, 2021 22:11

radarhere mentioned this pull request Jun 20, 2021

Resizing produces different results for Pillow 6.2.2 and Pillow 8.0.1 #5039

Closed
