New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: ndimage: log(n) implementation for 1D rank filter #20543
base: main
Are you sure you want to change the base?
Conversation
I see many failed tests due to issues that seem to be not related to the current pull request. Is there anything I can/should do about it? |
If test failures are really unrelated, you can simply ignore them. However, I had a quick look and all the ones I saw do look related (e.g. |
You are right. I will correct it and resubmit. |
Thanks. Just to make sure: you can push new commits to the same branch that you opened this PR from, and they will show up here. No need to open a second PR. |
I had a quick look at the code, and the heavy use of templating and fused types jumped out to me. It results in a too-large size of the new extension module, it's immediately almost the largest one in the
Could this be limited to a subset of types? What does
The license of this code looks a little problematic. It has a broken link to something that (from the |
The performance improvement looks promising! If the scaling changed from |
The vertical axis is the kernel size but the improvement for the median filter is better than for large/small ranks. I wanted to demonstrate both of the aspects. In general, I understand that you are going to perform independent testing so this was for initial illustration only. Thanks for the comment anyway. I will consider it in the future. |
I am sorry, it was 3am and my brain was totally crashed :) |
cd87fc7
to
3cb7397
Compare
Concerning the license, the code contained in the referred link is also contained in Stackoverlow. Does it solve the problem? |
Unfortunately not. From http://scipy.github.io/devdocs/dev/core-dev/index.html#licensing: "For instance, code published on StackOverflow is covered by a CC-BY-SA license, which is not compatible due to the share-alike clause. These contributions cannot be accepted for inclusion in SciPy unless the original code author is willing to (re)license his/her code under the modified BSD (or compatible) license" |
I have reviewed this comment. In case I use the same Cython file, eliminating all the possible overhead by defining all variables' type in the file (all the functions) and defining the template in the cpp file as a function. In such a case, the size reduces to 190K. Currently, I reduced the usage of the fused types to the main types only (int64/float32/double), the rest of the types are casted to those types (leading to some instability in performance but keeping solid improvement over the current implementation). Two questions:
Also, I know that you are pretty busy and those are basic questions. Anyone else that can support me here or should I go solo/stackoverflow? Maybe I should be using other platform/try to involve other people in this PR? |
They aren't. Related things are under rather active discussion (see #20334 (comment) for a sample), and the jury is still out on what's the recommended way to wrap C/C++ kernels going forward. Up until very recently the standing recommendation was cython, just like you did. Maybe you are the perfect person to move this forward? So I think you've at least two options, and none of them pretty :-).
|
This is the thing we do pretty much everywhere, and is pretty standard and necessary. Cython adds overhead indeed, but not templating over a large set of types in C++ is . I tested again, on Linux x86-64 this time:
Turning off release mode (our default
263 kb is still surprisingly large after removal of most of the fused types usage. I'm actually a little puzzled by what is happening there. I did a quick test doubling the number of types under Changing Making
|
Hi,
A direct (GCC with flags) compilation produces smaller PS: any comments about the code (or something else) are more than welcome 🙏 @ev-br @rgommers or anyone else who reads it |
7729ee0
to
8651e0b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the detailed experiments and explanation @ggkogan! Overall your conclusions sounds about right to me.
An experienced user of pybind11 indicated it creates smaller files but it is slower.
I think they meant function call overhead. Which is true indeed - but only relevant for small arrays, for large arrays (which is what we should mostly care about) that doesn't really matter.
Looks like this is down to 21 kb for a release build on Linux (and 87 kb with the python dev.py build
defaults):
$ ls -lh build/scipy/ndimage/_rank_filter_1d.cpython-311-x86_64-linux-gnu.so
-rwxr-xr-x 1 rgommers rgommers 21K 3 mei 14:44 build/scipy/ndimage/_rank_filter_1d.cpython-311-x86_64-linux-gnu.so
That's ~12x smaller than the Cython version, which is great.
(I have seen that Cython appears in .so file name therefore I assume we are using it as the compilation engine
Not Cython, but CPython: cpython-311-x86_64-linux-gnu.so
. The file extension contains info relevant to the ABI used.
I assume that the addition in the file size is another Cython artifact - as far as I see, no boost in performance is seen.
No, just depends on the optimization level etc. Release builds are -O3
, dev.py
builds are -O2 -g
.
py3.extension_module('_rank_filter_1d', | ||
'src/_rank_filter_1d.cpp', | ||
install: true, | ||
dependencies: np_dep, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing a line here that is present for all other extension modules:
link_args: version_link_args,
could you please add that?
scipy/ndimage/_filters.py
Outdated
origins) | ||
if input.ndim == 1: | ||
rank = int(rank) | ||
origin = int(origin) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These should already be integers. Why was this needed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My assumption was similar to yours but I have added it after failing some unit-test. This was back when I used Cython as a glue and it marked that my input is np.int64 rather than int. I did not want to look for the origin of the problem as I did not want to modify anything unrelated to this pull request. I hoped someone would pay attention and it would be sufficiently visible to be corrected in future.
Anyway, I do not see it now (I assume that Numpy-C API is not sensitive to it) and therefore I removed the casting.
scipy/ndimage/_filters.py
Outdated
if mode == 6: | ||
mode = 4 | ||
if mode == 5: | ||
mode = 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct indeed, as explained in the user guide: http://scipy.github.io/devdocs/tutorial/ndimage.html. It looks a bit out of place to do this here though. Not sure why it could not be done inside _extend_mode_to_code
if the modes are actually identical.
scipy/ndimage/_filters.py
Outdated
mode = 4 | ||
if mode == 5: | ||
mode = 1 | ||
casting_cond = input.dtype.name not in ['int64', 'float64', 'float32'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prefer using the actual dtypes: input.dtype not in (np.int64, np.float64, np.float32)
scipy/ndimage/_filters.py
Outdated
casting_cond = input.dtype.name not in ['int64', 'float64', 'float32'] | ||
if casting_cond: | ||
x = input.astype('int64') | ||
x_out = np.empty_like(x) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks to me like this should be using np.result_type
to determine what the correct output type is. E.g., for other floating-point dtypes, float32
or float64
will be expected, not int64
.
I optimized by restriction of cases and proper initialization, | ||
also adapted for rank filter rather than the original median filter. | ||
Allowed different boundary conditions. | ||
Moved to C++ for polymorphism and added C-Numpy API. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you remove the comments on evolution of the code, and copy exactly the copyright comment that was in the original file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it ok with you? I would not insist on the second line but some major changes have been made in did.
//Copyright (c) 2011 ashelly.myopenid.com under http://www.opensource.org/licenses/mit-license
//Modified in 2024 by Gideon Genadi Kogan
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that is fine with me.
Allowed different boundary conditions. | ||
Moved to C++ for polymorphism and added C-Numpy API. | ||
*/ | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Style comment: could you please run clang-format
over this file?
//creates new Mediator: to calculate `nItems` running rank. | ||
Mediator* MediatorNew(int nItems, int rank) | ||
{ | ||
Mediator* m = (Mediator*)malloc(sizeof(Mediator)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good to use new
instead of malloc
, since this is C++ code now.
for manual compilation, I did not see a file size change... |
@rgommers I have made all the required modifications, hopefully fitting the expectations :) |
Reference issue
Closes gh-20026
What does this implement/fix?
reducing the 1D rank filter run time from complexity from n to log n.
this is used in multiple functions:
https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.medfilt.html
https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.median_filter.html
https://docs.scipy.org/doc/scipy/reference/generated/scipy.ndimage.rank_filter.html
https://docs.scipy.org/doc/scipy-1.12.0/reference/generated/scipy.ndimage.percentile_filter.html