Use explicit memcpy() to avoid unaligned memory accesses #3225
Personally, I don't like it because, while compiler will likely optimize Do you have examples of such fixes in other widespread libraries?
Is Python common enough? For fixed-size copies you have to try very hard to keep the compiler from optimizing this on e.g. x86; memcpy is one of the most commonly used intrinsics.
What is the way forward? I bet this needs updating, but I'm only going to do this if it would be accepted afterwards.
@homm Any comments?
I would rebase it if there is a chance that this gets merged in time; otherwise it is too much work for nothing.
There have been no objections from @homm or anyone else in a year, so let's merge this for Monday's release, if possible. Thank you!
This replaces trivial instances where a copy from one pointer to the other involves no further calculations or casts. The compiler will optimize this to whatever the platform offers.
@@ -480,86 +479,78 @@ void
ImagingUnpackRGB(UINT8* _out, const UINT8* in, int pixels)
{
int i = 0;
#ifdef __sparc
/* SPARC CPUs cannot read integers from nonaligned addresses. */
Are the removals of these exceptions for SPARC intentional?
They were only added recently, in #3858.
Yes. With memcpy, the compiler takes care of the correct access pattern.
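To illustrate why a platform-specific branch becomes unnecessary, here is a rough sketch of an RGB unpacking loop written with memcpy. It is not the code from this PR: `unpack_rgb_sketch` is a hypothetical name, `uint8_t`/`uint32_t` stand in for the library's own typedefs, and setting the fourth byte to 255 is an assumption made for the example.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: expand packed RGB triplets into 4-byte RGBX pixels. */
static void
unpack_rgb_sketch(uint8_t *out, const uint8_t *in, int pixels)
{
    int i = 0;
    /* Copy 4 bytes at a time while a 4-byte read stays inside the input. */
    for (; i < pixels - 1; i++) {
        uint32_t v;
        memcpy(&v, in, sizeof(v));  /* in is only byte-aligned (3-byte stride) */
        memcpy(out, &v, sizeof(v));
        out[3] = 255;               /* replace the byte borrowed from the next pixel */
        in += 3;
        out += 4;
    }
    /* Last pixel: only 3 input bytes remain, so copy them one by one. */
    for (; i < pixels; i++) {
        out[0] = in[0];
        out[1] = in[1];
        out[2] = in[2];
        out[3] = 255;
        in += 3;
        out += 4;
    }
}
```

The same source compiles for every target, so no `#ifdef __sparc` path is needed: the compiler chooses between a single unaligned load and a bytewise copy.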
OK, thanks.
cc @kulikjak fyi.
I have tested this on my sparc machine and I see some test errors, which I suspect are endianness issues. I guess this is still better than crashes. I'll probably need a few days to look into this.
Ok, I was even faster than expected. When I compare the test results of my patch on sparc with current git master on hppa I have only these additional failures: Tests/test_imagefont.py::TestImageFont::test_variation_get FAILED Please decide whether they are likely caused by this patch or not, I have no clue.
I have tested it on my hppa machine (also big endian) with and without this patch, and the font errors are also present there in both runs. So I don't see any regressions with that patch.
Thanks for testing. As no regressions were found, let's merge it.
Fixes #3213.
This replaces casts of incoming memory with explicit memcpy(). Some platforms, especially SPARC, but also ARM and Itanium, do not like unaligned memory accesses and may suffer slowdowns or crashes if the incoming memory is not aligned to the expected access type. Using memcpy() avoids this: the compiler will optimize it to direct unaligned accesses on platforms that support those operations, and will do whatever copy operation is needed on platforms that do not.
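A minimal sketch of the pattern, assuming a 32-bit access type. `load_u32` and `store_u32` are hypothetical helper names used only for illustration and do not appear in the patch; `uint32_t` stands in for the library's own integer typedef.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 32-bit value from a possibly unaligned address. */
static uint32_t
load_u32(const uint8_t *p)
{
    uint32_t v;
    /* Instead of: v = *(const uint32_t *)p;  which may fault if p is unaligned. */
    memcpy(&v, p, sizeof(v));
    return v;
}

/* Hypothetical helper: write a 32-bit value to a possibly unaligned address. */
static void
store_u32(uint8_t *p, uint32_t v)
{
    /* Instead of: *(uint32_t *)p = v; */
    memcpy(p, &v, sizeof(v));
}
```

Because the copy size is a compile-time constant, optimizing compilers typically turn each call into a single load or store where the target allows unaligned access. The value keeps the host byte order, exactly as the cast would have.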