Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BLD: Do not use clang default to ignore floating point exceptions #18005

Closed
tomgoddard opened this issue Dec 16, 2020 · 29 comments · Fixed by #18030 or #19049
Closed

BLD: Do not use clang default to ignore floating point exceptions #18005

tomgoddard opened this issue Dec 16, 2020 · 29 comments · Fixed by #18030 or #19049
Labels
00 - Bug 26 - Compiler component: numpy.ufunc sprintable Issue fits the time-frame and setting of a sprint

Comments

@tomgoddard
Copy link

Remarkably numpy.log() gives a runtime overflow warning on a float32 array containing values larger than 1e9. Numpy 1.19.4, macOS 10.15.7, MacBookPro15,3. Warning does not appear for float64 array even for much larger values (e.g. 1e35).

Reproducing code example:

import numpy as np
a = np.array([1e9], np.float32)
np.log(a)

Error message:

RuntimeWarning: overflow encountered in log

NumPy/Python version information:

1.19.4 3.8.6 (default, Nov 20 2020, 18:29:40)
[Clang 12.0.0 (clang-1200.0.32.27)]

@seberg
Copy link
Member

seberg commented Dec 16, 2020

@tomgoddard, I cannot reproduce this on linux. Since you are on 1.19.x, could you print:

np.core._multiarray_umath.__cpu_features__

Just so we have the information?

The most likely cause will be that you have AVX512 instructions available and our AVX512 code has an error here. If it is in the AV512 path, @r-devulap could you have a look?

Of course it could also be a clang bug in the glibc version...

@tomgoddard
Copy link
Author

tomgoddard commented Dec 16, 2020 via email

@tomgoddard
Copy link
Author

Numpy 1.19.4 was installed using brew on macOS 10.15.7.

@seiko2plus
Copy link
Member

seiko2plus commented Dec 16, 2020

could you run np.log(np.array([1e9]*8, np.float32)) and check if the overflow error still remains? newest versions of clang committing aggressive optimization to partial load operations lead to undefined the vector tail on AVX2.

@r-devulap
Copy link
Member

I don't see this problem when using NumPy built with gcc 9.3 or clang 10.0. Makes me think it has to something to do with clang 12. Still figuring out how to install clang 12 on my Ubuntu to reproduce the error :/

@tomgoddard
Copy link
Author

tomgoddard commented Dec 16, 2020 via email

@tomgoddard
Copy link
Author

tomgoddard commented Dec 16, 2020 via email

@r-devulap
Copy link
Member

confirmed that this behavior is only with clang (clang 10 and clang 12), and does not happen with gcc. It decides to set an overflow flag for a blend instruction (which is weird and incorrect) x = _mm256_blendv_ps(x, temp, denormal_mask); here:

x = _mm256_blendv_ps(x, temp, denormal_mask);

@seberg
Copy link
Member

seberg commented Dec 17, 2020

@r-devulap thanks for digging this down, I am assuming it is a clang bug. Can we file it there? I am not sure there is a good way for us to work around it in NumPy to begin with...

@r-devulap
Copy link
Member

Trying to narrow down to a simple C program to replicate the bug, unsuccessful so far ..

@r-devulap
Copy link
Member

r-devulap commented Dec 19, 2020

Found the culprit! clang optimizes out a blend instruction which was precisely meant to avoid this reported issue of an overflow warning. While the compiler was being clever, it is still a bug.

PR #18030 has a simple work around for this issue. Let me know if that works.

EDIT: didn't mean to include the code snippet.

@charris charris added this to the 1.19.5 release milestone Dec 19, 2020
seberg pushed a commit that referenced this issue Dec 19, 2020
* BUG: make a variable volatile to work around clang compiler bug

* Adding comments for relevance

* TST: Adding test to check for no overflow warnings in log

Fixes #18005
charris pushed a commit to charris/numpy that referenced this issue Dec 19, 2020
…y#18030)

* BUG: make a variable volatile to work around clang compiler bug

* Adding comments for relevance

* TST: Adding test to check for no overflow warnings in log

Fixes numpy#18005
@matthew-brett
Copy link
Contributor

Just checking - is this bug in clang reported somewhere?

charris pushed a commit to charris/numpy that referenced this issue Dec 19, 2020
…y#18030)

* BUG: make a variable volatile to work around clang compiler bug

* Adding comments for relevance

* TST: Adding test to check for no overflow warnings in log

Fixes numpy#18005
@r-devulap
Copy link
Member

@matthew-brett not sure, I couldn't find anything relevant but I didn't spend too much time searching. I am planning on submitting a bug report.

@matthew-brett
Copy link
Contributor

@r-devulap - I'm sure that would be worthwhile, especially as the bug seems to have survived through at least two releases - right?

@r-devulap
Copy link
Member

You can play with different version of compilers here https://godbolt.org/z/Yajjcs and looks like it got introduced somewhere between 7.0 (which produces the right code) and 8.0 and is still there in 12.0.

@h-vetinari
Copy link
Contributor

@r-devulap
Thanks for your efforts on this! Could you please reference the clang ticket here once you open it (or let people know if you won't be pursuing it anymore)?

@r-devulap
Copy link
Member

r-devulap commented Dec 21, 2020

So, it turns out this is expected behavior in clang. Clang has a flag to force strict floating point exception compliance by setting -ffp-exception-behavior=strict. The default, however, is set to ignore which means:

The compiler assumes that the exception status flags will not be read and that floating point exceptions will be masked

No wonder I have had trouble with clang before too. The docs has a section on it which you can find by searching for ffp-exception-behavior here: https://clang.llvm.org/docs/UsersManual.html

@mattip
Copy link
Member

mattip commented Dec 21, 2020

Should we add that flag if we are on clang?

@seberg
Copy link
Member

seberg commented Dec 21, 2020

Sounds like we definitely have to. I wonder if we worked around other clang optimizations that could have been avoided with that flag. Probably just me, but I don't like compilers defaulting to fast-math or this kind of thing...

@r-devulap
Copy link
Member

I would think so. Clang would end up disabling some floating point optimizations which will have some negative performance impact. Might be minimal though and probably necessary going forward to ensure FP exception compliance.

@r-devulap
Copy link
Member

Probably just me, but I don't like compilers defaulting to fast-math or this kind of thing...

Fortunately clang doesn't default to fast-math. But yes, I don't like this either.

@seberg seberg reopened this Dec 21, 2020
@seberg seberg changed the title log(1e9) gives RuntimeWarning: overflow encountered in log BLD: Do not use clang default to ignore floating point exceptions Dec 21, 2020
@seberg
Copy link
Member

seberg commented Dec 21, 2020

Reopened and changed title. We should probably have a quick look around for clang workaround, but I don't think we need to backport getting rid of those.

@r-devulap
Copy link
Member

This was the other time clang give me an incorrect FP flag: #13586 (comment) and the associated PR #13623

@seberg
Copy link
Member

seberg commented Jan 25, 2021

Hmmm, I thought I would look into this. But I am honestly not sure where we should be adding this flag, or if we actually should (additionally) even prod Python itself to add this flag. Am I right to think that most clang specific flags currently are inherited from Python itself?

@charris
Copy link
Member

charris commented Jan 29, 2021

What is the status of this?

@seberg
Copy link
Member

seberg commented Jan 29, 2021

I think this is important, but I am not very familiar with distutils, and am not sure where this flag should be inserted correctly. (It probably is also correct for SciPy, but do we want to set it by default for scipy as well?).
I am happy to check for the related code cleanups, but not too fond of digging through distutils to find the right place.

It shouldn't matter too much to push it off to 1.21 all known issues due to this are fixed after all. While the flag change is something we could do, the (probably) related code cleanup we do not want to backport anyway probably.

@seberg
Copy link
Member

seberg commented Mar 18, 2021

@rgommers I somewhat assume you just know where to put this. Can you point where we would add a clang specific compiler flag to distutils (maybe NumPy specific, although this flag is really also necessary for scipy, I bet).

@rgommers
Copy link
Member

In CCompiler_customize: https://github.com/numpy/numpy/blob/main/numpy/distutils/ccompiler.py#L366

maybe NumPy specific, although this flag is really also necessary for scipy, I bet

It's hard to make it NumPy-specific, and I think other libraries should get this flag too, so I'd say just add it unconditionally.

@seiko2plus
Copy link
Member

@seberb,

Can you point where we would add a clang specific compiler flag to distutils?

for a quickfix, we can link the flag with one of baseline features flags, etc SSE on x86.`

on_x86 = self.cc_on_x86 or self.cc_on_x64
is_unix = self.cc_is_gcc or self.cc_is_clang
if on_x86 and is_unix: return dict(
SSE = dict(flags="-msse"),
SSE2 = dict(flags="-msse2"),

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
00 - Bug 26 - Compiler component: numpy.ufunc sprintable Issue fits the time-frame and setting of a sprint
Projects
None yet
9 participants