BUG: Segmentation fault when calling pytorch function after np.exp (numpy 1.21.2) #21714
Comments
I suspect this is related to gh-20405, which could cause fairly random behavior. That issue is fixed in 1.22.0 and later. I am not quite sure how old that issue was. What complicates things is that a compiler bug was also involved. I will have to dig deeper, but it may be that the issue only "appeared" with a new GCC release, so at the time of the release all may have been fine, and now it is not because the
From the discussion in gh-20356, I suspect that the bug would only occur with gcc 10. I wonder what the best course of action is; this is also somewhat related to gh-21713. EDIT: Not sure in which gcc versions it appears, or when/whether it got fixed. Older ones probably did not optimize as aggressively and so did not show it. Maybe we should backport some of these at least as source-only, since channels like the nvidia one can then still pick them up or at least find them. EDIT: Nvm, the nvidia channel of course only has nvidia packages, so this would be from the default anaconda channel.
@kokamido I am not quite sure how best to proceed. Can you confirm that this is on a machine with a SkylakeX CPU? It would be nice to confirm that the specific patch works, but that requires compiling NumPy on an affected machine (I don't have a SkylakeX machine here). If it is important to you to get a 1.21.x release that is guaranteed to be fixed, maybe we need to open an Anaconda issue?
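For anyone who wants to answer the CPU question above, here is a minimal sketch that lists the AVX-512 flags the kernel reports. It assumes a Linux machine with `/proc/cpuinfo`, and it assumes that AVX-512 support is the relevant property of SkylakeX here; the thread does not state that explicitly. The helper names `avx512_flags` and `read_cpuinfo` are made up for this illustration.

```python
def avx512_flags(cpuinfo_text: str) -> set:
    """Return the AVX-512 feature flags found in /proc/cpuinfo-style text.

    Assumption: AVX-512 is the feature of interest; the thread only
    names CPU families (SkylakeX, Ice Lake, Tiger Lake).
    """
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Only the first "flags" line is inspected; all cores normally match.
        if sep and key.strip() == "flags":
            return {flag for flag in value.split() if flag.startswith("avx512")}
    return set()


def read_cpuinfo(path: str = "/proc/cpuinfo") -> str:
    """Read the cpuinfo file, returning an empty string if it is unavailable."""
    try:
        with open(path) as fh:
            return fh.read()
    except OSError:
        return ""


if __name__ == "__main__":
    flags = avx512_flags(read_cpuinfo())
    print(sorted(flags) if flags else "no AVX-512 flags reported")
```

An empty result only means the kernel reports no `avx512*` flags (or the file could not be read); it does not by itself prove the machine is unaffected.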
In my tests, my repro triggers on both an Intel Xeon Gold 5320T (which is Ice Lake) and an Intel Core i7-11800H (which is Tiger Lake). It does not reproduce with 1.21.3 or 1.21.6 from Anaconda (I haven't tested 1.21.4 and 1.21.5).
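To summarize what has been reported so far, here is a small sketch that maps a NumPy version string to its status according to this thread. It encodes only the reports above (1.21.2 crashes, 1.21.3 and 1.21.6 do not reproduce, the suspected cause gh-20405 is fixed in 1.22.0 and later) and is not an authoritative list of affected builds. The function names are hypothetical.

```python
def parse_release(version: str) -> tuple:
    """Return (major, minor, patch) from a version string such as '1.21.2'.

    Trailing non-digit suffixes (for example 'rc1') are ignored and
    missing components are filled with zeros.
    """
    parts = []
    for piece in version.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def thread_status(version: str) -> str:
    """Classify a NumPy version using only the reports in this thread."""
    release = parse_release(version)
    if release == (1, 21, 2):
        return "reported to crash"
    if release in {(1, 21, 3), (1, 21, 6)}:
        return "reported not to reproduce"
    if release >= (1, 22, 0):
        return "includes the gh-20405 fix (suspected cause)"
    return "not covered in this thread"
```

With NumPy installed, `thread_status(numpy.__version__)` gives the status for the current environment. Note that the result depends on how the binary was built, so a version number alone cannot confirm or rule out the crash.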
Describe the issue:
Hi! There is an issue involving numpy and pytorch. I can't reproduce it with numpy 1.21.3, but it is present in 1.21.2. If I run the provided code example with SIZE=15, both print calls (they are exactly the same) print True. If I run it with SIZE=20, the first print displays True but the second crashes with a segmentation fault. If I run it with SIZE=1000, it displays True and then False. If I remove the np.exp call, the code prints True True for any positive integer SIZE.
This behavior can be reproduced in the following docker container:
Reproduce the code example:
Error message:
No response
NumPy/Python version information:
numpy==1.21.2
pytorch==1.10.1