Stack overflow caused by clumsy array coercion #17785
If I run |
For what it's worth, on numpy master it gives a I think on 1.19.x the same recursion error should be happening, but as per comment added in gh-14745 (which did not change the behaviour). We ignore errors during dimension discovery, which ends up using 32 here, which then apparently ends up as a bomb later? If I set This all seems strange, and I still didn't quite follow why it means you get a stack overflow (I was only able to reproduce memory eating/hanging and a proper |
To be clear - on 1.19.4 with my code unmodified, do you get a recursion error or a stack overflow? My result is on windows, and the issue I link to was OSX |
1.19.4 with your code unmodified I get hanging behaviour. Not sure it is something that I did, but my recursion limit is 1000 by default. There may be a sweet-spot for the recursion limit where it gets the stack overflow? Or the stack-overflow issue for some reason doesn't kick in on my system. I have a transition from RecursionError to hanging at |
@eric-wieser OK, I think I will try making the dimensions discovery use the logic:
instead and hope that fixes it. Probably CI will be able to reproduce the issue, so we should just add this as a test to master as well and see if it indeed raises an error correctly on all systems now. |
Any reason for using the more general |
No, just thought that RuntimeErrors are generally bad enough that we should abort. But this is for the 1.19.x branch, so should probably just do the minimal thing. Also, if we go there catching e.g. See gh-17786 for a hopeful fix. We should forward-port the test before closing this. |
So, my guess is that this is similar to:

    class A():
        def __new__(cls, arg):
            try:
                return A([arg])
            except:
                return A(arg)

    A(None)

which on my system also kills python. Albeit with fatal recursion error in Python itself and not a stack overflow on the OS level as happens in the NumPy issue.
If someone can explain to me why this goes well beyond the original recursion limit (by 50), I might investigate more... Until then... I will change the PR against master to allow success (which is probably fine anyway), and can modify the other PR to not include the test and maybe see what is really necessary to catch most of it, but I doubt it's easy to get it watertight without some deeper understanding of why it explodes as badly as it does. |
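One possible contributing factor to the hanging behaviour reported earlier in the thread, offered as a simplified model and not as an analysis of CPython or NumPy internals: when every frame catches the "too deep" error and retries, each level makes two doomed calls instead of one, so the total number of calls grows exponentially with the depth limit. The sketch below uses an explicit depth counter and a hypothetical `DepthExceeded` exception in place of the real recursion limit, so it is safe to run.

```python
class DepthExceeded(Exception):
    """Hypothetical stand-in for RecursionError in this model."""


def count_calls(limit):
    """Count calls made when every frame catches the error and retries."""
    calls = 0

    def recurse(depth):
        nonlocal calls
        calls += 1
        if depth >= limit:
            raise DepthExceeded
        try:
            recurse(depth + 1)   # first attempt, always fails eventually
        except DepthExceeded:
            recurse(depth + 1)   # retry, mirrors `except: return A(arg)`

    try:
        recurse(0)
    except DepthExceeded:
        pass
    return calls


for limit in (3, 10, 15):
    print(limit, count_calls(limit))
```

In this model the call count is 2**(limit + 1) - 1, so a limit of 1000 would never finish in practice. Whether this is the whole story behind the stack overflow is not established here.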
This tests that the bug reported in gh-17785 is already fixed on master.
This changes it so that we only ignore attribute errors on looking up `__array__` and propagate errors when checking for sequences `len(obj)` if those errors are either RecursionError or MemoryError (we consider them unrecoverable). Also adds test for bad recursive array-like with sequence as reported in gh-17785. The test might be flaky/more complicated in which case it should probably just be deleted.
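The error-handling policy described in this commit message can be modelled in Python roughly as follows. This is only a sketch: the actual change lives in NumPy's C dimension-discovery code, and the helper names `lookup_array_method` and `sequence_length_or_none` are hypothetical, introduced here for illustration.

```python
def lookup_array_method(obj):
    """Return obj.__array__, or None if the attribute is missing.

    Only AttributeError is swallowed; any other error propagates.
    """
    try:
        return getattr(obj, "__array__")
    except AttributeError:
        return None


def sequence_length_or_none(obj):
    """Return len(obj), or None if obj does not behave like a sequence.

    RecursionError and MemoryError are treated as unrecoverable and
    re-raised; other errors just mean "not a sequence".
    """
    try:
        return len(obj)
    except (RecursionError, MemoryError):
        raise
    except Exception:
        return None
```

The order of the `except` clauses matters: the unrecoverable errors have to be listed before the catch-all, otherwise they would be silently ignored like any other failure.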
This mitigates gh-17785 on the 1.19.x branch. Note that the actual problem seems to have more layers, so that the bad code will still cause crashes on certain systems or call contexts (I am not sure). Pre 1.19.x did not check for `__array__` in the dimension discovery code, so was unaffected by the issue in gh-17785.
Since this issue is one of two still tagged for |
@h-vetinari There is #17786 for 1.19.5. |
Closing, has been fixed in 1.19, 1.20, and master. |
There may still be some things slipping through. I hope those remaining are "only" fatal Python errors. But I guess we have as much as is likely to go into 1.19.x. |
Reproducing code example:
Over at pygae/clifford#376, we've found that the adjustments to array coercion in 1.19.x seem to produce stack overflows where previously they were safe. A minimal repro is this clumsy coercion code, which mirrors what `clifford.MVArray` did in their 1.3.0 release.
Error message:
NumPy/Python version information:
Thankfully the code above was removed since it was bad for other reasons, but it's unfortunate that users on an old release get a segfault.