sage/matrix/matrix_integer_dense.pyx doctest sometimes breaks with time out #707

Open
kiwifb opened this issue Aug 2, 2022 · 8 comments

Comments

@kiwifb
Collaborator

kiwifb commented Aug 2, 2022

sage -t --long --random-seed=4867623489143374956615441254140194808 /usr/lib/python3.10/site-packages/sage/matrix/matrix_integer_dense.pyx  # Timed out (and interrupt failed)

It doesn't always fail, but it is related to using openblas with threads. Switching openblas to use openmp makes the issue go away. It is unclear whether switching to another blas also fixes it; that needs to be tested.
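
Since the failure is intermittent, one way to get a rough failure rate is to repeat the run under a hard time limit and count the timeouts. The following is only a minimal sketch: the helper name `count_timeouts` and its default values are assumptions for illustration, not part of Sage or its doctest runner.

```python
import subprocess


def count_timeouts(cmd, runs=5, timeout=600.0):
    """Run `cmd` `runs` times; return how many runs exceeded `timeout` seconds.

    Hypothetical helper, not part of Sage. `cmd` is an argument list such as
    the `sage -t --long ...` invocation quoted above.
    """
    timeouts = 0
    for _ in range(runs):
        try:
            # subprocess.run kills the child and raises TimeoutExpired
            # once the time limit is exceeded.
            subprocess.run(
                cmd,
                timeout=timeout,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                check=False,
            )
        except subprocess.TimeoutExpired:
            timeouts += 1
    return timeouts
```

Passing the doctest command as a list (for example `["sage", "-t", "--long", path]`, with `path` pointing at matrix_integer_dense.pyx) and comparing the counts with and without threaded openblas would give a first indication of whether a change helps.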

@strogdon
Contributor

strogdon commented Aug 3, 2022

A data point: I do see the failure on s-o-g, but so far not on vanilla. Vanilla here uses system openblas [ pthread, -openmp ] and system singular. The s-o-g failure:

sage: a = matrix(ZZ,2,[1,-7,3,5]) ## line 5597 ##
sage: a._change_ring(RDF) ## line 5598 ##
[ 1.0 -7.0]
[ 3.0  5.0]
sage: sig_on_count() # check sig_on/off pairings (virtual doctest) ## line 5601 ##
0
sage: A = matrix(ZZ, 3, 3, [-8, 2, 0, 0, 1, -1, 2, 1, -95]) ## line 5621 ##
sage: As = singular(A); As ## line 5622 ##

@strogdon
Contributor

strogdon commented Aug 8, 2022

Another data point - s-o-g does not have openblas as a NEEDED lib.

On vanilla
$ objdump -p src/sage/matrix/matrix_integer_dense.cpython-310-x86_64-linux-gnu.so | grep NEEDED
  NEEDED               libiml.so.0
  NEEDED               libgmp.so.10
  NEEDED               libopenblas.so.0
  NEEDED               libpari-gmp-tls.so.7
  NEEDED               libflint.so.16
  NEEDED               libm.so.6
  NEEDED               libc.so.6

versus on Gentoo

$ objdump -p  /usr/lib/python3.10/site-packages/sage/matrix/matrix_integer_dense.cpython-310-x86_64-linux-gnu.so | grep NEEDED
  NEEDED               libiml.so.0
  NEEDED               libpari-gmp-tls.so.7
  NEEDED               libflint.so.16
  NEEDED               libgmp.so.10
  NEEDED               libm.so.6
  NEEDED               libc.so.6
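
The two listings above can also be compared mechanically. This is a small sketch that parses the `objdump -p ... | grep NEEDED` output quoted in this comment; the function name `needed_libs` is made up for illustration.

```python
def needed_libs(objdump_output):
    """Return the set of library names found on NEEDED lines."""
    libs = set()
    for line in objdump_output.splitlines():
        parts = line.split()
        # A NEEDED entry has exactly two fields: the tag and the soname.
        if len(parts) == 2 and parts[0] == "NEEDED":
            libs.add(parts[1])
    return libs


# Output copied from the vanilla build above.
VANILLA = """\
  NEEDED               libiml.so.0
  NEEDED               libgmp.so.10
  NEEDED               libopenblas.so.0
  NEEDED               libpari-gmp-tls.so.7
  NEEDED               libflint.so.16
  NEEDED               libm.so.6
  NEEDED               libc.so.6
"""

# Output copied from the Gentoo (s-o-g) build above.
GENTOO = """\
  NEEDED               libiml.so.0
  NEEDED               libpari-gmp-tls.so.7
  NEEDED               libflint.so.16
  NEEDED               libgmp.so.10
  NEEDED               libm.so.6
  NEEDED               libc.so.6
"""

only_vanilla = needed_libs(VANILLA) - needed_libs(GENTOO)
only_gentoo = needed_libs(GENTOO) - needed_libs(VANILLA)
print(sorted(only_vanilla))  # ['libopenblas.so.0']
print(sorted(only_gentoo))   # []
```

The only difference is libopenblas.so.0, which the vanilla build links directly and the Gentoo build does not.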

@strogdon
Contributor

strogdon commented Aug 8, 2022

The NEEDED libs may not be an issue. On my gentoo-prefix I don't see a doctest failure.

@kiwifb
Collaborator Author

kiwifb commented Aug 8, 2022

It shouldn't be an issue. blas is not used directly; it should be pulled in by iml.

@strogdon
Contributor

strogdon commented Aug 9, 2022

I'm able to get the time out (/storage/strogdon/gentoo-rap/usr/lib64/libopenblas.so.0(blas_thread_shutdown_+0xbf)[0x7ffb0a90889f]) on gentoo-prefix when doctesting the folder

sage -tp 9 --long ~/usr/lib/python3.10/site-packages/sage/matrix/

I have not been able to get vanilla to fail when doctesting the above folder.

@strogdon
Contributor

From src/bin/sage-env there is

# Multithreading in OpenBLAS does not seem to play well with Sage's attempts to
# spawn new processes, see #26118. Apparently, OpenBLAS sets the thread
# affinity and, e.g., parallel doctest jobs, remain on the same core.
# Disabling that thread-affinity with OPENBLAS_MAIN_FREE=1 leads to hangs in
# some computations.
# So we disable OpenBLAS' threading completely; we might loose some performance
# here but strangely the opposite seems to be the case. Note that callers such
# as LinBox use a single-threaded OpenBLAS anyway.
export OPENBLAS_NUM_THREADS=1

Does this mean that OPENBLAS_NUM_THREADS=1 is set during doctests? In any event, I get non-failing results with

OPENBLAS_NUM_THREADS=1 sage -t --long /usr/lib/python3.10/site-packages/sage/matrix/matrix_integer_dense.pyx

I'm not sure what s-o-g does relative to OPENBLAS_NUM_THREADS.
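
For anyone who wants to script this workaround rather than type the variable on the command line, here is a minimal sketch. The wrapper name `run_single_threaded_blas` is hypothetical, and the child command used below is only a stand-in that echoes the variable so the sketch runs without Sage installed.

```python
import os
import subprocess
import sys


def run_single_threaded_blas(cmd):
    """Run `cmd` with OPENBLAS_NUM_THREADS=1 added to a copy of the environment.

    Hypothetical wrapper, not part of Sage; it mirrors what the
    `export OPENBLAS_NUM_THREADS=1` line in sage-env does for vanilla.
    """
    env = dict(os.environ)
    env["OPENBLAS_NUM_THREADS"] = "1"
    return subprocess.run(cmd, env=env, capture_output=True, text=True, check=False)


# Stand-in child process: prints the value it sees for the variable.
result = run_single_threaded_blas(
    [sys.executable, "-c", "import os; print(os.environ['OPENBLAS_NUM_THREADS'])"]
)
print(result.stdout.strip())  # 1
```

Replacing the stand-in command with the `sage -t --long ...` argument list should be equivalent to the shell invocation above, since the variable is set only for the child process.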

@kiwifb
Collaborator Author

kiwifb commented Aug 11, 2022

I do nothing about it. If we were to add something, it may have to live in sage-runtest. But yes, it means the whole of vanilla sage basically runs without threads unless something overrides it. It is a bit misguided to consider only linbox: scipy uses lapack for some things, and so does iml, which is where the issue comes from.

@kiwifb
Collaborator Author

kiwifb commented Aug 11, 2022

Setting OPENBLAS_NUM_THREADS definitely has an impact here. I will think about what to do about it.
