-
-
Notifications
You must be signed in to change notification settings - Fork 29.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make TSAN tests pass with the GIL disabled in free-threaded builds #117657
Open
18 of 60 tasks
Tracked by
#108219
Labels
Comments
colesbury
added a commit
to colesbury/cpython
that referenced
this issue
Apr 12, 2024
colesbury
pushed a commit
that referenced
this issue
Apr 15, 2024
Additionally, reduce the iterations for a few weakref tests that would otherwise take a prohibitively long amount of time (> 1 hour) when TSAN is enabled and the GIL is disabled.
colesbury
pushed a commit
that referenced
this issue
Apr 15, 2024
…rld()` and `tstate_try_attach()` (#117828) TSAN erroneously reports a data race between the `_Py_atomic_compare_exchange_int` on `tstate->state` in `tstate_try_attach()` and the non-atomic load of `tstate->state` in `start_the_world`. The `_Py_atomic_compare_exchange_int` fails, but TSAN erroneously treats it as a store.
colesbury
pushed a commit
that referenced
this issue
Apr 15, 2024
This was referenced Apr 17, 2024
diegorusso
pushed a commit
to diegorusso/cpython
that referenced
this issue
Apr 17, 2024
…ython#117736) Additionally, reduce the iterations for a few weakref tests that would otherwise take a prohibitively long amount of time (> 1 hour) when TSAN is enabled and the GIL is disabled.
diegorusso
pushed a commit
to diegorusso/cpython
that referenced
this issue
Apr 17, 2024
…the_world()` and `tstate_try_attach()` (python#117828) TSAN erroneously reports a data race between the `_Py_atomic_compare_exchange_int` on `tstate->state` in `tstate_try_attach()` and the non-atomic load of `tstate->state` in `start_the_world`. The `_Py_atomic_compare_exchange_int` fails, but TSAN erroneously treats it as a store.
diegorusso
pushed a commit
to diegorusso/cpython
that referenced
this issue
Apr 17, 2024
…f compare/exchange (python#117830)
DinoV
pushed a commit
that referenced
this issue
Apr 17, 2024
…#117954) Fix data races in the method cache in free-threaded builds These are technically data races, but I think they're benign (to the extent that that is actually possible). We update cache entries non-atomically but read them atomically from another thread, and there's nothing that establishes a happens-before relationship between the reads and writes that I can see.
DinoV
added a commit
to mpage/cpython
that referenced
this issue
Apr 17, 2024
DinoV
pushed a commit
that referenced
this issue
Apr 17, 2024
…#117955) Quiet erroneous TSAN reports of data races in `_PySeqLock` TSAN reports a couple of data races between the compare/exchange in `_PySeqLock_LockWrite` and the non-atomic loads in `_PySeqLock_{Abandon,Unlock}Write`. This is another instance of TSAN incorrectly modeling failed compare/exchange as a write instead of a load.
This was referenced Apr 18, 2024
DinoV
added a commit
that referenced
this issue
Apr 19, 2024
…th strong enough semantics (#118111) Use acquire for load of ob_ref_shared
DinoV
pushed a commit
that referenced
this issue
Apr 23, 2024
… `tstate->state` (#118165) Quiet TSAN warnings about remaining non-atomic accesses of `tstate->state`
This was referenced Apr 23, 2024
colesbury
added a commit
that referenced
this issue
Jun 3, 2024
#120005) The free-threaded build currently immortalizes objects that use deferred reference counting (see gh-117783). This typically happens once the first non-main thread is created, but the behavior can be suppressed for tests, in subinterpreters, or during a compile() call. This fixes a race condition involving the tracking of whether the behavior is suppressed. (cherry picked from commit 47fb432)
colesbury
added a commit
that referenced
this issue
Jun 3, 2024
mliezun
pushed a commit
to mliezun/cpython
that referenced
this issue
Jun 3, 2024
…ython#119921) The GIL may be disabled concurrently with this call so we need to use a relaxed atomic load.
mliezun
pushed a commit
to mliezun/cpython
that referenced
this issue
Jun 3, 2024
mliezun
pushed a commit
to mliezun/cpython
that referenced
this issue
Jun 3, 2024
The `sem_clockwait` function is not currently instrumented, which leads to false positives.
mliezun
pushed a commit
to mliezun/cpython
that referenced
this issue
Jun 3, 2024
) The free-threaded build currently immortalizes objects that use deferred reference counting (see pythongh-117783). This typically happens once the first non-main thread is created, but the behavior can be suppressed for tests, in subinterpreters, or during a compile() call. This fixes a race condition involving the tracking of whether the behavior is suppressed.
colesbury
added a commit
that referenced
this issue
Jun 4, 2024
The `_PyThreadState_Bind()` function is called before the first `PyEval_AcquireThread()` so it's not synchronized with the stop the world GC. We had a race where `gc_visit_heaps()` might visit a thread's heap while it's being initialized. Use a simple atomic int to avoid visiting heaps for threads that are not yet fully initialized (i.e., before `tstate_mimalloc_bind()` is called). The race was reproducible by running: `python Lib/test/test_importlib/partial/pool_in_threads.py`.
miss-islington
pushed a commit
to miss-islington/cpython
that referenced
this issue
Jun 4, 2024
…nGH-119923) The `_PyThreadState_Bind()` function is called before the first `PyEval_AcquireThread()` so it's not synchronized with the stop the world GC. We had a race where `gc_visit_heaps()` might visit a thread's heap while it's being initialized. Use a simple atomic int to avoid visiting heaps for threads that are not yet fully initialized (i.e., before `tstate_mimalloc_bind()` is called). The race was reproducible by running: `python Lib/test/test_importlib/partial/pool_in_threads.py`. (cherry picked from commit e69d068) Co-authored-by: Sam Gross <colesbury@gmail.com>
colesbury
added a commit
to colesbury/cpython
that referenced
this issue
Jun 4, 2024
colesbury
added a commit
that referenced
this issue
Jun 4, 2024
…19923) (#120038) The `_PyThreadState_Bind()` function is called before the first `PyEval_AcquireThread()` so it's not synchronized with the stop the world GC. We had a race where `gc_visit_heaps()` might visit a thread's heap while it's being initialized. Use a simple atomic int to avoid visiting heaps for threads that are not yet fully initialized (i.e., before `tstate_mimalloc_bind()` is called). The race was reproducible by running: `python Lib/test/test_importlib/partial/pool_in_threads.py`. (cherry picked from commit e69d068) Co-authored-by: Sam Gross <colesbury@gmail.com>
barneygale
pushed a commit
to barneygale/cpython
that referenced
this issue
Jun 5, 2024
Only call `gc_restore_tid()` from stop-the-world contexts. `worklist_pop()` can be called while other threads are running, so use a relaxed atomic to modify `ob_tid`.
barneygale
pushed a commit
to barneygale/cpython
that referenced
this issue
Jun 5, 2024
Due to a limitation in TSAN, all reads from `PyThreadState.state` must be atomic to avoid reported races.
barneygale
pushed a commit
to barneygale/cpython
that referenced
this issue
Jun 5, 2024
…#119908) Seen in CI occasionally when running `test_weakref`.
barneygale
pushed a commit
to barneygale/cpython
that referenced
this issue
Jun 5, 2024
…ython#119921) The GIL may be disabled concurrently with this call so we need to use a relaxed atomic load.
barneygale
pushed a commit
to barneygale/cpython
that referenced
this issue
Jun 5, 2024
barneygale
pushed a commit
to barneygale/cpython
that referenced
this issue
Jun 5, 2024
The `sem_clockwait` function is not currently instrumented, which leads to false positives.
barneygale
pushed a commit
to barneygale/cpython
that referenced
this issue
Jun 5, 2024
) The free-threaded build currently immortalizes objects that use deferred reference counting (see pythongh-117783). This typically happens once the first non-main thread is created, but the behavior can be suppressed for tests, in subinterpreters, or during a compile() call. This fixes a race condition involving the tracking of whether the behavior is suppressed.
barneygale
pushed a commit
to barneygale/cpython
that referenced
this issue
Jun 5, 2024
…n#119923) The `_PyThreadState_Bind()` function is called before the first `PyEval_AcquireThread()` so it's not synchronized with the stop the world GC. We had a race where `gc_visit_heaps()` might visit a thread's heap while it's being initialized. Use a simple atomic int to avoid visiting heaps for threads that are not yet fully initialized (i.e., before `tstate_mimalloc_bind()` is called). The race was reproducible by running: `python Lib/test/test_importlib/partial/pool_in_threads.py`.
colesbury
added a commit
that referenced
this issue
Jun 6, 2024
This adds a `_PyRecursiveMutex` type based on `PyMutex` and uses that for the import lock. This fixes some data races in the free-threaded build and generally simplifies the import lock code.
miss-islington
pushed a commit
to miss-islington/cpython
that referenced
this issue
Jun 6, 2024
This adds a `_PyRecursiveMutex` type based on `PyMutex` and uses that for the import lock. This fixes some data races in the free-threaded build and generally simplifies the import lock code. (cherry picked from commit e21057b) Co-authored-by: Sam Gross <colesbury@gmail.com>
colesbury
added a commit
to colesbury/cpython
that referenced
this issue
Jun 6, 2024
This was referenced Jun 6, 2024
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Feature or enhancement
We need to make the TSAN tests pass with the GIL disabled before we can disable the GIL by default in free-threaded builds.
This should proceed in two phases:
How to run the TSAN tests
env CC=clang CXX=clang++ ./configure --disable-gil --with-thread-sanitizer --with-pydebug && make -j
env TSAN_OPTIONS=suppressions=<repo_root>/Tools/tsan/suppressions_free_threading.txt ./python -m test --tsan -j4
Working on a race
<repo_root>/Tools/tsan/suppressions_free_threading.txt
, run the TSAN tests, and verify that the race is reported by TSAN. You may need to comment out unrelated functions (notably,_PyEval_EvalFrameDefault
) in order to reproduce the race.Suppressions
Tasks
Linked PRs
start_the_world()
andtstate_try_attach()
#117828_PySeqLock
#117955tstate->state
#118165_Py_IsOwnedByCurrentThread()
#118258_Py_IsImmortal
#118261ob_gc_bits
#118292ob_gc_bits
(GH-118292) #118796SemLock.__{enter,exit}__
#118812SemLock.__{enter,exit}__
(GH-118812) #118856interp->threads.main
#118865interp->threads.main
(GH-118865) #118904PyMember_Get
andPyMember_Set
, forPy_T_OBJECT_EX
type #119368set_discard_entry
#119908set_discard_entry
(GH-119908) #119914sem_clockwait
in TSAN #119915_PyEval_IsGILEnabled
. #119921_PyEval_IsGILEnabled
. (GH-119921) #119939sem_clockwait
in TSAN (GH-119915) #119992The text was updated successfully, but these errors were encountered: