Remove deferred reference count increments and make the global reference pool optional #4095

Merged
merged 10 commits into from May 11, 2024
4 changes: 4 additions & 0 deletions Cargo.toml
@@ -108,6 +108,9 @@ auto-initialize = []
# Allows use of the deprecated "GIL Refs" APIs.
gil-refs = []

# Enables `Clone`ing references to Python objects `Py<T>` which panics if the GIL is not held.
py-clone = []

# Optimizes PyObject to Vec conversion and so on.
nightly = []

@@ -129,6 +132,7 @@ full = [
"num-bigint",
"num-complex",
"num-rational",
"py-clone",
"rust_decimal",
"serde",
"smallvec",
2 changes: 1 addition & 1 deletion examples/Cargo.toml
@@ -10,5 +10,5 @@ pyo3 = { path = "..", features = ["auto-initialize", "extension-module"] }
[[example]]
name = "decorator"
path = "decorator/src/lib.rs"
-crate_type = ["cdylib"]
+crate-type = ["cdylib"]
doc-scrape-examples = true
4 changes: 3 additions & 1 deletion guide/src/class.md
@@ -249,7 +249,7 @@ fn return_myclass() -> Py<MyClass> {

let obj = return_myclass();

-Python::with_gil(|py| {
+Python::with_gil(move |py| {
let bound = obj.bind(py); // Py<MyClass>::bind returns &Bound<'py, MyClass>
let obj_ref = bound.borrow(); // Get PyRef<T>
assert_eq!(obj_ref.num, 1);
@@ -280,6 +280,8 @@ let py_counter: Py<FrozenCounter> = Python::with_gil(|py| {
});

py_counter.get().value.fetch_add(1, Ordering::Relaxed);

Python::with_gil(move |_py| drop(py_counter));
```

Frozen classes are likely to become the default thereby guiding the PyO3 ecosystem towards a more deliberate application of interior mutability. Eventually, this should enable further optimizations of PyO3's internals and avoid downstream code paying the cost of interior mutability when it is not actually required.
7 changes: 5 additions & 2 deletions guide/src/faq.md
@@ -127,12 +127,10 @@ If you don't want that cloning to happen, a workaround is to allocate the field
```rust
# use pyo3::prelude::*;
#[pyclass]
-#[derive(Clone)]
struct Inner {/* fields omitted */}

#[pyclass]
struct Outer {
-#[pyo3(get)]
inner: Py<Inner>,
}

@@ -144,6 +142,11 @@ impl Outer {
inner: Py::new(py, Inner {})?,
})
}

#[getter]
fn inner(&self, py: Python<'_>) -> Py<Inner> {
self.inner.clone_ref(py)
}
}
```
This time `a` and `b` *are* the same object:
10 changes: 10 additions & 0 deletions guide/src/features.md
@@ -75,6 +75,14 @@ This feature is a backwards-compatibility feature to allow continued use of the

This feature and the APIs it enables is expected to be removed in a future PyO3 version.

### `py-clone`

This feature was introduced to ease migration. It was found that delayed reference count increments cannot be made sound and hence `Clone`ing an instance of `Py<T>` must panic without the GIL being held. To avoid migrations introducing new panics without warning, the `Clone` implementation itself is now gated behind this feature.
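
The trade-off described here can be illustrated with a toy model that uses only the standard library. Everything in it is hypothetical (`LOCK_HELD`, `Token`, `with_lock`, `Object`, `Handle`) and stands in for the GIL, `Python<'py>`, `Python::with_gil` and `Py<T>`; it is a sketch of the idea, not PyO3's implementation. A token-taking `clone_ref` is correct by construction, whereas a plain `Clone` has no token and can only check at runtime and panic.

```rust
use std::cell::Cell;
use std::marker::PhantomData;

thread_local! {
    // Hypothetical stand-in for "this thread currently holds the GIL".
    static LOCK_HELD: Cell<bool> = Cell::new(false);
}

/// Hypothetical proof that the toy lock is held, loosely analogous to `Python<'py>`.
#[derive(Clone, Copy)]
struct Token<'a>(PhantomData<&'a ()>);

/// Restores the previous lock state on scope exit, even if the closure panics.
struct Restore(bool);

impl Drop for Restore {
    fn drop(&mut self) {
        LOCK_HELD.with(|held| held.set(self.0));
    }
}

/// Runs `f` with the toy lock marked as held on the current thread.
fn with_lock<R>(f: impl for<'a> FnOnce(Token<'a>) -> R) -> R {
    let _restore = Restore(LOCK_HELD.with(|held| held.replace(true)));
    f(Token(PhantomData))
}

/// A toy object whose reference count may only be touched under the lock.
struct Object {
    refcount: Cell<usize>,
}

/// A toy handle, loosely analogous to `Py<T>`.
struct Handle<'o> {
    object: &'o Object,
}

impl<'o> Handle<'o> {
    /// The token proves the lock is held, so the increment is always allowed.
    fn clone_ref(&self, _token: Token<'_>) -> Handle<'o> {
        self.object.refcount.set(self.object.refcount.get() + 1);
        Handle { object: self.object }
    }
}

impl Clone for Handle<'_> {
    /// No token is available here, so the model checks at runtime and panics.
    fn clone(&self) -> Self {
        assert!(LOCK_HELD.with(|held| held.get()), "cloned without the lock held");
        self.object.refcount.set(self.object.refcount.get() + 1);
        Handle { object: self.object }
    }
}
```

In this model, `with_lock(|token| handle.clone_ref(token))` always succeeds, while `handle.clone()` outside `with_lock` panics before touching the count.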

### `pyo3_disable_reference_pool`

This is a performance-oriented conditional compilation flag, e.g. [set via `$RUSTFLAGS`][set-configuration-options], which disables the global reference pool and the associated overhead of crossing the Python-Rust boundary. However, if enabled, `Drop`ping an instance of `Py<T>` without the GIL being held will abort the process.
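
Because this is a `cfg` flag rather than a Cargo feature, it is passed to the compiler directly, for example with `RUSTFLAGS="--cfg pyo3_disable_reference_pool"`. As a minimal sketch, code compiled with the same flags could inspect it with the standard `cfg!` macro. The helper name `reference_pool_enabled` is hypothetical and not part of PyO3, and whether the flag reaches a given crate depends on how the build passes `RUSTFLAGS`.

```rust
/// Hypothetical helper: reports whether the reference pool would be compiled in,
/// i.e. whether the `pyo3_disable_reference_pool` cfg flag is absent.
#[allow(unexpected_cfgs)] // the flag is a custom cfg, unknown to this standalone sketch
fn reference_pool_enabled() -> bool {
    !cfg!(pyo3_disable_reference_pool)
}
```

Unlike a feature, the flag is chosen by whoever builds the final artifact, so a library cannot switch it on for its dependents.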

### `macros`

This feature enables a dependency on the `pyo3-macros` crate, which provides the procedural macros portion of PyO3's API:
@@ -195,3 +203,5 @@ struct User {
### `smallvec`

Adds a dependency on [smallvec](https://docs.rs/smallvec) and enables conversions into its [`SmallVec`](https://docs.rs/smallvec/latest/smallvec/struct.SmallVec.html) type.

[set-configuration-options]: https://doc.rust-lang.org/reference/conditional-compilation.html#set-configuration-options
9 changes: 6 additions & 3 deletions guide/src/memory.md
@@ -212,7 +212,8 @@ This example wasn't very interesting. We could have just used a GIL-bound
we are *not* holding the GIL?

```rust
-# #![allow(unused_imports)]
+# #![allow(unused_imports, dead_code)]
+# #[cfg(not(pyo3_disable_reference_pool))] {
# use pyo3::prelude::*;
# use pyo3::types::PyString;
# fn main() -> PyResult<()> {
@@ -239,12 +240,14 @@ Python::with_gil(|py|
# }
# Ok(())
# }
# }
```

When `hello` is dropped *nothing* happens to the pointed-to memory on Python's
heap because nothing _can_ happen if we're not holding the GIL. Fortunately,
-the memory isn't leaked. PyO3 keeps track of the memory internally and will
-release it the next time we acquire the GIL.
+the memory isn't leaked. If the `pyo3_disable_reference_pool` conditional compilation flag
+is not enabled, PyO3 keeps track of the memory internally and will release it
+the next time we acquire the GIL.

We can avoid the delay in releasing memory if we are careful to drop the
`Py<Any>` while the GIL is held.
13 changes: 11 additions & 2 deletions guide/src/migration.md
@@ -35,7 +35,16 @@ fn increment(x: u64, amount: Option<u64>) -> u64 {
x + amount.unwrap_or(1)
}
```
</details>

### `Py::clone` is now gated behind the `py-clone` feature
<details open>
<summary><small>Click to expand</small></summary>
If you rely on `impl<T> Clone for Py<T>` to fulfil trait requirements imposed by existing Rust code written without PyO3-based code in mind, the newly introduced feature `py-clone` must be enabled.

However, take care to note that the behaviour is different from previous versions. If `Clone` was called without the GIL being held, we tried to delay the application of these reference count increments until PyO3-based code would re-acquire it. This turned out to be impossible to implement in a sound manner and hence was removed. Now, if `Clone` is called without the GIL being held, we panic instead, which calling code might not be prepared for.

Related to this, we also added a `pyo3_disable_reference_pool` conditional compilation flag which removes the infrastructure necessary to apply delayed reference count decrements implied by `impl<T> Drop for Py<T>`. They do not appear to be a soundness hazard as they should lead to memory leaks in the worst case. However, the global synchronization adds significant overhead to crossing the Python-Rust boundary. Setting this flag removes these costs and instead makes the `Drop` implementation abort the process if it is called without the GIL being held.
</details>

## from 0.20.* to 0.21
@@ -676,7 +685,7 @@ drop(second);

The replacement is [`Python::with_gil`](https://docs.rs/pyo3/0.18.3/pyo3/marker/struct.Python.html#method.with_gil) which is more cumbersome but enforces the proper nesting by design, e.g.

```rust
```rust,ignore
# #![allow(dead_code)]
# use pyo3::prelude::*;

@@ -701,7 +710,7 @@ let second = Python::with_gil(|py| Object::new(py));
drop(first);
drop(second);

-// Or it ensure releasing the inner lock before the outer one.
+// Or it ensures releasing the inner lock before the outer one.
Python::with_gil(|py| {
let first = Object::new(py);
let second = Python::with_gil(|py| Object::new(py));
44 changes: 44 additions & 0 deletions guide/src/performance.md
@@ -96,3 +96,47 @@ impl PartialEq<Foo> for FooBound<'_> {
}
}
```

## Disable the global reference pool

PyO3 uses global mutable state to keep track of deferred reference count updates implied by `impl<T> Drop for Py<T>` being called without the GIL being held. The necessary synchronization to obtain and apply these reference count updates when PyO3-based code next acquires the GIL is somewhat expensive and can become a significant part of the cost of crossing the Python-Rust boundary.
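
As a rough illustration of where that cost comes from, here is a toy model of a deferred-decrement pool built only on the standard library. The names `DeferredPool`, `defer_decrement` and `flush` are hypothetical, and reference counts are modelled as a plain map from object id to count; this is a sketch of the general technique, not PyO3's actual data structure.

```rust
use std::collections::HashMap;
use std::sync::Mutex;

/// Toy stand-in for a global pool of deferred reference count decrements.
struct DeferredPool {
    pending: Mutex<Vec<usize>>,
}

impl DeferredPool {
    fn new() -> Self {
        Self { pending: Mutex::new(Vec::new()) }
    }

    /// Models a drop without the lock: only the object id is recorded.
    fn defer_decrement(&self, object_id: usize) {
        self.pending.lock().unwrap().push(object_id);
    }

    /// Models a lock acquisition: take the mutex, drain the queue and apply it.
    /// Returns how many deferred decrements were applied.
    fn flush(&self, refcounts: &mut HashMap<usize, usize>) -> usize {
        let drained = std::mem::take(&mut *self.pending.lock().unwrap());
        for object_id in &drained {
            if let Some(count) = refcounts.get_mut(object_id) {
                *count = count.saturating_sub(1);
            }
        }
        drained.len()
    }
}
```

Note that `flush` has to take the mutex on every acquisition, even when nothing is pending, which is the kind of synchronization overhead the paragraph above refers to.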

This functionality can be avoided by setting the `pyo3_disable_reference_pool` conditional compilation flag. This removes the global reference pool and the associated costs completely. However, it does _not_ remove the `Drop` implementation for `Py<T>`, which is necessary to interoperate with existing Rust code written without PyO3-based code in mind. To stay compatible with the wider Rust ecosystem in these cases, we keep the implementation but abort when `Drop` is called without the GIL being held. If `pyo3_leak_on_drop_without_reference_pool` is additionally enabled, objects dropped without the GIL being held will be leaked instead, which is always sound but might have detrimental effects like resource exhaustion in the long term.
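
The combinations described above can be summarised as a small decision table. The sketch below models the two `cfg` flags as runtime booleans purely for illustration; `DropAction` and `drop_action` are hypothetical names, and the real selection happens at compile time.

```rust
/// What happens to the reference count when a handle is dropped (toy model).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DropAction {
    /// The GIL is held: decrement right away.
    DecrementNow,
    /// No GIL, pool available: record the decrement for later.
    Defer,
    /// No GIL, pool disabled: abort the process.
    Abort,
    /// No GIL, pool disabled, leak flag set: leak the object.
    Leak,
}

/// `pool_disabled` models `pyo3_disable_reference_pool`;
/// `leak_on_drop` models `pyo3_leak_on_drop_without_reference_pool`.
fn drop_action(gil_held: bool, pool_disabled: bool, leak_on_drop: bool) -> DropAction {
    match (gil_held, pool_disabled, leak_on_drop) {
        (true, _, _) => DropAction::DecrementNow,
        (false, false, _) => DropAction::Defer,
        (false, true, false) => DropAction::Abort,
        (false, true, true) => DropAction::Leak,
    }
}
```

The leak flag only matters once the pool is disabled; with the pool available, a drop without the GIL is deferred either way.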

This limitation is important to keep in mind when this setting is used, especially when embedding Python code into a Rust application as it is quite easy to accidentally drop a `Py<T>` (or types containing it like `PyErr`, `PyBackedStr` or `PyBackedBytes`) returned from `Python::with_gil` without making sure to re-acquire the GIL beforehand. For example, the following code

```rust,ignore
# use pyo3::prelude::*;
# use pyo3::types::PyList;
let numbers: Py<PyList> = Python::with_gil(|py| PyList::empty_bound(py).unbind());

Python::with_gil(|py| {
numbers.bind(py).append(23).unwrap();
});

Python::with_gil(|py| {
numbers.bind(py).append(42).unwrap();
});
```

will abort if the list is not explicitly disposed of via

```rust
# use pyo3::prelude::*;
# use pyo3::types::PyList;
let numbers: Py<PyList> = Python::with_gil(|py| PyList::empty_bound(py).unbind());

Python::with_gil(|py| {
numbers.bind(py).append(23).unwrap();
});

Python::with_gil(|py| {
numbers.bind(py).append(42).unwrap();
});

Python::with_gil(move |py| {
drop(numbers);
});
```

[conditional-compilation]: https://doc.rust-lang.org/reference/conditional-compilation.html
1 change: 1 addition & 0 deletions newsfragments/4095.added.md
@@ -0,0 +1 @@
Add `pyo3_disable_reference_pool` conditional compilation flag to avoid the overhead of the global reference pool at the cost of known limitations as explained in the performance section of the guide.
1 change: 1 addition & 0 deletions newsfragments/4095.changed.md
@@ -0,0 +1 @@
`Clone`ing pointers into the Python heap has been moved behind the `py-clone` feature, as it must panic without the GIL being held as a soundness fix.
12 changes: 3 additions & 9 deletions pyo3-benches/benches/bench_gil.rs
@@ -1,4 +1,4 @@
-use codspeed_criterion_compat::{criterion_group, criterion_main, BatchSize, Bencher, Criterion};
+use codspeed_criterion_compat::{criterion_group, criterion_main, Bencher, Criterion};

use pyo3::prelude::*;

@@ -9,14 +9,8 @@ fn bench_clean_acquire_gil(b: &mut Bencher<'_>) {

fn bench_dirty_acquire_gil(b: &mut Bencher<'_>) {
let obj = Python::with_gil(|py| py.None());
-b.iter_batched(
-|| {
-// Clone and drop an object so that the GILPool has work to do.
-let _ = obj.clone();
-},
-|_| Python::with_gil(|_| {}),
-BatchSize::NumBatches(1),
-);
+// Drop the returned clone of the object so that the reference pool has work to do.
+b.iter(|| Python::with_gil(|py| obj.clone_ref(py)));
}

fn criterion_benchmark(c: &mut Criterion) {
2 changes: 2 additions & 0 deletions pyo3-build-config/src/lib.rs
@@ -165,6 +165,8 @@ pub fn print_expected_cfgs() {
println!("cargo:rustc-check-cfg=cfg(GraalPy)");
println!("cargo:rustc-check-cfg=cfg(py_sys_config, values(\"Py_DEBUG\", \"Py_REF_DEBUG\", \"Py_TRACE_REFS\", \"COUNT_ALLOCS\"))");
println!("cargo:rustc-check-cfg=cfg(invalid_from_utf8_lint)");
println!("cargo:rustc-check-cfg=cfg(pyo3_disable_reference_pool)");
println!("cargo:rustc-check-cfg=cfg(pyo3_leak_on_drop_without_reference_pool)");

// allow `Py_3_*` cfgs from the minimum supported version up to the
// maximum minor version (+1 for development for the next)
2 changes: 1 addition & 1 deletion src/conversions/std/option.rs
@@ -61,7 +61,7 @@ mod tests {
assert_eq!(option.as_ptr(), std::ptr::null_mut());

let none = py.None();
-option = Some(none.clone());
+option = Some(none.clone_ref(py));

let ref_cnt = none.get_refcnt(py);
assert_eq!(option.as_ptr(), none.as_ptr());
14 changes: 13 additions & 1 deletion src/err/err_state.rs
@@ -5,7 +5,6 @@ use crate::{
Bound, IntoPy, Py, PyAny, PyObject, PyTypeInfo, Python,
};

-#[derive(Clone)]
pub(crate) struct PyErrStateNormalized {
#[cfg(not(Py_3_12))]
ptype: Py<PyType>,
@@ -63,6 +62,19 @@ impl PyErrStateNormalized {
ptraceback: Py::from_owned_ptr_or_opt(py, ptraceback),
}
}

pub fn clone_ref(&self, py: Python<'_>) -> Self {
Self {
#[cfg(not(Py_3_12))]
ptype: self.ptype.clone_ref(py),
pvalue: self.pvalue.clone_ref(py),
#[cfg(not(Py_3_12))]
ptraceback: self
.ptraceback
.as_ref()
.map(|ptraceback| ptraceback.clone_ref(py)),
}
}
}

pub(crate) struct PyErrStateLazyFnOutput {
2 changes: 1 addition & 1 deletion src/err/mod.rs
@@ -837,7 +837,7 @@ impl PyErr {
/// ```
#[inline]
pub fn clone_ref(&self, py: Python<'_>) -> PyErr {
-PyErr::from_state(PyErrState::Normalized(self.normalized(py).clone()))
+PyErr::from_state(PyErrState::Normalized(self.normalized(py).clone_ref(py)))
}

/// Return the cause (either an exception instance, or None, set by `raise ... from ...`)