Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement a safe API wrapping PyEval_SetProfile #4008

Open
LilyFoote opened this issue Mar 28, 2024 · 9 comments · May be fixed by #4039
Open

Implement a safe API wrapping PyEval_SetProfile #4008

LilyFoote opened this issue Mar 28, 2024 · 9 comments · May be fixed by #4039

Comments

@LilyFoote
Copy link
Contributor

LilyFoote commented Mar 28, 2024

PyEval_SetProfile is the C-api equivalent to sys.set_profile. Using PyEval_SetProfile from Rust is preferable to using sys.set_profile because it avoids overhead, but currently requires using unsafe. I would like to add a new safe api to replace code like the following:

/// Wrap pyo3-ffi/src/cpython/pystate.rs#L18-L25
enum Event {
    Call,
    Exception,
    Line,
    Return,
    CCall,
    CException,
    CReturn,
    Opcode,
}

impl Event {
    fn from_c(what: c_int) -> Self {
         match what {
             PyTrace_CALL => Self::Call,
             PyTrace_EXCEPTION => Self::Exception,
             PyTrace_LINE => Self::Line,
             PyTrace_RETURN => Self::Return,
             PyTrace_C_CALL => Self::CCall,
             PyTrace_C_EXCEPTION => Self::CException,
             PyTrace_C_RETURN => Self::CReturn,
             PyTrace_OPCODE => Self::Opcode,
         }
    }
}

#[pyclass]
struct Profiler {
    // Useful fields
}

impl Profiler {
    fn profile(
        &mut self,
        frame: PyObject,
        arg: Option<PyObject>,
        event: Event,
        py: Python,
    ) -> PyResult<()> {
        // Custom profiling logic
        Ok(())
    }
}

pub extern "C" fn profile_callback(
    _obj: *mut ffi::PyObject,
    _frame: *mut ffi::PyFrameObject,
    what: c_int,
    _arg: *mut ffi::PyObject,
) -> c_int {
    let event = Event::from_c(what);
    // An optimisation for my use case that might not be worth trying to allow upstream
    // match event {
    //     Event::Call => (),
    //     Event::Return => (),
    //     _ => return 0;
    //}
    let _frame = _frame as *mut ffi::PyObject;
    Python::with_gil(|py| {
        // Safety:
        //
        // `from_borrowed_ptr_or_err` must be called in an unsafe block.
        //
        // `_obj` is a reference to our `Profiler` wrapped up in a Python object, so
        // we can safely convert it from an `ffi::PyObject` to a `PyObject`.
        //
        // We borrow the object so we don't break reference counting.
        //
        // https://docs.rs/pyo3/latest/pyo3/struct.Py.html#method.from_borrowed_ptr_or_err
        // https://docs.python.org/3/c-api/init.html#c.Py_tracefunc
        let obj = match unsafe { PyObject::from_borrowed_ptr_or_err(py, _obj) } {
            Ok(obj) => obj,
            Err(err) => {
                err.restore(py);
                return -1;
            }
        };
        let mut profiler = match obj.extract::<PyRefMut<Profiler>>(py) {
            Ok(profiler) => profiler,
            Err(err) => {
                err.restore(py);
                return -1;
            }
        };

        // Safety:
        //
        // `from_borrowed_ptr_or_err` must be called in an unsafe block.
        //
        // `_frame` is an `ffi::PyFrameObject` which can be converted safely
        // to a `PyObject`. We can later convert it into a `pyo3::types::PyFrame`.
        //
        // We borrow the object so we don't break reference counting.
        //
        // https://docs.rs/pyo3/latest/pyo3/struct.Py.html#method.from_borrowed_ptr_or_err
        // https://docs.python.org/3/c-api/init.html#c.Py_tracefunc
        let frame = match unsafe { PyObject::from_borrowed_ptr_or_err(py, _frame) } {
            Ok(frame) => frame,
            Err(err) => {
                err.restore(py);
                return -1;
            }
        };

        // Safety:
        //
        // `from_borrowed_ptr_or_opt` must be called in an unsafe block.
        //
        // `_arg` is either a `Py_None` (PyTrace_CALL) or any PyObject (PyTrace_RETURN) or
        // NULL (PyTrace_RETURN).
        //
        // We borrow the object so we don't break reference counting.
        //
        // https://docs.rs/pyo3/latest/pyo3/struct.Py.html#method.from_borrowed_ptr_or_opt
        // https://docs.python.org/3/c-api/init.html#c.Py_tracefunc
        let arg = unsafe { PyObject::from_borrowed_ptr_or_opt(py, _arg) };
        // `_arg` is `NULL` when the frame exits with an exception unwinding instead of a normal return.
        // So it might be possible to make `arg` a `PyResult` here instead of an option, but I haven't worked out the detail of how that would work. 

        match profiler.profile(frame, arg, event, py) {
            Ok(_) => 0,
            Err(err) => {
                err.restore(py);
                return -1;
            }
        }
    })

}

#[pyfunction]
fn register_profiler() -> PyResult<()> {
    Python::with_gil(|py| {
        let profiler = Profiler::new();
        unsafe {
            ffi::PyEval_SetProfile(Some(profile_callback), profiler.into_ptr());
        }
    }
}

Using the safe API could look something like:

#[pyclass]
struct Profiler {
    // Useful fields
}

impl Profiler {
    fn profile(
        &mut self,
        frame: PyObject,
        arg: Option<PyObject>,
        event: pyo3::introspection::Event,
        py: Python,
    ) -> PyResult<()> {
        // Custom profiling logic
        Ok(())
    }
}

#[pyfunction]
fn register_profiler() -> PyResult<()> {
    Python::with_gil(|py| {
        let profiler = Profiler::new();
        pyo3::introspection::set_profile(profiler.profile, profiler)?;
        Ok(())
    }
}

In the C api it is also possible to omit the Profiler struct and just pass a profiling callback. It would be nice to support this too, but I'm not sure there's a clean way without a separate function:

fn profile(
    frame: PyObject,
    arg: Option<PyObject>,
    event: pyo3::introspection::Event,
    py: Python,
) -> PyResult<()> {
    // Custom profiling logic
    Ok(())
}

#[pyfunction]
fn register_profiler() -> PyResult<()> {
    Python::with_gil(|py| {
        pyo3::introspection::set_profile2(profile)?;
        Ok(())
    }
}
@LilyFoote
Copy link
Contributor Author

I'm sure there's some stuff that wouldn't actually work as I've written it, but I hope it's clear enough. We could also support a safe version of PyEval_SetTrace (sys.set_trace) and perhaps sys.monitoring, but I'll leave those to a future issue if we land this one.

@flying-sheep
Copy link

flying-sheep commented Mar 29, 2024

Huh, what a coincidence, I just wanted to use this, and you gave me something to start with. Thank you!

I added a test implementation: https://github.com/flying-sheep/fine-coverage/blob/4412d07e69d0c8ef9ca3b2267940459d1e2dfb19/src/lib.rs#L27-L56

Also I modified your code to be more generic: https://github.com/flying-sheep/fine-coverage/blob/4412d07e69d0c8ef9ca3b2267940459d1e2dfb19/src/tracer.rs

  • Tracer trait instead of struct
  • events carry data (arg is gone)
  • both trace and profile

Just registering it and quitting the interpreter currently logs this:

[src/lib.rs:39:17] "exception" = "exception"
[src/lib.rs:39:17] frame = <frame at 0x78805446af80, file '/home/phil/.local/share/hatch/env/virtual/fine-coverage/3HsL-UUb/fine-coverage/bin/fine-coverage', line 8, code <module>>
[src/lib.rs:39:17] exc_type.into_bound(py) = <class 'SystemExit'>
[src/lib.rs:39:17] exc_value.into_bound(py) = SystemExit()
[src/lib.rs:39:17] exc_traceback.into_bound(py) = <traceback object at 0x788053fce300>

[src/lib.rs:48:17] "return" = "return"
[src/lib.rs:48:17] frame = <frame at 0x78805446af80, file '/home/phil/.local/share/hatch/env/virtual/fine-coverage/3HsL-UUb/fine-coverage/bin/fine-coverage', line 8, code <module>>
[src/lib.rs:48:17] value.map(|value| value.into_bound(py)) = None

Rust continues to be wonderful: This thing ran on the first attempt after compiling for the first time.

next step would be to eliminate the py parameter and use Bound values instead.

@LilyFoote
Copy link
Contributor Author

That's an awesome coincidence! I also like the look of some of your improvements.

One small thing to note is that ProfileEvent::TraceEvent perhaps shouldn't actually wrap TraceEvent, since the Exception, Line and Opcode variants aren't used at all by profiling.

I'm also not certain we need to make any code generic over Event - there's only ever one Event enum that makes sense for each context.

@flying-sheep
Copy link

Thanks, I didn’t realize that the events have mostly disjoint types!

I'm also not certain we need to make any code generic over Event - there's only ever one Event enum that makes sense for each context.

hm, how do I get code reuse then? I’m gunning for these ergonomics here:

#[pyclass]
struct MyTracer {
    ...state
}

impl tracer::Tracer<tracer::TraceEvent> for MyTracer {
    fn trace(&mut self, frame: Py<PyFrame>, event: tracer::TraceEvent, py: Python) -> PyResult<()> {
        ...
    }
}

fn main() {
    let tracer = Tracer { ... };
    // TODO: figure out how to do threads
    Python::with_gil(|py| Bound::new(py, tracer)?.register())?;
}

and being able to simply replace tracer::TraceEvent with tracer::ProfileEvent and getting different behavior.

@LilyFoote
Copy link
Contributor Author

My instinct is that profile and trace are sufficiently different that code re-use might be an attractive nuisance - I'd expect swapping from one to another to require internal refactoring of MyTracer to the point that it'd be better to have two separate APIs. But maybe my instinct here is inaccurate?

@flying-sheep
Copy link

flying-sheep commented Mar 29, 2024

That’s the idea! pyo3 would provide the trait (I called it Tracer) and the implementation for registering a tracer or a profiler, whereas the user would provide the struct and the trace method to handle whatever tracing or profiling task they want to do.

The boilerplate to implement either would look similar, the struct data and function body very different:

- impl tracer::Tracer<tracer::TraceEvent> for MyTracer {
+ impl tracer::Tracer<tracer::ProfileEvent> for MyProfiler {
-    fn trace(&mut self, frame: Py<PyFrame>, event: tracer::TraceEvent, py: Python) -> PyResult<()> {
+    fn trace(&mut self, frame: Py<PyFrame>, event: tracer::ProfileEvent, py: Python) -> PyResult<()> {
         ...
     }
 }

@davidhewitt
Copy link
Member

Having thought a little about this, I'm ok to see this added. Seems like at least both of you would use this API and I think others also contributed the original FFI bindings. Some thoughts:

  • I'd quite like us to have a sense of what Python 3.12's sys.monitoring APIs do to affect this code even if we don't implement for it immediately.
  • I think the enum carrying event data is a good idea.
  • I think generic Tracer might be overkill if we know there are only ever two event types, but I could be persuaded if we think it's necessary
  • It might be worth gating this behind a feature flag, given that these APIs will be extremely powerful for a subgroup of users and irrelevant for many others.

@LilyFoote
Copy link
Contributor Author

Makes sense to me!

Do you have any thoughts about where to put the code? Some ideas:

  • pyo3::wrappers::sys
  • pyo3::introspection
  • pyo3::profiling
  • pyo3::tracing

As for sys.monitoring, my thoughts are that it's different enough to deserve an independent API and therefore doesn't really have implications for this issue. At the python level, we register one callback for each event type we care about. Each of these callbacks has their own bespoke signature.

@davidhewitt
Copy link
Member

How about pyo3::instrumentation? pyo3::introspection is also fine with me. (Both profiling and tracing seem like a misnomer given both use cases exist.)

It seems to me like sys.monitoring's C API is internal and it's only the Python API is exposed, so I think beyond your reasonable objection there's also little point trying to support it in PyO3 for now. https://github.com/python/cpython/blob/c741ad3537193c63fe697a8f0316aecd45eeb9ba/Include/internal/pycore_instruments.h#L63

So 👍 from me to move ahead with just setprofile / settrace for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants