Avoiding de-optimization points due to Py_DECREF and allocation
#402
Replies: 6 comments
-
The allocator solution would seem simpler than the … Regarding DECREF case (2), the recursive … Another thought: I guess sometimes specialization for a given type tells us which category a DECREF falls into. Other times we'd have to look in the type. The former seems more attractive to experiment with (simpler).
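A minimal sketch of the flag-based classification this comment alludes to. All names here (`obj_type`, `SIMPLE_DEALLOC`, `decref_categorized`, …) are invented for illustration and are not CPython API; the real mechanism would hang off `tp_flags` or similar.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object layout -- not the CPython API. */
typedef struct obj_type {
    unsigned int flags;          /* dealloc-category flags */
    void (*dealloc)(void *obj);  /* may be NULL in this sketch */
} obj_type;

enum {
    SIMPLE_DEALLOC = 1,    /* category 1: just free the memory */
    RECURSIVE_DEALLOC = 2  /* category 2: decref contents, then free */
    /* neither bit set: category 3, may run arbitrary code */
};

typedef struct obj {
    long refcnt;
    obj_type *type;
} obj;

/* Returns 1 if the dealloc was handled inline, 0 if the object
 * must be deferred to a pending-dealloc list. */
int decref_categorized(obj *o)
{
    if (--o->refcnt != 0)
        return 1;
    if (o->type->flags & (SIMPLE_DEALLOC | RECURSIVE_DEALLOC)) {
        if (o->type->dealloc)
            o->type->dealloc(o);  /* known safe: no arbitrary code */
        return 1;
    }
    return 0;  /* caller pushes o onto a pending-dealloc list */
}
```

The point of the flag check is that the fast path never has to consult anything beyond the type word, which is what makes it attractive to bake into specialized instructions.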
-
It's not that complex, but it does require an IR that makes increfs and decrefs visible. FWIW, this is the Cinder JIT approach: our HIR (high-level IR) is roughly similar to Python bytecode but slightly lower level. Notably, increfs and decrefs are explicit in HIR (though we actually insert them automatically in an analysis pass). Thus we could very easily move all decref operations to the end of a JIT-compiled function, and we've considered doing so, but so far haven't due to the potential compatibility impact.
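A toy model of what "moving all decref operations to the end" could mean. This is not Cinder's HIR (which is an internal IR, not a C API); it is just a sketch of the batching idea with invented names: record would-be decrefs during the function body and apply them in one batch at the single exit point.

```c
#include <assert.h>

#define MAX_DEFERRED 64

typedef struct {
    long *slots[MAX_DEFERRED];  /* refcount fields to decrement at exit */
    int n;
} deferred_decrefs;

/* Instead of decrementing now (which might trigger a dealloc
 * mid-function and force de-optimization), remember the refcount. */
void defer_decref(deferred_decrefs *d, long *refcnt)
{
    if (d->n < MAX_DEFERRED)
        d->slots[d->n++] = refcnt;
}

/* At the function's exit, apply them all; any resulting deallocation
 * (not modeled here) then happens at one well-defined program point. */
void flush_decrefs(deferred_decrefs *d)
{
    for (int i = 0; i < d->n; i++)
        (*d->slots[i])--;
    d->n = 0;
}
```

The compatibility impact mentioned above comes precisely from this batching: destructors that used to run mid-function now run later, which observably reorders side effects.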
-
Chiming in to provide a PyPy perspective on the potential compatibility impact of moving object collection. We have fixed many of the problems of libraries doing too much in …
-
Don't objects that wrap a file descriptor necessarily need to close that file descriptor in their destructor? (Ditto for network or database connections, etc.)
-
Of course, resource closing in a destructor is still required as a last-resort defense to prevent resource leaks. However, the trend towards using context managers and providing a … All this was in an attempt to discuss the potential compatibility impact of "moving all decref operations to the end of a jit-compiled function".
-
Here's an outline plan for avoiding running arbitrary code in … We need a new flag for … While we're changing the interface, we might as well change the whole thing to accept the interpreter as an argument.

```c
void Py_DECREF2(PyInterpreter *interp, PyObject *obj)
{
    if (--obj->ob_refcnt == 0) {
        Py_TYPE(obj)->tp_dealloc2(interp, obj);
    }
}

void safe_dealloc_wrapper(PyInterpreter *interp, PyObject *obj)
{
    PyList_Append(interp->pending_unsafe_dealloc, obj);
    _Py_SetPendingFinalizer(interp); // Sets bit in the eval breaker.
}
```

For classes that need finalizers, but have safe deallocation functions, we need a slightly different function.

```c
void dealloc_maybe_finalize(PyInterpreter *interp, PyObject *obj)
{
    if (NEEDS_FINALIZING(obj)) {
        PyList_Append(interp->pending_finalizer_list, obj);
        _Py_SetPendingFinalizer(interp); // Sets bit in the eval breaker.
        return;
    }
    /* Do the deallocation here */
}
```

Passing the interpreter to the dealloc function will allow it to efficiently access the relevant freelist, so this has no extra cost even in contexts where the interpreter is not already at hand.

```c
void Py_DECREF(PyObject *obj)
{
    if (--obj->ob_refcnt == 0) {
        Py_TYPE(obj)->tp_dealloc2(_PyInterpreterState_GET(), obj);
    }
}
```
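To see the whole deferral mechanism end to end, here is a standalone toy model of the plan above. The `toy_interp`, `pending` list, and `eval_breaker` names are illustrative stand-ins, not the real CPython structures: decref pushes unsafe objects onto a per-interpreter list and sets an eval-breaker bit, and the eval loop drains the list at its next safe point.

```c
#include <assert.h>
#include <stdlib.h>

#define PENDING_DEALLOC_BIT 1u

typedef struct toy_obj {
    long refcnt;
    int needs_finalizing;          /* stand-in for a tp_flags check */
    struct toy_obj *pending_next;  /* intrusive pending-dealloc list */
    int *finalized_counter;        /* lets the demo observe finalization */
} toy_obj;

typedef struct toy_interp {
    toy_obj *pending;       /* objects awaiting complex deallocation */
    unsigned eval_breaker;  /* bit set => eval loop must take a detour */
} toy_interp;

void toy_decref(toy_interp *interp, toy_obj *obj)
{
    if (--obj->refcnt != 0)
        return;
    if (obj->needs_finalizing) {
        /* Defer: just enqueue and flag the eval breaker. */
        obj->pending_next = interp->pending;
        interp->pending = obj;
        interp->eval_breaker |= PENDING_DEALLOC_BIT;
        return;
    }
    free(obj);  /* safe case: plain memory release */
}

/* Called from the eval loop when the eval_breaker bit is seen.
 * Returns the number of deferred objects processed. */
int toy_run_pending(toy_interp *interp)
{
    int n = 0;
    while (interp->pending) {
        toy_obj *obj = interp->pending;
        interp->pending = obj->pending_next;
        (*obj->finalized_counter)++;  /* "run" the finalizer here */
        free(obj);
        n++;
    }
    interp->eval_breaker &= ~PENDING_DEALLOC_BIT;
    return n;
}
```

The key property is that `toy_decref` itself never runs a finalizer, so a JIT-compiled region can treat it as a call that cannot re-enter the VM.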
-
When optimizing a region of code, we want two things: …

However, every time we hit a potential call into C code, we need to restore the VM state and throw away all our information. This means that any `Py_DECREF` or allocation forces an expensive de-optimization, as either can potentially call arbitrary code.

This is bad, so how can we fix it?
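To make the cost concrete, here is a toy model (all names invented, not CPython internals) of why a potential C call forces a spill: values the optimizer keeps in "registers" (here, C locals) must be written back to the materialized frame before any call that might observe or mutate interpreter state.

```c
#include <assert.h>

typedef struct {
    long stack[8];  /* the interpreter-visible value stack */
    int top;
} vm_frame;

/* Anything that may re-enter the VM (a dealloc, a finalizer, the GC)
 * may inspect the frame, so cached values must be flushed first. */
void spill(vm_frame *f, long cached0, long cached1)
{
    f->stack[0] = cached0;
    f->stack[1] = cached1;
    f->top = 2;
}

/* An optimized region: works purely on C locals, spilling only when
 * it crosses a point that could call arbitrary code. */
long optimized_region(vm_frame *f, long a, long b, int must_call_c)
{
    long r0 = a + b;  /* kept in "registers" */
    long r1 = a * b;
    if (must_call_c) {
        spill(f, r0, r1);  /* de-optimization point: state goes to memory */
    }
    return r0 + r1;
}
```

Every `Py_DECREF` or allocation site is potentially such a `must_call_c` point, which is why making them provably safe is valuable.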
### `Py_DECREF`

We have a couple of options to prevent `Py_DECREF` causing excessive de-optimization:

- Making `_Py_Dealloc` safe, by deferring finalizers and untrusted deallocator functions.
- …

### Making `_Py_Dealloc` safe

We can classify objects into three groups by what deallocating them involves:

1. Just freeing the memory.
2. `Py_DECREF`ing the references to some other objects and then freeing the memory.
3. Running arbitrary code (finalizers, or deallocator functions we cannot trust).

By adding a flag (or two) to the type (or to the object, if we get saturating reference counts), we can handle cases 1 and 2 in `_Py_Dealloc` and push the remaining objects to a list to be deallocated later.

We need to de-optimize whenever we check for interrupts and the like, so `_Py_Dealloc` can indicate that there are objects needing complex deallocation by setting a bit in the `eval_breaker` variable, ensuring reasonably prompt deallocation without hurting optimization.
### Allocation

This can be handled much like `_Py_Dealloc`: instead of calling the cycle GC in the allocator, we set a bit in the `eval_breaker` variable and call the cycle GC when it is safe to do so.
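A sketch of the allocation side under the same assumptions (toy names, not CPython API): when the allocator decides a collection is due, it only sets an eval-breaker bit, and the cycle GC runs later from the eval loop, where de-optimizing is already expected.

```c
#include <assert.h>
#include <stdlib.h>

#define PENDING_GC_BIT 2u

typedef struct toy_heap {
    size_t live;            /* number of live allocations */
    size_t gc_threshold;    /* request a collection at this many */
    unsigned eval_breaker;  /* PENDING_GC_BIT requests a collection */
    int collections;        /* how many collections have run */
} toy_heap;

void *toy_alloc(toy_heap *heap, size_t size)
{
    if (++heap->live >= heap->gc_threshold) {
        /* Don't collect here: arbitrary finalizers could run. */
        heap->eval_breaker |= PENDING_GC_BIT;
    }
    return malloc(size);
}

/* Eval loop: runs the cycle GC at a point where the VM state is
 * already materialized, so no extra de-optimization is needed. */
void toy_maybe_collect(toy_heap *heap)
{
    if (heap->eval_breaker & PENDING_GC_BIT) {
        heap->collections++;  /* stand-in for the real cycle GC */
        heap->live = 0;       /* pretend everything was reclaimed */
        heap->eval_breaker &= ~PENDING_GC_BIT;
    }
}
```

With this split, `toy_alloc` never runs user code, so an optimized region can allocate without becoming a de-optimization point.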