Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DOC] Document memory resource lifetime requirements when taking ownership of device buffers in Python #1492

Closed
wence- opened this issue Mar 4, 2024 · 3 comments · Fixed by #1552
Assignees
Labels
doc Documentation

Comments

@wence-
Copy link
Contributor

wence- commented Mar 4, 2024

An rmm::device_buffer holds a (effectively) raw pointer to the memory resource used for its allocation and deallocation. On the C++ side, it is the responsibility of the constructor a device_buffer to ensure that the memory resource lives at least as long as the buffer.

When we expose a device_buffer to Python through the cython DeviceBuffer class (specifically, when we use DeviceBuffer.c_from_unique_ptr which is used in many places in cudf), we have no way of inspecting the provenance of the memory resource used to allocate that buffer, so we have no way of keeping it alive.

Right now, the way we handle this in the cython wrappers is to throw our hands up in the air, grab the current_device_resource, cross our fingers and hope.

This has worked so far because usually no-one switches out a memory resource from under us, and generally speaking in cudf we're taking ownership of buffers that have ultimately been allocated on the python side, but it's not actually safe.

Unless we're willing to make the memory resource slot in a device buffer hold a smart pointer, and I believe there are good reasons why we do not do that at the moment, the best we can do is document the requirements on the Python side and then obey them when using RMM from cudf.

@vyasr
Copy link
Contributor

vyasr commented Mar 4, 2024

For a cudf-specific example and discussion, see rapidsai/cudf#14229

@jrhemstad
Copy link
Contributor

jrhemstad commented Mar 4, 2024

When we expose a device_buffer to Python through the cython DeviceBuffer class (specifically, when we use DeviceBuffer.c_from_unique_ptr which is used in many places in cudf), we have no way of inspecting the provenance of the memory resource used to allocate that buffer

You can't inspect the provenance after the fact, no, but you control what resource is used to construct the buffer in the first place, yes?

A sketch of what I mean:

struct buffer_that_owns_resource{
   device_buffer b_;
   std::shared_ptr<device_memory_resource> mr_;
};


std::shared_ptr<device_memory_resource> mr{...};
device_buffer b = cudf::some_function(..., mr.get());

auto owning = buffer_that_owns_resource{std::move(b), mr};

This enables you to have the effective equivalent of if device_buffer owned its mr.

@wence-
Copy link
Contributor Author

wence- commented Mar 5, 2024

You can't inspect the provenance after the fact, no, but you control what resource is used to construct the buffer in the first place, yes?

We do, although we don't currently use what you propose (or a moral equivalent thereof) in cudf-python for (I presume) historical reasons.

However, since we're continuing to advocate for rmm as a cross-library allocator, we need to make these requirements (I think) clearer in the docs than they are currently.

@wence- wence- removed the ? - Needs Triage Need team to review and classify label May 1, 2024
wence- added a commit to wence-/rmm that referenced this issue May 3, 2024
On the C++ side, device_buffers store raw pointers for the memory
resource that was used in their allocation. Consequently, it is unsafe
to take ownership of a device_buffer in Python unless we controlled
the provenance of the memory resource that was used in its allocation.
The only way to do that is to pass the memory resource from Python
into C++ and then use it when constructing the DeviceBuffer.

Add discussion of this with some examples and a section on pitfalls
if only relying on get_current_device_resource and
set_current_device_resource.

- Closes rapidsai#1492
wence- added a commit to wence-/rmm that referenced this issue May 3, 2024
On the C++ side, device_buffers store raw pointers for the memory
resource that was used in their allocation. Consequently, it is unsafe
to take ownership of a device_buffer in Python unless we controlled
the provenance of the memory resource that was used in its allocation.
The only way to do that is to pass the memory resource from Python
into C++ and then use it when constructing the DeviceBuffer.

Add discussion of this with some examples and a section on pitfalls
if only relying on get_current_device_resource and
set_current_device_resource.

- Closes rapidsai#1492
wence- added a commit to wence-/rmm that referenced this issue May 3, 2024
On the C++ side, device_buffers store raw pointers for the memory
resource that was used in their allocation. Consequently, it is unsafe
to take ownership of a device_buffer in Python unless we controlled
the provenance of the memory resource that was used in its allocation.
The only way to do that is to pass the memory resource from Python
into C++ and then use it when constructing the DeviceBuffer.

Add discussion of this with some examples and a section on pitfalls
if only relying on get_current_device_resource and
set_current_device_resource.

- Closes rapidsai#1492
rapids-bot bot pushed a commit that referenced this issue May 16, 2024
…hip requirements in Python/C++ interfacing (#1552)

On the C++ side, device_buffers store raw pointers for the memory
resource that was used in their allocation. Consequently, it is unsafe
to take ownership of a device_buffer in Python unless we controlled
the provenance of the memory resource that was used in its allocation.
The only way to do that is to pass the memory resource from Python
into C++ and then use it when constructing the DeviceBuffer.

Add discussion of this with some examples and a section on pitfalls
if only relying on get_current_device_resource and
set_current_device_resource.

To allow Python users of `DeviceBuffer` objects to follow best practices,
introduce explicit (defaulting to the current device resource) `mr` 
arguments in both `c_from_unique_ptr` and the `DeviceBuffer` constructor.

- Closes #1492

Authors:
  - Lawrence Mitchell (https://github.com/wence-)

Approvers:
  - Mark Harris (https://github.com/harrism)
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #1552
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
doc Documentation
Projects
Status: Done
3 participants