[FEA] Track mrs to warn during cleanup by reinitialize #1314

Open
vyasr opened this issue Jul 27, 2023 · 0 comments
Labels
? - Needs Triage (Need team to review and classify), feature request (New feature or request)

Is your feature request related to a problem? Please describe.
rmm.reinitialize cleans up any internal references to memory resources before creating new instances. However, it currently has no way to track user-created resources. This means that users who manually create memory resource objects must also delete all references to them in their own code before calling reinitialize. This behavior is neither documented nor easily evident, and it usually manifests as unexpectedly high total memory consumption.

Describe the solution you'd like
We should add a simple metaclass for DeviceMemoryResource that keeps track of all instances that have been created so far. Then, rmm.reinitialize can check this list of references and warn the user if any references remain that rmm cannot clean up itself. That will give users more immediate feedback that something is wrong here.
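A minimal pure-Python sketch of the tracking idea, to make the proposal concrete. Everything here is a stand-in: `TrackedMeta`, `FakeMemoryResource`, and `reinitialize_with_warning` are hypothetical names, not rmm APIs, and the real DeviceMemoryResource may need a different mechanism than a plain Python metaclass. The registry holds weak references so that tracking itself never keeps a resource alive.

```python
import gc
import warnings
import weakref


class TrackedMeta(type):
    """Hypothetical metaclass that records every live instance it creates."""

    # Weak references only: the registry must not extend an mr's lifetime.
    _instances = weakref.WeakSet()

    def __call__(cls, *args, **kwargs):
        obj = super().__call__(*args, **kwargs)
        TrackedMeta._instances.add(obj)
        return obj

    @classmethod
    def live_instances(mcs):
        """Return the instances that are still referenced somewhere."""
        return list(mcs._instances)


class FakeMemoryResource(metaclass=TrackedMeta):
    """Stand-in for DeviceMemoryResource (assumption, not the rmm class)."""


def reinitialize_with_warning():
    """Stand-in for rmm.reinitialize: warn if user-held resources remain."""
    gc.collect()  # drop unreachable resources before checking the registry
    remaining = TrackedMeta.live_instances()
    if remaining:
        warnings.warn(
            f"{len(remaining)} memory resource(s) are still referenced by "
            "user code and cannot be cleaned up by reinitialize.",
            RuntimeWarning,
            stacklevel=2,
        )
    return len(remaining)
```

With this sketch, a resource that user code still holds triggers the warning, and deleting the last reference before the call silences it.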

Describe alternatives you've considered
Rather than warning, we could raise an exception so that users have to fix the issue immediately. It's likely that many users won't notice a warning. OTOH for some use cases it may be acceptable for the old mrs to persist, and erroring is more intrusive.

Alternatively, we could outfit DeviceMemoryResource with the ability to be invalidated in some way so that all outstanding mrs during reinitialize are marked as unusable. Then any future function calls would trigger errors. This approach would require significantly more technical investment, and it's not clear that there's a huge benefit. It would guarantee all memory being returned on reinitialize at the expense of a more confusing user experience (they would encounter an error potentially long after the reinitialize when they tried to use old mrs).
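For comparison, the invalidation alternative might look roughly like the sketch below. Again, `InvalidatableResource`, its `allocate` and `invalidate` methods, and `reinitialize_and_invalidate` are hypothetical stand-ins rather than rmm APIs, and the `bytearray` is only a placeholder for a device allocation. It illustrates the delayed-error trade-off: the old object still exists after reinitialize, but fails on its next use.

```python
import weakref


class InvalidatableResource:
    """Stand-in for a memory resource that can be marked unusable."""

    # Weak registry of every resource created so far (assumption).
    _live = weakref.WeakSet()

    def __init__(self):
        self._valid = True
        InvalidatableResource._live.add(self)

    def allocate(self, nbytes):
        if not self._valid:
            raise RuntimeError(
                "memory resource was invalidated by reinitialize"
            )
        return bytearray(nbytes)  # placeholder for a device allocation

    def invalidate(self):
        # A real implementation would also release the underlying memory.
        self._valid = False


def reinitialize_and_invalidate():
    """Stand-in for rmm.reinitialize: invalidate all outstanding resources."""
    stale = list(InvalidatableResource._live)
    for mr in stale:
        mr.invalidate()
    return len(stale)
```

The error surfaces at the later `allocate` call rather than at reinitialize, which is the confusing user experience described above.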

vyasr added the "feature request" and "? - Needs Triage" labels Jul 27, 2023