
Memory usage of TensorFlow models #2173

Open
ndeepesh opened this issue Aug 16, 2023 · 7 comments

@ndeepesh

We want to monitor the memory usage of the TensorFlow Serving runtime on a per-model basis. Currently we can get the total memory used by TensorFlow, but we don't have a way to break this down per model. We would like a solution that works for both CPU and GPU memory.

No alternatives are available that we know of.

This is similar to #2156, where I wanted to follow up on whether there are any solutions for CPUs.
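
For context, a minimal sketch of the total-but-not-per-model view described above: reading process-wide memory for the model server. This assumes a local `tensorflow_model_server` process and that `psutil` is installed; the helper name is hypothetical, not part of any TF Serving API.

```python
# Minimal sketch (assumptions: psutil is installed, tensorflow_model_server runs locally).
# Shows what is already obtainable today: total resident memory of the serving process,
# with no per-model breakdown.
import psutil


def model_server_rss_bytes():
    """Return total RSS of the first tensorflow_model_server process found, else None."""
    for proc in psutil.process_iter(["pid", "name"]):
        try:
            if "tensorflow_model_server" in " ".join(proc.cmdline()):
                return proc.memory_info().rss
        except (psutil.NoSuchProcess, psutil.AccessDenied, psutil.ZombieProcess):
            continue
    return None


if __name__ == "__main__":
    rss = model_server_rss_bytes()
    print(f"TF Serving total RSS: {rss} bytes" if rss is not None else "model server not found")
```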

@ndeepesh
Author

@singhniraj08 Any pointers on the above?

@ndeepesh
Author

Hey @singhniraj08, can you point us to any workarounds in the meantime?

@singhniraj08

@ndeepesh, can you try using the Memory profiling tool to see if that helps? Since the models are served online by the C++ model server, tracking per-model memory usage is difficult in that case.

I am raising this issue internally to find a solution and we will update this thread. Thank you!
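
For reference, a minimal sketch of capturing a profile from a running model server so the TensorBoard Memory Profile tab has data to inspect. The address, log directory, and capture duration are assumptions about the deployment, and the server must expose the profiler service on its gRPC port.

```python
# Sketch only: capture a profile from TF Serving for TensorBoard's Memory Profile tab.
# Assumptions: default gRPC port 8500, a writable log directory, and a model server
# build that exposes the profiler service.
import tensorflow as tf

SERVICE_ADDR = "grpc://localhost:8500"  # assumed TF Serving gRPC address
LOGDIR = "/tmp/tfserving_profile"       # assumed log directory for the captured trace
DURATION_MS = 5000                      # capture window; send inference traffic during it

tf.profiler.experimental.client.trace(SERVICE_ADDR, LOGDIR, DURATION_MS)
# Afterwards: `tensorboard --logdir /tmp/tfserving_profile`, then open the Profile tab.
```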

@ndeepesh
Author

ndeepesh commented Aug 23, 2023

Thanks @singhniraj08. Without a good per-model estimate, this is causing regressions on our hosts: we hit OOM easily with no way to tell which model is responsible.

@ndeepesh
Author

@singhniraj08 Does the memory profile tool get populated when profiling on CPUs? I haven't seen it populated in CPU profiles.

singhniraj08 self-assigned this Aug 28, 2023
@singhniraj08

@ndeepesh, the Memory profiling tool monitors the memory usage of your device during the profiling interval. If you are looking for CPU usage while running TF Serving, Profiling inference requests with TensorBoard may help you achieve that. Thanks.
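
As a concrete example of "profiling inference requests", one can drive traffic at the server's REST endpoint while a capture like the one above is running, so the trace has requests to attribute work to. The model name, port, and payload below are placeholders, not taken from this thread.

```python
# Hypothetical traffic generator for the profiling window. The model name "my_model",
# port 8501, and the input payload are placeholders for the real deployment.
import json

import requests  # assumption: requests is available

URL = "http://localhost:8501/v1/models/my_model:predict"  # TF Serving REST predict endpoint
PAYLOAD = {"instances": [[1.0, 2.0, 3.0]]}                 # input shape depends on the model

for _ in range(100):
    resp = requests.post(URL, data=json.dumps(PAYLOAD))
    resp.raise_for_status()
```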

@ndeepesh
Author

ndeepesh commented Sep 2, 2023

@singhniraj08 This only tells us how much time each TensorFlow op takes, right? Not how much memory it occupies on the CPU.
