
Memory usage of TensorFlow models #2173

Open
ndeepesh opened this issue Aug 16, 2023 · 7 comments

@ndeepesh

We want to monitor the memory usage of the TensorFlow Serving runtime on a per-model basis. Currently we can get the total memory used by TensorFlow, but we don't have a way to break this down per model. We would like a solution that works for both CPU and GPU memory.

No alternatives are available that we know of.

This is similar to #2156, where I wanted to follow up on whether there are any solutions for CPUs.
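
For context, a minimal sketch of the total-but-not-per-model view described above: reading process-wide memory for the model server. This assumes a local `tensorflow_model_server` process and that `psutil` is installed; the helper name is hypothetical, not part of any TF Serving API.

```python
# Minimal sketch (assumptions: psutil is installed, tensorflow_model_server runs locally).
# Shows what is already obtainable today: total resident memory of the serving process,
# with no per-model breakdown.
import psutil


def model_server_rss_bytes():
    """Return total RSS of the first tensorflow_model_server process found, else None."""
    for proc in psutil.process_iter(["pid", "name"]):
        try:
            if "tensorflow_model_server" in " ".join(proc.cmdline()):
                return proc.memory_info().rss
        except (psutil.NoSuchProcess, psutil.AccessDenied, psutil.ZombieProcess):
            continue
    return None


if __name__ == "__main__":
    rss = model_server_rss_bytes()
    print(f"TF Serving total RSS: {rss} bytes" if rss is not None else "model server not found")
```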

@ndeepesh
Author

@singhniraj08 Any pointers on the above?

@ndeepesh
Author

Hey @singhniraj08, can you point us to any workarounds in the meantime?

@singhniraj08

@ndeepesh, can you try using the Memory profiling tool to see if that helps? Since the models are served online by the C++ model server, tracking per-model memory usage is difficult in that case.

I am raising this issue internally to find a solution and we will update this thread. Thank you!
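
For reference, a minimal sketch of capturing a profile from a running model server so the TensorBoard Memory Profile tab has data to inspect. The address, log directory, and capture duration are assumptions about the deployment, and the server must expose the profiler service on its gRPC port.

```python
# Sketch only: capture a profile from TF Serving for TensorBoard's Memory Profile tab.
# Assumptions: default gRPC port 8500, a writable log directory, and a model server
# build that exposes the profiler service.
import tensorflow as tf

SERVICE_ADDR = "grpc://localhost:8500"  # assumed TF Serving gRPC address
LOGDIR = "/tmp/tfserving_profile"       # assumed log directory for the captured trace
DURATION_MS = 5000                      # capture window; send inference traffic during it

tf.profiler.experimental.client.trace(SERVICE_ADDR, LOGDIR, DURATION_MS)
# Afterwards: `tensorboard --logdir /tmp/tfserving_profile`, then open the Profile tab.
```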

@ndeepesh
Author

ndeepesh commented Aug 23, 2023

Thanks @singhniraj08. Without a good per-model estimate, this is causing regressions on our hosts: we hit OOM easily with no way to tell which model is responsible.

@ndeepesh
Author

@singhniraj08 Does the memory profile tool get populated when profiling on CPUs? I haven't seen it populated in CPU profiles.

singhniraj08 self-assigned this Aug 28, 2023
@singhniraj08

@ndeepesh, the Memory profiling tool monitors the memory usage of your device during the profiling interval. If you are looking for CPU usage while running TF Serving, Profiling inference requests with TensorBoard may help you achieve that. Thanks.
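
As a concrete example of "profiling inference requests", one can drive traffic at the server's REST endpoint while a capture like the one above is running, so the trace has requests to attribute work to. The model name, port, and payload below are placeholders, not taken from this thread.

```python
# Hypothetical traffic generator for the profiling window. The model name "my_model",
# port 8501, and the input payload are placeholders for the real deployment.
import json

import requests  # assumption: requests is available

URL = "http://localhost:8501/v1/models/my_model:predict"  # TF Serving REST predict endpoint
PAYLOAD = {"instances": [[1.0, 2.0, 3.0]]}                 # input shape depends on the model

for _ in range(100):
    resp = requests.post(URL, data=json.dumps(PAYLOAD))
    resp.raise_for_status()
```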

@ndeepesh
Author

ndeepesh commented Sep 2, 2023

@singhniraj08 This only tells us how much time each TensorFlow op takes, right? Not how much memory it occupies on the CPU.
