Why is TF Serving using one CUDA Compute Stream? #2221

ndeep27 · 2024-05-06T22:36:21Z

I'm trying to understand why TF uses a single CUDA compute stream. Is there a metric that shows whether ops are waiting to be scheduled on that one stream? I want to know whether ops are queuing up in high-QPS scenarios.

singhniraj08 · 2024-05-08T08:53:39Z

@ndeep27,
This does not look like an issue on the TensorFlow Serving side. Since it is not a bug or feature request, the question is better asked on the TensorFlow Forum, where a larger community reads questions. Thank you!

github-actions · 2024-05-16T01:49:47Z

This issue has been marked stale because it has had no activity for 7 days. It will be closed if no further activity occurs. Thank you.

github-actions · 2024-05-24T01:49:56Z

This issue was closed due to lack of activity after being marked stale for the past 7 days.

google-ml-butler · 2024-05-24T01:49:58Z

Are you satisfied with the resolution of your issue?

singhniraj08 self-assigned this May 7, 2024

singhniraj08 added the type:support label May 7, 2024

singhniraj08 added the stat:awaiting response label May 8, 2024

github-actions bot added the stale label May 16, 2024

github-actions bot closed this as completed May 24, 2024
