Memory leak when jit compiling #66434
Labels: comp:ops, stat:awaiting response, TF 2.16, type:bug
Issue type
Bug
Have you reproduced the bug with TensorFlow Nightly?
No
Source
source
TensorFlow version
2.16
Custom code
Yes
OS platform and distribution
Linux Ubuntu 22.04.3 LTS
Mobile device
No response
Python version
3.10
Bazel version
No response
GCC/compiler version
No response
CUDA/cuDNN version
No response
GPU model and memory
No response
Current behavior?
I have two functions that are jit compiled. I call them in my training loop's step function (which is not jit compiled, just run in graph mode). With these two functions jit compiled I get a register-spill warning as follows:
2024-04-24 12:58:52.550355: I tensorflow/stream_executor/gpu/asm_compiler.cc:323] ptxas warning : Registers are spilled to local memory in function '__cuda_sm20_div_rn_f64_full', 8 bytes spill stores, 8 bytes spill loads ptxas warning : Registers are spilled to local memory in function '__cuda_sm20_div_rn_f64_full', 8 bytes spill stores, 8 bytes spill loads
This makes the training much slower.
Without jit compiling those two functions I don't get this warning.
My problem is similar to the one in this forum thread, where a user points to the open issue #56423. However, that issue attributes the problem to the distribution strategy, whereas in my case I am using just one GPU. Someone in that issue also mentioned that performing the optimizer step outside the jit-compiled function worked for them. In my case, as said above, my training step function is not jit compiled, so the optimizer step is already outside any jit-compiled function.

Standalone code to reproduce the issue
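A minimal sketch of the setup described above (the model, loss, shapes, and variable names are illustrative placeholders, not the actual code; float64 is used since the warning refers to f64 division):

```python
import tensorflow as tf

# Placeholder parameters; float64 throughout, matching the f64 division
# named in the ptxas warning.
w = tf.Variable(tf.random.normal([4, 1], dtype=tf.float64))
b = tf.Variable(tf.zeros([1], dtype=tf.float64))
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)

# The two jit-compiled helper functions.
@tf.function(jit_compile=True)
def forward(x):
    return tf.matmul(x, w) + b

@tf.function(jit_compile=True)
def compute_loss(y_true, y_pred):
    # float64 division, one candidate for the
    # __cuda_sm20_div_rn_f64_full spill in the warning.
    n = tf.cast(tf.shape(y_true)[0], tf.float64)
    return tf.reduce_sum(tf.square(y_true - y_pred)) / n

# The training step runs in plain graph mode (no jit_compile), so the
# optimizer update is already outside any jit-compiled function.
@tf.function
def train_step(x, y):
    with tf.GradientTape() as tape:
        loss = compute_loss(y, forward(x))
    grads = tape.gradient(loss, [w, b])
    optimizer.apply_gradients(zip(grads, [w, b]))
    return loss

x = tf.random.normal([32, 4], dtype=tf.float64)
y = tf.random.normal([32, 1], dtype=tf.float64)
for _ in range(100):
    train_step(x, y)
```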
Relevant log output
No response