Memory leak when jit compiling #66434
Labels: comp:ops, stat:awaiting response, TF 2.16, type:bug
Issue type
Bug
Have you reproduced the bug with TensorFlow Nightly?
No
Source
source
TensorFlow version
2.16
Custom code
Yes
OS platform and distribution
Linux Ubuntu 22.04.3 LTS
Mobile device
No response
Python version
3.10
Bazel version
No response
GCC/compiler version
No response
CUDA/cuDNN version
No response
GPU model and memory
No response
Current behavior?
I have two functions that are jit compiled. I call them in my training loop's step function (which is not jit compiled, just run in graph mode). With these two functions jit compiled I get a register-spill warning as follows:
2024-04-24 12:58:52.550355: I tensorflow/stream_executor/gpu/asm_compiler.cc:323] ptxas warning : Registers are spilled to local memory in function '__cuda_sm20_div_rn_f64_full', 8 bytes spill stores, 8 bytes spill loads ptxas warning : Registers are spilled to local memory in function '__cuda_sm20_div_rn_f64_full', 8 bytes spill stores, 8 bytes spill loads
This makes the training much slower.
Without jit compiling those two functions I don't get this warning.
My problem is similar to the one in this forum thread, where a user points to the open issue #56423. However, that issue attributes the problem to the distribution strategy, whereas in my case I am using just one GPU. Someone in that issue also mentioned that performing the optimizer step outside the jit-compiled function worked for them. In my case, as said above, my training step function is not jit compiled, so the optimizer step is already outside any jit-compiled function.

Standalone code to reproduce the issue
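A minimal sketch of the setup described above (the model, loss, shapes, and variable names are illustrative placeholders, not the actual code; float64 is used since the warning refers to f64 division):

```python
import tensorflow as tf

# Placeholder parameters; float64 throughout, matching the f64 division
# named in the ptxas warning.
w = tf.Variable(tf.random.normal([4, 1], dtype=tf.float64))
b = tf.Variable(tf.zeros([1], dtype=tf.float64))
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)

# The two jit-compiled helper functions.
@tf.function(jit_compile=True)
def forward(x):
    return tf.matmul(x, w) + b

@tf.function(jit_compile=True)
def compute_loss(y_true, y_pred):
    # float64 division, one candidate for the
    # __cuda_sm20_div_rn_f64_full spill in the warning.
    n = tf.cast(tf.shape(y_true)[0], tf.float64)
    return tf.reduce_sum(tf.square(y_true - y_pred)) / n

# The training step runs in plain graph mode (no jit_compile), so the
# optimizer update is already outside any jit-compiled function.
@tf.function
def train_step(x, y):
    with tf.GradientTape() as tape:
        loss = compute_loss(y, forward(x))
    grads = tape.gradient(loss, [w, b])
    optimizer.apply_gradients(zip(grads, [w, b]))
    return loss

x = tf.random.normal([32, 4], dtype=tf.float64)
y = tf.random.normal([32, 1], dtype=tf.float64)
for _ in range(100):
    train_step(x, y)
```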
Relevant log output
No response