E tensorflow/stream_executor/dnn.cc:616] CUDNN_STATUS_EXECUTION_FAILED #41993

summa-code · 2020-08-03T03:06:28Z

Please make sure that this is a bug. As per our
GitHub Policy,
we only address code/doc bugs, performance issues, feature requests and
build/installation issues on GitHub. tag:bug_template

System information

Have I written custom code (as opposed to using a stock example script provided in TensorFlow): No
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): 20.04
TensorFlow installed from (source or binary): Latest
TensorFlow version (use command below): Latest
Python version: 3.8.1
Bazel version (if compiling from source): 3.1.0
GCC/Compiler version (if compiling from source):
CUDA/cuDNN version: 11.0
GPU model and memory: RTX 2070

https://www.tensorflow.org/tutorials/structured_data/time_series

E tensorflow/stream_executor/dnn.cc:616] CUDNN_STATUS_EXECUTION_FAILED
in tensorflow/stream_executor/cuda/cuda_dnn.cc(1831): 'cudnnRNNForwardTraining( cudnn.handle(), rnn_desc.handle(), model_dims.max_seq_length, input_desc.handles(), input_data.opaque(), input_h_desc.handle(), input_h_data.opaque(), input_c_desc.handle(), input_c_data.opaque(), rnn_desc.params_handle(), params.opaque(), output_desc.handles(), output_data->opaque(), output_h_desc.handle(), output_h_data->opaque(), output_c_desc.handle(), output_c_data->opaque(), workspace.opaque(), workspace.size(), reserve_space.opaque(), reserve_space.size())'
2020-08-02 22:58:40.548649: W tensorflow/core/framework/op_kernel.cc:1772] OP_REQUIRES failed at cudnn_rnn_ops.cc:1517 : Internal: Failed to call ThenRnnForward with model config: [rnn_mode, rnn_input_mode, rnn_direction_mode]: 2, 0, 0 , [num_layers, input_size, num_units, dir_count, max_seq_length, batch_size, cell_num_units]: [1, 19, 32, 1, 24, 32, 32]

Saduf2019 · 2020-08-03T05:06:39Z

@summa-code
Please provide with simple stand alone code or a colab gist with the error for us to analyse the issue.

applied-machinelearning · 2020-08-03T15:21:08Z

As LSTM is used with CUDA, it is probably this: #41630 , you can try the workaround with the environment variable TF_CUDNN_RESET_RND_GEN_STATE=1

summa-code · 2020-08-03T19:47:13Z

NOPE, it did not work, here is console output

CUDNN_STATUS_EXECUTION_FAILED
in tensorflow/stream_executor/cuda/cuda_dnn.cc(1936): 'cudnnRNNBackwardData( cudnn.handle(), rnn_desc.handle(), model_dims.max_seq_length, output_desc.handles(), output_data.opaque(), output_desc.handles(), output_backprop_data.opaque(), output_h_desc.handle(), output_h_backprop_data.opaque(), output_c_desc.handle(), output_c_backprop_data.opaque(), rnn_desc.params_handle(), params.opaque(), input_h_desc.handle(), input_h_data.opaque(), input_c_desc.handle(), input_c_data.opaque(), input_desc.handles(), input_backprop_data->opaque(), input_h_desc.handle(), input_h_backprop_data->opaque(), input_c_desc.handle(), input_c_backprop_data->opaque(), workspace.opaque(), workspace.size(), reserve_space_data->opaque(), reserve_space_data->size())'

On the notebook:

InternalError: Failed to call ThenRnnBackward with model config: [rnn_mode, rnn_input_mode, rnn_direction_mode]: 2, 0, 0 , [num_layers, input_size, num_units, dir_count, max_seq_length, batch_size, cell_num_units]: [1, 19, 32, 1, 24, 32, 32]
[[{{node gradients/CudnnRNN_grad/CudnnRNNBackprop}}]]
[[PartitionedCall]] [Op:__inference_train_function_149648]

Function call stack:
train_function -> train_function -> train_function

Saduf2019 · 2020-08-04T15:21:40Z

@summa-code
This seems to be a duplicate of #41987, please confirm.

summa-code · 2020-08-04T16:42:49Z

Ah !!! Looks like it. My bad.. has been doing few things and did not track what i filed before. Yes same.

Saduf2019 · 2020-08-04T18:17:14Z

Moving to closed status as its a duplicate of #41987

google-ml-butler · 2020-08-04T18:17:16Z

Are you satisfied with the resolution of your issue?
Yes
No

ThinhNgVhust · 2021-10-21T12:57:14Z

TF_CUDNN_RESET_RND_GEN_STATE=1

this is help me

summa-code added the type:bug Bug label Aug 3, 2020

google-ml-butler bot assigned Saduf2019 Aug 3, 2020

Saduf2019 added the stat:awaiting response Status - Awaiting response from author label Aug 3, 2020

Saduf2019 removed the stat:awaiting response Status - Awaiting response from author label Aug 4, 2020

Saduf2019 added the stat:awaiting response Status - Awaiting response from author label Aug 4, 2020

Saduf2019 closed this as completed Aug 4, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

E tensorflow/stream_executor/dnn.cc:616] CUDNN_STATUS_EXECUTION_FAILED #41993

E tensorflow/stream_executor/dnn.cc:616] CUDNN_STATUS_EXECUTION_FAILED #41993

summa-code commented Aug 3, 2020

Saduf2019 commented Aug 3, 2020

applied-machinelearning commented Aug 3, 2020

summa-code commented Aug 3, 2020 •

edited

Saduf2019 commented Aug 4, 2020

summa-code commented Aug 4, 2020

Saduf2019 commented Aug 4, 2020

google-ml-butler bot commented Aug 4, 2020

ThinhNgVhust commented Oct 21, 2021

E tensorflow/stream_executor/dnn.cc:616] CUDNN_STATUS_EXECUTION_FAILED #41993

E tensorflow/stream_executor/dnn.cc:616] CUDNN_STATUS_EXECUTION_FAILED #41993

Comments

summa-code commented Aug 3, 2020

Saduf2019 commented Aug 3, 2020

applied-machinelearning commented Aug 3, 2020

summa-code commented Aug 3, 2020 • edited

Saduf2019 commented Aug 4, 2020

summa-code commented Aug 4, 2020

Saduf2019 commented Aug 4, 2020

google-ml-butler bot commented Aug 4, 2020

ThinhNgVhust commented Oct 21, 2021

summa-code commented Aug 3, 2020 •

edited