Fix #41630: include max_seq_length in cudnn descriptor cache key #41832
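The essence of the fix (which in TensorFlow lives in the C++ cuDNN RNN descriptor cache) can be illustrated with a hedged Python sketch. The `make_key` / `get_descriptor` names below are hypothetical stand-ins, not TensorFlow's actual API; the point is only that `max_seq_length` must be part of the cache key:

```python
# Hypothetical sketch of a descriptor cache keyed on RNN parameters.
# Before the fix, max_seq_length was missing from the key, so a call with
# the same other parameters but a different sequence length could wrongly
# reuse a cached descriptor built for another length.

_descriptor_cache = {}

def make_key(num_layers, hidden_size, input_size, batch_size,
             max_seq_length, direction, units):
    # Including max_seq_length in the key is the essence of the fix.
    return (num_layers, hidden_size, input_size, batch_size,
            max_seq_length, direction, units)

def get_descriptor(*params):
    key = make_key(*params)
    if key not in _descriptor_cache:
        # Stand-in for building a cuDNN sequence tensor descriptor.
        _descriptor_cache[key] = {"max_seq_length": key[4]}
    return _descriptor_cache[key]

# The two parameter sets from issue 41630 differ only in max_seq_length
# (74 vs 75) and must therefore map to two distinct cache entries.
d1 = get_descriptor(1, 2048, 2048, 1, 74, 2, 2048)
d2 = get_descriptor(1, 2048, 2048, 1, 75, 2, 2048)
assert d1 is not d2
```

With the old key (omitting `max_seq_length`), `d1` and `d2` would have been the same cached object, and the second step would run cuDNN with a descriptor built for the wrong sequence length.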
Conversation
We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (log in here to double-check)? If these were authored by someone else, they will need to sign a CLA as well and confirm that they're okay with these being contributed to Google. ℹ️ Googlers: Go here for more info.
CLAs look good, thanks! ℹ️ Googlers: Go here for more info.
Looks fine to me, but I'll wait for @kaixih to LGTM also.
Actually, can you please also add a test case?
I can't find any existing test file supporting what is in
I managed to add something as a new
I am wondering if it is possible to limit the test to the Python level, as in lstm_v2_test.py. All we need to do is design a simple training run, as in issue 41630, that uses [1, 2048, 2048, 1, 74, 2, 2048] in the first step (storing it in the cache) but [1, 2048, 2048, 1, 75, 2, 2048] in the second step. If this trains successfully on GPU, we can consider the test passed. What do you think?
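The Python-level test suggested above could look roughly like this sketch, assuming TensorFlow with Keras is available and using layer sizes scaled down from the issue's shapes for brevity; on a GPU the cuDNN LSTM kernel is selected automatically, so two steps with different sequence lengths exercise the descriptor cache:

```python
import numpy as np
import tensorflow as tf

# Small LSTM model; on a GPU, Keras picks the cuDNN kernel automatically
# when the layer uses its default activations.
model = tf.keras.Sequential([
    tf.keras.layers.LSTM(16, input_shape=(None, 8)),
    tf.keras.layers.Dense(4),
])
model.compile(optimizer="sgd", loss="mse")

def train_step(seq_len):
    # One training batch with the given sequence length.
    x = np.random.rand(2, seq_len, 8).astype(np.float32)
    y = np.random.rand(2, 4).astype(np.float32)
    return model.train_on_batch(x, y)

# The first step populates the descriptor cache with seq_len=74; the
# second uses seq_len=75 and must not reuse a stale cached descriptor.
train_step(74)
train_step(75)
```

If the second step completes without a cuDNN error, the cache key correctly distinguishes the two sequence lengths, which is exactly the pass criterion proposed above.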
I can look into that; it sounds like a better option than the current test I wrote.
Do you know if any magic is required to run the Python tests? I've followed
Have you tried to simply run
Even simpler:
Making progress:
With that, I can assert that it is going through the layers and reaches
However, changing
Even directly calling
Forcing eager execution using
And to the best of my knowledge, with current TensorFlow, since there is no more
Can we hope for a 1.15.4 release to include that fix?
CC @goldiegadde |
Gentle ping: can we hope for 1.15.4, or should we direct people hitting this issue to use the magic reset-state environment variable?
@sanjoy @kaixih @goldiegadde Gentle ping? Several commits have been pushed to r1.15, so it would be really nice to include this one if you are preparing a 1.15.4. I can send the PR if you'd like.
@lissyx Sorry for the delayed response; can you please open a PR against the 1.15 branch? cc @mihaimaruseac as well.
Apologies, I missed this myself too. Yes, please, let's open a PR against the branch |
No description provided.