[Spanner] Server randomly returns "ServerException: INTERNAL: Received RST_STREAM with error code 2". #5473

Closed
taka-oyama opened this issue Aug 26, 2022 · 7 comments · Fixed by #5938
Labels
type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design.

Comments

@taka-oyama
Contributor

taka-oyama commented Aug 26, 2022

Here is a more detailed error.

Google\Cloud\Core\Exception\ServerException: {
    "message": "Received RST_STREAM with error code 2",
    "code": 13,
    "status": "INTERNAL",
    "details": []
} in /project/vendor/google/cloud-core/src/GrpcRequestWrapper.php:257
Stack trace:
#0 /project/vendor/google/cloud-core/src/GrpcRequestWrapper.php(194): Google\Cloud\Core\GrpcRequestWrapper->convertToGoogleException(Object(Google\ApiCore\ApiException))
#1 [internal function]: Google\Cloud\Core\GrpcRequestWrapper->handleStream(Object(Google\ApiCore\ServerStream))
#2 /project/vendor/google/cloud-spanner/src/Result.php(191): Generator->valid()
#3 [internal function]: Google\Cloud\Spanner\Result->Google\Cloud\Spanner\{closure}()
#4 /project/vendor/google/cloud-core/src/ExponentialBackoff.php(80): call_user_func_array(Object(Closure), Array)
#5 /project/vendor/google/cloud-spanner/src/Result.php(192): Google\Cloud\Core\ExponentialBackoff->execute(Object(Closure))
#6 [internal function]: Google\Cloud\Spanner\Result->rows()
....

We have been seeing this error for a few weeks now across various projects running various versions of google/cloud-spanner (including one that is running the latest v1.51.2).

I have not been able to reproduce this error since it happens randomly.

When the error occurs, it shows up in bulk within a span of a few seconds across different pods on K8s.
This error seems to always occur at the first query within a transaction.

Would it be possible to add a retry for this specific error here?

I'm suggesting this because google-cloud-go seems to be doing something similar.

I usually don't post issues until I have reproducible code but this has been affecting production for weeks, so I am eager to get some kind of solution to mitigate the error.

Also, does anyone here know what "error code 2" is?
Understanding it might help to better understand the error.
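
For reference, `RST_STREAM` is an HTTP/2 frame, and the HTTP/2 specification (RFC 7540, Section 7) assigns error code 2 to `INTERNAL_ERROR`. The lookup below is only a sketch of that table; the function name `http2ErrorName` is hypothetical and is not part of any Google library.

```php
<?php

/**
 * Maps an HTTP/2 error code (RFC 7540, Section 7) to its symbolic name.
 * Hypothetical helper, for illustration only.
 */
function http2ErrorName(int $code): string
{
    $names = [
        0x0 => 'NO_ERROR',
        0x1 => 'PROTOCOL_ERROR',
        0x2 => 'INTERNAL_ERROR',
        0x3 => 'FLOW_CONTROL_ERROR',
        0x4 => 'SETTINGS_TIMEOUT',
        0x5 => 'STREAM_CLOSED',
        0x6 => 'FRAME_SIZE_ERROR',
        0x7 => 'REFUSED_STREAM',
        0x8 => 'CANCEL',
        0x9 => 'COMPRESSION_ERROR',
        0xa => 'CONNECT_ERROR',
        0xb => 'ENHANCE_YOUR_CALM',
        0xc => 'INADEQUATE_SECURITY',
        0xd => 'HTTP_1_1_REQUIRED',
    ];

    // Codes outside the table are reported as UNKNOWN rather than guessed.
    return $names[$code] ?? 'UNKNOWN';
}

echo http2ErrorName(2), PHP_EOL; // prints "INTERNAL_ERROR"
```

The name says only that the peer hit an unexpected condition and reset the stream; it does not identify which component on the path did so.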

Thanks.

Environment details

  • OS: Alpine Linux 3.15
  • PHP version: 8.0.20
  • Package name and version:
    • google/laravel-spanner 1.51.2
    • grpc 1.48.0
    • protobuf 3.21.2
@taka-oyama taka-oyama changed the title [Spanner] Server returns "ServerException: INTERNAL: Received RST_STREAM with error code 2" randomly. [Spanner] Server randomly returns "ServerException: INTERNAL: Received RST_STREAM with error code 2". Aug 26, 2022
@taka-oyama
Contributor Author

I've contacted support and was informed that the Spanner team is aware of the issue and is working towards a fix.

Closing it for now.

@taka-oyama
Contributor Author

Unfortunately, this was not completely addressed, and we got a WONT FIX response from the Google support team.

So I believe this error is here to stay.

Would it be possible to add this error to the auto-retry mechanism below?

if (!($e instanceof AbortedException)) {
    throw $e;
}

Go's library already added a retry for all internal server errors recently (probably for the same reason).
googleapis/google-cloud-go#6699
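
Until the library retries this itself, an application-level wrapper is one possible stopgap. The sketch below is not the library's retry mechanism: `retryOnRstStream` and its parameters are hypothetical names, it matches on the exception message rather than on a Google exception class so that it stays self-contained, and it assumes the wrapped operation is safe to re-run as a whole (for example, the entire transaction callable rather than a single statement inside it).

```php
<?php

/**
 * Hypothetical helper: re-runs $operation when the thrown exception's message
 * mentions RST_STREAM, with exponential backoff between attempts.
 * In a real application the catch could be narrowed to
 * Google\Cloud\Core\Exception\ServerException, the class seen in the trace.
 */
function retryOnRstStream(callable $operation, int $maxAttempts = 3, int $baseDelayMs = 100)
{
    for ($attempt = 1; ; $attempt++) {
        try {
            return $operation();
        } catch (\Exception $e) {
            $isRstStream = strpos($e->getMessage(), 'RST_STREAM') !== false;

            // Rethrow unrelated errors, and give up once attempts are exhausted.
            if (!$isRstStream || $attempt >= $maxAttempts) {
                throw $e;
            }

            // Backoff: baseDelay, 2x baseDelay, 4x baseDelay, ...
            usleep($baseDelayMs * 1000 * (2 ** ($attempt - 1)));
        }
    }
}

// Usage with a stand-in operation that fails twice before succeeding.
$calls = 0;
$result = retryOnRstStream(function () use (&$calls) {
    $calls++;
    if ($calls < 3) {
        throw new \RuntimeException('Received RST_STREAM with error code 2');
    }
    return 'ok';
}, 3, 1);

echo $result, ' after ', $calls, ' attempts', PHP_EOL;
```

Whether a plain retry is enough, or the underlying connection also has to be re-created, is not something this sketch addresses.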

@taka-oyama taka-oyama reopened this Oct 3, 2022
@bshaffer bshaffer added the type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design. label Dec 13, 2022
@bshaffer
Contributor

bshaffer commented Dec 13, 2022

This seems to be a feature request to retry certain error responses. This behavior is in the works!

See googleapis/google-auth-library-php#359

@taka-oyama
Contributor Author

Thank you! Hope to see that get merged soon!

@taka-oyama
Contributor Author

The Java client added a fix for this as well.

googleapis/java-spanner#2111

@zeriyoshi

I too hope this response is incorporated ASAP.

@vishwarajanand
Contributor

We are clarifying internally whether simply adding a retry to the existing call is sufficient (an easy fix) or whether we need to re-create the connection (which will take longer).
