Surface TLS errors to RPC errors #4163
Comments
Thanks for filing the issue. It would be better to return this error to the users.
When this happens, I would think the connection creation would fail due to a handshake error: grpc-go/internal/transport/http2_client.go Line 243 in 504caa9
If the error happens here, it will be surfaced to the users. |
It does not fail on grpc-go/internal/transport/http2_client.go Line 345 in 504caa9
It fails on the server - As you can see here, when client certificate is bad, server sends an alert to the client: grpc-go/internal/transport/http2_client.go Lines 1310 to 1314 in 504caa9
|
Yes, it would be good if we can surface this error (and other non-generic connection errors) to users via RPC errors (instead of just logs, as we've mostly historically done). Possibly by passing an grpc-go/internal/transport/http2_client.go Line 871 in 504caa9
However, for this particular situation, I don't think this will be enough - RPCs may not have been dispatched to this transport yet. To propagate this to future RPCs (assuming this is the only connection), we would need to make sure the error is sent to the LB policy. For that, Line 1108 in 504caa9
We may be able to do that by making the Line 1296 in 504caa9
There's a little more to this, still: if a connection error happens after we had a READY connection, we will immediately transition back to CONNECTING, so this will be lost. In this situation, we don't have a READY connection yet as the server preface was never received. So, we need to make sure the error given to Lines 1337 to 1339 in 504caa9
All in all, this is a pretty extensive change touching many layers, but should be mostly straightforward. Also, a reasonable end2end test ( @menghanl does that all sound sensible to you? @avalchev94 would you still be interested in attempting something like this, if so? |
@dfawley I am in. However, it will take time, because I need to get familiar with the project. Do you have a group chat where I can ask questions if I get stuck with the code? If not, is this an appropriate place for such communication? In addition, I would like to implement that small feature as well, because I believe it's really important to be able to check the error's type at the end, instead of comparing strings. |
@avalchev94 we don't have a chat for grpc-go that we use regularly; this thread should be fine. If you would like to chat in real time, one option would be to use our gitter channel. We don't usually poll that, so if you want to chat with us there, let us know first so we know to be on the lookout for messages. I'm not sure about the other feature request. I'll have to take a closer look. I'm not sure if that will be feasible, and while these two things are related, they are not dependent on one another. |
@avalchev94 : Are you still interested in working on this? |
This issue is labeled as requiring an update from the reporter, and no update has been received after 6 days. If no update is provided in the next 7 days, this issue will be automatically closed. |
With #4311, I don't think there is too much left to do here. |
Hey folks; we use mTLS quite heavily in our organization so we've run into this issue quite a bit as of late. Namely, instead of getting the TLS alert code, we just get It seems like the last piece after #4311 is to propagate the error further to ClientConn and/or LB policy. Is anyone working on this already? If not, I could take a stab at it - though I admit it may take me a while to get used to the guts of the transport/conn code 😄 |
None of us are working on this actively and there is little chance that someone might pick this up this quarter. So, please go ahead and take a shot at it. If you get stuck somewhere, feel free to reach out to us on this thread. |
Hi everyone. As @anitgandhi reported, we need the same feature so that we can display the TLS error on the client side. Is anyone working on this? I would also like to make a contribution, but it will take a while to go through the code and get used to it. |
@anitgandhi @srimaln91 |
I tried running our mTLS example where the client establishes an mTLS connection with the server. I used This is how the client logs look:
You can see the following WARNING log on the client:
And the actual error seen by the RPC:
And this is how I run the server and these are the logs:
Looks like we are surfacing the error on the client side and are also throwing logs on both ends. |
As @dfawley mentioned in the other comment, these changes on the client were made possible by #4311. @anitgandhi @srimaln91 @avalchev94 |
Thanks @easwars; admittedly maybe I should make this a separate issue, but in our org, the error we see is typically |
What version of gRPC are you running? Can you provide us with a way to reproduce the error that you are seeing? |
It's happening for me on both v1.48.0 and v1.50.1. I'll see if I can make a minimal reproducer program. |
@easwars @anitgandhi I already have a simple server and client program written to reproduce the issue. Expired certificates are already included and configured in the code. Please find the attached .zip file.
Run the server: go run cmd/server/main.go
Run the client: go run cmd/client/main.go |
Thank you @srimaln91 for the repro. What is happening is a little weird and I haven't gotten to the bottom of it yet. In our examples go.mod, we use a replace directive to get the local copy of grpc-go. When I remove the replace directive and make the examples use the latest grpc version, I see the same behavior. I see an error which looks like this:
|
We made some recent changes in how we handle transport creation failures. See: #5731 I tested with your repro, and did the following to update to the most recent commit on the grpc-go repo:
Now, I do see |
@easwars Thanks for the prompt update. I can see the error now.
|
I can confirm that with v1.51.0 the issue is resolved for us as well. Thank you so much! |
Wohoo !! Thank you all !! |
Hello, I was trying to check what error is returned from the server when the client's certificate is expired. The error I got was generic: `transport is closing`. At first I thought the idea was to not expose the error, for security reasons. However, after some research, I found that the std `tls` package sends alerts to the client; in this case the alert was `alertBadCertificate`. So far so good, but it turned out that the grpc http2 client disregards that alert and does not expose it to the end user. That's the code where the error/alert is disregarded:
After debugging, I found that at the line `frame, err := t.framer.fr.ReadFrame()` the alert is received as an error, but it is not used; the transport is just closed without any more info. Is that on purpose? If not, I am willing to contribute to the project and fix it.
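The behaviour described above, where the server's alert reaches the client as an ordinary read error, can be reproduced outside gRPC with the standard `crypto/tls` package. The sketch below is not the reporter's setup: it uses a throwaway self-signed server certificate and a client that presents no certificate at all (rather than an expired one), and `selfSignedCert` and `clientErrAfterRejection` are names made up for this example. With TLS 1.3, the client-side handshake usually completes before the server rejects the client, so the failure tends to appear on the first read. The exact error text depends on the Go version and on timing, so no output is shown.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// selfSignedCert creates a throwaway server certificate for the demo.
func selfSignedCert() (tls.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return tls.Certificate{}, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "demo-server"},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(time.Hour),
		KeyUsage:     x509.KeyUsageDigitalSignature,
		ExtKeyUsage:  []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth},
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return tls.Certificate{}, err
	}
	return tls.Certificate{Certificate: [][]byte{der}, PrivateKey: key}, nil
}

// clientErrAfterRejection starts a TLS server that requires a client
// certificate, connects without one, and returns the error the client sees.
func clientErrAfterRejection() error {
	cert, err := selfSignedCert()
	if err != nil {
		return err
	}
	ln, err := tls.Listen("tcp", "127.0.0.1:0", &tls.Config{
		Certificates: []tls.Certificate{cert},
		ClientAuth:   tls.RequireAnyClientCert,
	})
	if err != nil {
		return err
	}
	defer ln.Close()

	go func() {
		conn, err := ln.Accept()
		if err != nil {
			return
		}
		defer conn.Close()
		if tc, ok := conn.(*tls.Conn); ok {
			_ = tc.Handshake() // expected to fail: the client sends no certificate
		}
	}()

	// InsecureSkipVerify is acceptable only because this is a local demo.
	conn, err := tls.Dial("tcp", ln.Addr().String(), &tls.Config{InsecureSkipVerify: true})
	if err != nil {
		return err // the handshake itself failed (e.g. with older TLS versions)
	}
	defer conn.Close()

	// Bound the read so the demo cannot hang.
	_ = conn.SetReadDeadline(time.Now().Add(5 * time.Second))
	_, err = conn.Read(make([]byte, 1))
	return err
}

func main() {
	fmt.Println("client saw:", clientErrAfterRejection())
}
```

A caller that drops this error and only closes the connection would leave the user with a generic message, which matches the symptom reported in this issue.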