Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simple terraform script always causes grpc: error while marshaling: string field contains invalid UTF-8 #17563

Closed
1 task done
xx364 opened this issue Jul 8, 2022 · 18 comments
Labels

Comments

@xx364
Copy link

xx364 commented Jul 8, 2022

Is there an existing issue for this?

  • I have searched the existing issues

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform Version

1.2.4

AzureRM Provider Version

3.13.0

Affected Resource(s)/Data Source(s)

azurerm_resource_group

Terraform Configuration Files

terraform {
  required_providers {
    azurerm = {
      source = "hashicorp/azurerm"
      version = "=3.13.0"
    }
  }
}

provider "azurerm" {
  features {}
}

# Create a resource group
resource "azurerm_resource_group" "my" {
    name     = "rgtest"
    location = "westeurope"
}

Debug Output/Panic Output

2022-07-08T17:56:33.002+0200 [TRACE] provider.terraform-provider-azurerm_v3.13.0_x5.exe: Calling downstream: @caller=github.com/hashicorp/terraform-plugin-go@v0.5.0/tfprotov5/tf5server/server.go:386 @module=sdk.proto tf_provider_addr=provider tf_req_id=69b4666b-dc3d-a7a0-39d9-966555e7bc7c tf_proto_version=5 tf_rpc=Configure timestamp=2022-07-08T17:56:33.002+0200
2022-07-08T17:56:33.002+0200 [INFO]  provider.terraform-provider-azurerm_v3.13.0_x5.exe: Testing if Service Principal / Client Certificate is applicable for Authentication..: timestamp=2022-07-08T17:56:33.002+0200
2022-07-08T17:56:33.003+0200 [INFO]  provider.terraform-provider-azurerm_v3.13.0_x5.exe: Testing if Multi Tenant Service Principal / Client Secret is applicable for Authentication..: timestamp=2022-07-08T17:56:33.002+0200
2022-07-08T17:56:33.003+0200 [INFO]  provider.terraform-provider-azurerm_v3.13.0_x5.exe: Testing if Service Principal / Client Secret is applicable for Authentication..: timestamp=2022-07-08T17:56:33.002+0200
2022-07-08T17:56:33.003+0200 [INFO]  provider.terraform-provider-azurerm_v3.13.0_x5.exe: Testing if OIDC is applicable for Authentication..: timestamp=2022-07-08T17:56:33.002+0200
2022-07-08T17:56:33.003+0200 [INFO]  provider.terraform-provider-azurerm_v3.13.0_x5.exe: Testing if Managed Service Identity is applicable for Authentication..: timestamp=2022-07-08T17:56:33.002+0200
2022-07-08T17:56:33.003+0200 [INFO]  provider.terraform-provider-azurerm_v3.13.0_x5.exe: Testing if Obtaining a Multi-tenant token from the Azure CLI is applicable for Authentication..: timestamp=2022-07-08T17:56:33.002+0200
2022-07-08T17:56:33.004+0200 [INFO]  provider.terraform-provider-azurerm_v3.13.0_x5.exe: Testing if Obtaining a token from the Azure CLI is applicable for Authentication..: timestamp=2022-07-08T17:56:33.002+0200
2022-07-08T17:56:33.004+0200 [INFO]  provider.terraform-provider-azurerm_v3.13.0_x5.exe: Using Obtaining a token from the Azure CLI for Authentication: timestamp=2022-07-08T17:56:33.002+0200
2022-07-08T17:56:35.950+0200 [TRACE] provider.terraform-provider-azurerm_v3.13.0_x5.exe: Called downstream: tf_proto_version=5 tf_provider_addr=provider tf_req_id=69b4666b-dc3d-a7a0-39d9-966555e7bc7c tf_rpc=Configure @module=sdk.proto @caller=github.com/hashicorp/terraform-plugin-go@v0.5.0/tfprotov5/tf5server/server.go:392 timestamp=2022-07-08T17:56:35.950+0200
2022-07-08T17:56:35.950+0200 [ERROR] plugin.(*GRPCProvider).ConfigureProvider: error="rpc error: code = Internal desc = grpc: error while marshaling: string field contains invalid UTF-8"
2022-07-08T17:56:35.951+0200 [TRACE] provider.terraform-provider-azurerm_v3.13.0_x5.exe: Served request: @module=sdk.proto tf_provider_addr=provider tf_req_id=69b4666b-dc3d-a7a0-39d9-966555e7bc7c tf_rpc=Configure @caller=github.com/hashicorp/terraform-plugin-go@v0.5.0/tfprotov5/tf5server/server.go:398 tf_proto_version=5 timestamp=2022-07-08T17:56:35.950+0200
2022-07-08T17:56:35.951+0200 [DEBUG] provider.terraform-provider-azurerm_v3.13.0_x5.exe: ERROR: 2022/07/08 17:56:35 [core] grpc: server failed to encode response:  rpc error: code = Internal desc = grpc: error while marshaling: string field contains invalid UTF-8
2022-07-08T17:56:35.951+0200 [ERROR] vertex "provider[\"registry.terraform.io/hashicorp/azurerm\"]" error: Plugin error
2022-07-08T17:56:35.952+0200 [TRACE] vertex "provider[\"registry.terraform.io/hashicorp/azurerm\"]": visit complete, with errors

Expected Behaviour

terraform plan should return without error

Actual Behaviour

│ Error: Plugin error

│ with provider["registry.terraform.io/hashicorp/azurerm"],
│ on test.tf line 10, in provider "azurerm":
│ 10: provider "azurerm" {

│ The plugin returned an unexpected error from plugin.(*GRPCProvider).ConfigureProvider: rpc error: code = Internal desc = grpc: error while marshaling: string field contains invalid UTF-8

Steps to Reproduce

terraform plan

Important Factoids

Running on Windows 10, script was a newly written file with current version of Visual Studio Code

References

No response

@xx364 xx364 added the bug label Jul 8, 2022
@github-actions github-actions bot removed the bug label Jul 8, 2022
@apparentlymart
Copy link
Member

apparentlymart commented Jul 9, 2022

Hi @xx364!

I'm not a maintainer of this provider so I'm just commenting here with some hopefully-helpful additional context from the perspective of Terraform Core and the provider plugin protocol.

This one is a bit puzzling to me because it seems to be originating in the guts of the gRPC client or server. The trace log seems to suggest that the request is at least making it into our own SDK codepaths, since we can see it reach a log line in terraform-plugin-go. That, combined with the fact that the message says "marshal" rather than "unmarshal", leads me to conclude that this error is arising when the gRPC server inside the provider is serializing the response to return to Terraform Core, rather than when the gRPC client in Terraform Core is serializing the request.

However, the only string values in the ConfigureProvider response for protocol version 5 are in the Diagnostic message, used for reporting errors and warnings. Those are converted to the protobuf message types and then returned to the caller, which is presumably the gRPC server code itself, inside the gRPC library. That led me to the place where this error message seems to originate. I think this is wrapping errInvalidUTF8 from the protobuf library, which seems to be used in some generated code that I didn't yet try to chase down.

All of this then has led me to wonder if there are circumstances where the Azure provider might return a diagnostic message whose summary or detail contains non-UTF-8 bytes, which none of our SDK layers catch but which get ultimately rejected by the protobuf codec before getting sent over the wire. Or maybe it is echoing back verbatim some error text generated by the underlying Azure SDK that itself isn't valid UTF-8?

That was as far as I was able to delve with the codebases I'm familiar with. I hope it's useful in further investigation!

(I wonder if it would be viable to include some extra checks in the protocol translation functions in terraform-plugin-go so that we could catch this problem a little earlier and hopefully be able to return a little more context about what exactly is invalid. That would amount to doing UTF-8 validation twice in the return path though, so I suppose we'd want to limit it only to string fields we expect to be relatively small. 🤔 )

@vnyaryan

This comment was marked as duplicate.

@horihel
Copy link

horihel commented Jul 15, 2022

we've been hit by the same problem, but could solve it by upgrading azure CLI to the latest version.

As the trace indicates that azurerm is calling out to Azure CLI just before the problem occurs, comparing AZcli versions might be useful.

We had it fail on azure-cli 2.27.2, upgrading to 2.38.0 fixed the problem.

@timmkrause
Copy link

timmkrause commented Aug 5, 2022

Experiencing the same issue with Azure CLI 2.39.0.
Downgraded to 2.38.0, same thing.

No difference between Windows Terminal (PowerShell) and CMD shell as well.

No clue what is going on.

@ananthayyaswamy
Copy link

ananthayyaswamy commented Aug 11, 2022

try after logging into
azure.
it worked for me!

@timmkrause
Copy link

timmkrause commented Aug 11, 2022

I was logged in (az account show) and still had the problem. In either case a good error mesage would help...

@marcelltoth
Copy link

I got the same problem today, hope this helps someone:

I was logged in per az account show, yet somehow logging in again (az login) fixed the problem.

@zeesyed
Copy link

zeesyed commented Nov 17, 2022

same issue today. @marcelltoth's comment resolved it for me.

@apparentlymart
Copy link
Member

Hi all,

I believe that provider versions v3.14.0 and later should include additional log messages in the debug output explaining what the provider was trying to return when it encountered this error. If anyone in this discussion can reproduce the problem on a newer version of the provider which has that additional logging, it may help to re-share the updated debug logs (equivalent of what's under "Debug Output/Panic Output" in the original issue above) which will hopefully expose the real underlying problem that is currently being masked by a protocol encoding error.

To obtain the debug output, see the Terraform documentation on debugging.

@chaubeyrahul391
Copy link

I got the same problem today, hope this helps someone:

I was logged in per az account show, yet somehow logging in again (az login) fixed the problem.

Thank you

@willdave865
Copy link

willdave865 commented Jan 25, 2023

I'm new to terraform so I don't know if this helps. But. I was getting the same error so came to this thread. I'm doing Microsoft tutorial

https://learn.microsoft.com/en-us/azure/developer/terraform/create-resource-group?tabs=azure-cli

Exactly same files/code. I forced the provider version in providers.tf to source = "hashicorp/azurerm" version = "3.40.0" Ran terraform init -upgrade Output as follows (note no additional content to the original error message)
Initializing provider plugins...
Finding hashicorp/random versions matching "~> 3.0"...
Finding hashicorp/azurerm versions matching "3.40.0"...
Using previously-installed hashicorp/random v3.4.3
Installing hashicorp/azurerm v3.40.0...
Installed hashicorp/azurerm v3.40.0 (signed by HashiCorp)

Error message:
terraform plan -out main.tfplan ╷ │ Error: Plugin error │ │ with provider["registry.terraform.io/hashicorp/azurerm"], │ on providers.tf line 20, in provider "azurerm": │ 20: provider "azurerm" { │ │ The plugin returned an unexpected error from plugin.(*GRPCProvider).ConfigureProvider: rpc error: code = Internal desc = grpc: error while marshaling: string field contains invalid UTF-8

Line 20 is provider "azurerm" { The beginning of an empty features block.

@apparentlymart
Copy link
Member

Hi @willdave865,

Could you try running that terraform plan command again with the TF_LOG=trace environment variable set, and then share the verbose logging you see in a GitHub Gist? (It'll typically be too long to include directly in a comment.)

I heard that there's some additional logging in recent versions of the provider that might help explain what real error is hiding behind this encoding error. The original writeup for this issue includes part of a trace log captured with an older version of the provider that didn't have the new log lines yet, and so seeing similar logs with the newer version you are running might give some new information about what's going on here.

Thanks!

@willdave865
Copy link

Hi @apparentlymart
Please find requested gist at https://gist.github.com/willdave865/8a0c82a6e40428801ba9c5834a05dbba

Hope this is of some help.

@apparentlymart
Copy link
Member

Thanks @willdave865!

Indeed, this trace log seems to contain exactly what I was hoping for:

2023-01-25T16:13:58.568+1300 [ERROR] provider.terraform-provider-azurerm_v3.40.0_x5.exe: Response contains error diagnostic: diagnostic_severity=ERROR tf_proto_version=5.3 tf_req_id=7fbc86ea-79cb-b0e2-0baf-ffa6c15c7081 @caller=github.com/hashicorp/terraform-plugin-go@v0.14.1/tfprotov5/internal/diag/diagnostics.go:55 diagnostic_detail= diagnostic_summary="building account: getting authenticated object ID: parsing json result from the Azure CLI: waiting for the Azure CLI: exit status 1: ERROR: AADSTS700082: The refresh token has expired due to inactivity.´┐¢The token was issued on 2021-12-13T17:51:05.8909751Z and was inactive for 90.00:00:00.

Trace ID: 29717cc9-5100-4cee-9720-60dbb8397900

Correlation ID: 56b2268d-faba-4cbd-841e-5e2f85d03f86

Timestamp: 2023-01-25 03:13:58Z
To re-authenticate, please run:
az login --scope https://graph.windows.net//.default" tf_provider_addr=provider tf_rpc=Configure @module=sdk.proto timestamp=2023-01-25T16:13:58.567+1300

This shows the error that the provider was trying to return, which as I suspected seems to just be passed on verbatim from the underlying SDK.

And inside that error message there's a sequence which my browser interpreted as the Unicode characters ´┐¢ but I assume that in memory was a non-UTF8 sequence of bytes, and so that's why the provider ended up accidentally returning a message that isn't valid for the plugin wire protocol, and thus got this generic error message instead.

We can also see here why some other folks above were able to get past this error message by running az login: that's exactly the next step this error message was trying to propose.


I notice that the latest release of the Azure provider at the time I'm writing this message (v3.40.0) is using the underlying library terraform-plugin-go at version v0.14.1 and so doesn't yet have the change I made in hashicorp/terraform-plugin-go#237 to try to make invalid messages like this still pass through to Terraform Core to show in the UI.

Given that this error seems correct -- the refresh token has expired -- and given that the exact text of this message seems outside the direct control of the Azure provider, my sense is that the answer here could be to upgrade terraform-plugin-go to v0.14.2 to get the benefit of that fix, and then hopefully the Azure provider should be able to return this slightly-invalid error message to Terraform Core and have it display correctly aside from replacing the ´┐¢ sequence with unicode replacement characters, ���, just so that it'll still be valid UTF-8 to satisfy the wire protocol.

As I mentioned back at the top, I'm not a member of this provider's development team so I'll leave the final call on this up to that team but I hope the above diagnosis is helpful!

@willdave865
Copy link

Just thought I'd point out that there have been other bug reports opened for this issue #13182, #12960. These predates this one by almost a year.

@rocky02441
Copy link

Hi Devs,

I just logout from azure cli then logged in again as mentioned below & It worked fine. for me

https://stackoverflow.com/questions/68973849/how-to-get-round-plugin-errors-when-deploying-azure-resources-using-terraform

@apparentlymart
Copy link
Member

apparentlymart commented Apr 13, 2023

I notice that #20114 upgraded the "terraform-plugin-go" library as a side-effect of its other changes, and so according to the release message in that PR the error message should now be correct from v3.41.0 onwards. I would expect it to now report that the token has expired, rather than reporting that the provider's error message is not encoded correctly.

However, I've not verified it because I don't work routinely with Azure and so I don't know how to set up the situation where this error message appears. If someone else could try it and report back then that could be helpful to confirm whether this is fixed or whether there's still something more to do.

@rcskosir
Copy link
Contributor

Thanks for taking the time to submit this issue. It looks like this has been resolved as of #20114. As such, I am going to mark this issue as closed.
If you believe it is not resolved, please share your version, config, and steps to reproduce, and I can reopen this issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests