hashicorp/tls provider claims it can plan destroy operations but fails when asked to plan destroy of tls_self_signed_cert #31820

apparentlymart · 2022-09-19T23:36:28Z

Terraform v1.3 is intending to introduce a new provider protocol capability where a provider which opts in will be asked to plan the destruction of any of its resource types. Previous versions of Terraform just always unilaterally generated a destroy plan on a provider's behalf, which prevented the provider from failing the plan with an error or generating warnings.

To avoid exposing existing provider implementations to a new situation they weren't designed to deal with, we designed this as an opt-in capability where the provider can report as part of its GetProviderSchema response that it supports the plan_destroy capability, in which case Terraform Core will call PlanResourceChange with a null new value in order to ask a provider to produce a destroy plan for any of its resource types.
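
As a rough illustration of that opt-in, here is a minimal Go sketch. It is not Terraform Core's actual implementation: `serverCapabilities`, `provider`, `planChange`, and `fakeProvider` are hypothetical stand-ins, and a nil map stands in for the null new value.

```go
package main

import "fmt"

// serverCapabilities is a hypothetical stand-in for the capabilities a
// provider reports in its GetProviderSchema response.
type serverCapabilities struct {
	PlanDestroy bool
}

// provider is a hypothetical, heavily simplified provider interface.
// A nil proposed object stands in for the null new value of a destroy plan.
type provider interface {
	Capabilities() serverCapabilities
	PlanResourceChange(prior, proposed map[string]string) (map[string]string, error)
}

// planChange models the opt-in: a destroy plan is only forwarded to the
// provider when the provider announced the plan_destroy capability.
func planChange(p provider, prior, proposed map[string]string) (map[string]string, error) {
	if proposed == nil && !p.Capabilities().PlanDestroy {
		// Older behavior: the destroy plan is generated on the provider's behalf.
		return nil, nil
	}
	return p.PlanResourceChange(prior, proposed)
}

// fakeProvider records how often it was asked to plan.
type fakeProvider struct {
	caps  serverCapabilities
	calls int
}

func (f *fakeProvider) Capabilities() serverCapabilities { return f.caps }

func (f *fakeProvider) PlanResourceChange(prior, proposed map[string]string) (map[string]string, error) {
	f.calls++
	return proposed, nil
}

func main() {
	prior := map[string]string{"id": "example"}

	legacy := &fakeProvider{}
	optedIn := &fakeProvider{caps: serverCapabilities{PlanDestroy: true}}

	// Both providers are asked to destroy the same object.
	planChange(legacy, prior, nil)
	planChange(optedIn, prior, nil)

	fmt.Println("legacy provider calls:", legacy.calls)
	fmt.Println("opted-in provider calls:", optedIn.calls)
}
```

In this sketch only the provider that announced the capability is ever called with a nil proposed object, which is the situation a provider's planning logic must be prepared for once it opts in.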

Unfortunately it seems that either the hashicorp/tls provider is already opting in to this capability (even though it hasn't appeared in any Terraform CLI release yet) or Terraform Core is asking the provider to plan its destroy despite the capability not being set. This fails for tls_self_signed_cert because its planning function is not equipped to deal with the proposed new object being null, and it fails like this:

2022-09-19T15:56:39.462-0700 [ERROR] provider.terraform-provider-tls_v4.0.2_x5: Response contains error diagnostic: diagnostic_severity=ERROR diagnostic_summary="Config Read Error" tf_req_id=8717c0f7-37c3-7a23-a00a-69385ca86623 tf_resource_type=tls_self_signed_cert tf_rpc=PlanResourceChange @caller=github.com/hashicorp/terraform-plugin-go@v0.14.0/tfprotov5/internal/diag/diagnostics.go:55 diagnostic_attribute=AttributeName("validity_end_time") diagnostic_detail="An unexpected error was encountered trying to read an attribute from the configuration. This is always an error in the provider. Please report the following to the provider developer:

Missing attribute value, however no error was returned. Preventing the panic from this situation." tf_provider_addr=registry.terraform.io/hashicorp/tls @module=sdk.proto tf_proto_version=5.3 timestamp=2022-09-19T15:56:39.462-0700

From an end-user standpoint that looks something like the following:

╷
│ Error: Config Read Error
│ 
│   with tls_self_signed_cert.user,
│   on tls-example.tf line 18, in resource "tls_self_signed_cert" "user":
│   18: resource "tls_self_signed_cert" "user" {
│ 
│ An unexpected error was encountered trying to read an attribute from the configuration.
│ This is always an error in the provider. Please report the following to the provider
│ developer:
│ 
│ Missing attribute value, however no error was returned. Preventing the panic from this
│ situation.
╵

The following configuration seems to reproduce this with terraform apply followed by terraform destroy:

resource "tls_private_key" "user" {
  algorithm = "RSA"
}

resource "tls_self_signed_cert" "user" {
  private_key_pem = tls_private_key.user.private_key_pem

  subject {
    common_name  = "example.com"
    organization = "ACME Examples, Inc"
  }

  early_renewal_hours   = 4
  validity_period_hours = 8
  allowed_uses = [
    "key_encipherment",
    "digital_signature",
  ]
  is_ca_certificate = true
}

I've not yet diagnosed the root cause of this bug. I have two possible explanations in mind here, and so the first step will be determining which of these is true.

1. The TLS provider is not announcing that it supports the plan_destroy capability but Terraform Core is asking it to plan destroy anyway. This would be the ideal situation because the fix would be localized in Terraform Core and so we can address it before shipping v1.3.0 final.
2. The TLS provider is announcing the plan_destroy capability even though it isn't actually capable of planning destroy. If this is true then the situation is messier because there's already at least one TLS provider release out there which announces it and so we'd likely need to change Terraform Core to ignore that incorrect announcement and use a different capability attribute to activate this feature instead. That would mean that any existing provider already shipped would never be asked to plan destroy, but later provider releases could still do so by opting in to the new capability.


apparentlymart · 2022-09-19T23:59:53Z

I've confirmed that the provider does seem to be opting in to being asked to plan destroy, although it's the plugin framework doing it on the provider's behalf: https://github.com/hashicorp/terraform-plugin-framework/blob/7541ab15654b00837015180ecdb7f439e604c6cf/internal/fwserver/server_getproviderschema.go#L29-L31

I initially thought we were lacking a check in Terraform Core but it turns out it was just one layer deeper than I expected:

terraform/internal/plugin/grpc_provider.go

Lines 422 to 426 in 6706d52

    
    if r.ProposedNewState.IsNull() && !capabilities.PlanDestroy {
        resp.PlannedState = r.ProposedNewState
        resp.PlannedPrivate = r.PriorPrivate
        return resp
    }

Since hashicorp/tls v4.0.2 is already released with this inconsistency in place (it announces that it supports planning destroy but it doesn't actually support planning destroy) I think we are faced with deciding between the following two options that both have annoying consequences:

1. We could ship with Terraform v1.3.0 as in the release candidate, accepting that anyone already using hashicorp/tls v4.0.2 will be blocked from destroying their certificates until there's a new provider version available which fixes this problem (either by not announcing that it can plan destroy or by correctly handling the destroy plan request).
2. We could retroactively renumber the plan_destroy capability to have a protobuf attribute number other than 1 -- and also potentially give it a new name in the schema to reduce confusion -- and thereby nullify the opt-in for any already released providers. This means that no providers already released would ever be asked to plan destroy, but once this inconsistency is addressed somehow providers can then opt in with the new capability flag instead and still get the benefit of this new feature.
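
The effect of renumbering the capability can be sketched with a toy model in Go. This is only an illustration under stated assumptions: `wireCapabilities` is a hypothetical map from field number to value rather than real protobuf decoding, and field number 2 is an arbitrary replacement picked for the example. Only the current number 1 comes from the discussion above.

```go
package main

import "fmt"

// wireCapabilities is a toy model of a decoded capabilities message: it maps
// a field number to the boolean value the provider sent for that field.
type wireCapabilities map[int]bool

// Hypothetical field numbers. 1 is the number plan_destroy currently has;
// 2 is an arbitrary replacement chosen for this sketch.
const (
	fieldPlanDestroyOld = 1
	fieldPlanDestroyNew = 2
)

// planDestroyEnabled reads the capability from the given field number.
// A field the provider never set reads as false, as an unset bool would.
func planDestroyEnabled(msg wireCapabilities, field int) bool {
	return msg[field]
}

func main() {
	// An already-released provider that sets the old field.
	released := wireCapabilities{fieldPlanDestroyOld: true}
	// A later release that opts in through the renumbered field.
	later := wireCapabilities{fieldPlanDestroyNew: true}

	// A Core that only consults the new field number ignores the old opt-in.
	fmt.Println("released provider opted in:", planDestroyEnabled(released, fieldPlanDestroyNew))
	fmt.Println("later provider opted in:", planDestroyEnabled(later, fieldPlanDestroyNew))
}
```

Under this model an already-released provider is treated as if it had never opted in, while a later release can still opt in through the new field.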

Right now I find myself leaning towards option 2 because it's something that can be handled totally within this codebase and avoids creating a hazard where an existing provider release isn't compatible with a new Terraform Core release, even though technically it's the provider that is "incorrect" here. However, we'll need to verify that such a change won't impact an already-released provider that does correctly implement destroy planning and will then have that support retroactively revoked from it. My sense is that the likelihood of this is low because there hasn't yet been any stable release of Terraform Core which supports provider-planned destroy, and that any existing provider relying on it would end up treating the final v1.3.0 release as if it were a v1.2.x release; providers must already be able to handle the situation where older versions of Terraform Core don't ask at all.

jbardin · 2022-09-20T13:54:09Z

The problem here appears to have been a bug in the provider framework which has since been patched. An invalid value was being passed to the plan modifier during destroy, so any attribute access within that value would result in the above error.
A patch release of the TLS provider is pending now.
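
To illustrate the failure mode (not the actual framework patch), here is a minimal Go sketch. All names in it are hypothetical: `object` models a planned resource object with nil standing in for the null value seen during destroy, and `readAttr`, `unguardedModifier`, and `guardedModifier` are invented for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// object is a hypothetical stand-in for a planned resource object; nil models
// the null planned object that a destroy plan carries.
type object map[string]string

// readAttr fails when asked to read from a null object, loosely mirroring the
// "Missing attribute value" error reported in this issue.
func readAttr(obj object, name string) (string, error) {
	if obj == nil {
		return "", errors.New("missing attribute value: " + name)
	}
	return obj[name], nil
}

// unguardedModifier reads an attribute unconditionally, so it fails when it
// is handed the null object of a destroy plan.
func unguardedModifier(planned object) error {
	_, err := readAttr(planned, "validity_end_time")
	return err
}

// guardedModifier returns early when there is no planned object to inspect.
func guardedModifier(planned object) error {
	if planned == nil {
		return nil // destroy plan: nothing to modify
	}
	_, err := readAttr(planned, "validity_end_time")
	return err
}

func main() {
	fmt.Println("unguarded:", unguardedModifier(nil))
	fmt.Println("guarded:", guardedModifier(nil))
}
```

The unguarded variant returns an error for a destroy plan while the guarded one does not, which is the kind of difference a provider author would see when any attribute access happens against the destroy-time value.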

At least within the HashiCorp associated organizations, the TLS provider appears to be the only one using this combination of functionality, so the problem is hopefully not widespread enough to warrant pulling the feature altogether.

apparentlymart · 2022-09-20T17:52:06Z

A few hours ago the maintainers of the hashicorp/tls provider released version v4.0.3, which uses plugin framework v0.13.0 instead of v0.11.1. Plugin framework v0.12.0 contained the change which fixed this problem, from hashicorp/terraform-plugin-framework#475.

Because of this provider bug, hashicorp/tls v4.0.2 is known to be incompatible with Terraform v1.3 and later. Anyone who has encountered a message like the one I mentioned in the original issue comment above can upgrade to provider version v4.0.3 or later to fix the problem.

As @jbardin noted, we don't know of any other providers that have this problem, but we cannot see into the source code of privately-maintained providers and so it is possible that such a provider may exhibit a similar problem. If so, the resolution would be to upgrade your provider's dependency to at least Terraform Plugin Framework v0.12.0. If that doesn't resolve the problem, please open an issue in the Terraform Plugin Framework repository where we can investigate further.

As far as we can tell there is no change required in this repository, since Terraform Core seems to be behaving correctly and the framework bug which caused this error has now been resolved. Therefore I'm going to close this issue.

github-actions · 2022-10-21T02:33:20Z

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

apparentlymart added the bug, providers/protocol (Potentially affecting the Providers Protocol and SDKs), and v1.3 (Issues (primarily bugs) reported against v1.3 releases) labels Sep 19, 2022

apparentlymart added this to the v1.3.0 milestone Sep 19, 2022

apparentlymart closed this as not planned Sep 20, 2022

github-actions bot locked as resolved and limited conversation to collaborators Oct 21, 2022
