Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Subtle issue representing null as unset value, resulting in deprecation warning #31730

Closed
lorengordon opened this issue Sep 1, 2022 · 20 comments · Fixed by aws-ia/terraform-aws-mwaa#39
Labels
bug v1.2 Issues (primarily bugs) reported against v1.2 releases

Comments

@lorengordon
Copy link
Contributor

Terraform Version

❯ terraform version
Terraform v1.2.8
on linux_amd64
+ provider registry.terraform.io/hashicorp/aws v4.28.0


### Terraform Configuration Files

```terraform
variable "enable_classiclink" {
  type    = bool
  default = null
}

resource "aws_vpc" "main" {
  cidr_block         = "10.0.0.0/16"
  enable_classiclink = var.enable_classiclink
}

Debug Output

not comfortable posting this

Expected Behavior

No deprecation warning should be displayed when the input is null, since null should be treated the same as unset.

Actual Behavior

Received the deprecation warning:

│ Warning: Argument is deprecated
│
│   with aws_vpc.main,
│   on main.tf line 8, in resource "aws_vpc" "main":
│    8:   enable_classiclink = var.enable_classiclink
│
│ With the retirement of EC2-Classic the enable_classiclink attribute has been deprecated and will be removed in a future version.

Steps to Reproduce

  1. terraform init
  2. terraform plan

Additional Context

Slack convo in #hangops: https://hangops.slack.com/archives/C0Z93TPFX/p1662060793011949?thread_ts=1662056311.323019&cid=C0Z93TPFX

References

No response

@lorengordon lorengordon added bug new new issue not yet triaged labels Sep 1, 2022
@lorengordon
Copy link
Contributor Author

If you set the option to null directly on the resource attribute, then there is no deprecation warning:

resource "aws_vpc" "main" {
  cidr_block         = "10.0.0.0/16"
  enable_classiclink = null
}

@lorengordon
Copy link
Contributor Author

Attempting to conditionally set enable_classiclink fails to work around it, and still generates the deprecation warning:

resource "aws_vpc" "main" {
  cidr_block         = "10.0.0.0/16"
  enable_classiclink = var.enable_classiclink == true ? true : null
}

@lorengordon
Copy link
Contributor Author

This also fails to avoid the deprecation warning:

variable "enable_classiclink" {
  type     = bool
  default  = false
  nullable = false
}

resource "aws_vpc" "main" {
  cidr_block         = "10.0.0.0/16"
  enable_classiclink = var.enable_classiclink ? true : null
}

@jbardin
Copy link
Member

jbardin commented Sep 1, 2022

Hi @lorengordon,

Thanks for reporting the issue. As you can see from the example where null is assigned directly, it's not likely a problem with the fact there is an assignment at all, rather the issue is that the value is coming from a variable which may not be known at the point of validation. This means the problem is on the provider side, rather than within Terraform core. The fact that there is a deprecation message with the warning also indicates the provider is creating the diagnostic output, as that text is not available within Terraform.

To be more specific, the problem lies in the legacy SDK, which is treating the fact that an unknown value is assigned to the attribute as a sign that the attribute is intended to be used. This is likely a fair assumption in most cases, since the warning is there to indicate that the attribute should not be used in the configuration at all. Do you have an example where assigning something to a deprecated field should not issue a warning, or where that warning is causing a problem?

@jbardin jbardin added waiting-response An issue/pull request is waiting for a response from the community and removed new new issue not yet triaged labels Sep 1, 2022
@lorengordon
Copy link
Contributor Author

I hate to at people, but I was mostly opening the issue at the prompting of @apparentlymart, and the discussion from the slack thread I linked.

I'm rather confused with the explanation, as there are no dynamic or unknown values here. The null assignment is entirely known. Why would a known null value be different when coming from a variable default or user input, vs when assigned directly to a resource attribute in the config?

It seems like these two things should behave exactly the same, when no variables are passed to the first config:

variable "enable_classiclink" {
  type    = bool
  default = null
}

resource "aws_vpc" "main" {
  cidr_block         = "10.0.0.0/16"
  enable_classiclink = var.enable_classiclink
}
resource "aws_vpc" "main" {
  cidr_block         = "10.0.0.0/16"
  enable_classiclink = null
}

@jbardin
Copy link
Member

jbardin commented Sep 1, 2022

OK, I see the confusion now. Config validation is done statically, which means variables are inherently unknown at that point since nothing has yet been evaluated.

Most validation is done within Terraform core, but deprecated attributes are not handled within Terraform, those are determined by the provider, so in this case the provider can only see that the attribute has been set in the config when the unknown variable is assigned. I would say the irregularity here is that we don't currently warn about the deprecated field assignment when null is assigned directly.

@lorengordon
Copy link
Contributor Author

lorengordon commented Sep 1, 2022

Gotcha, that makes more sense. It would certainly be rather nice to be able to avoid the deprecation warning just by passing null to the argument, but having consistent behavior would help reduce confusion either way, and make it more clear what users need to do here.

@crw crw removed the waiting-response An issue/pull request is waiting for a response from the community label Sep 2, 2022
@apparentlymart
Copy link
Member

apparentlymart commented Sep 2, 2022

Thanks for reporting this @lorengordon, and thanks @jbardin for looking into the details.

When we were discussing it (in HangOps Slack) I knew something wasn't right here but I wasn't sure what, so I asked Loren to open this issue so we would have a chance to look into it. Unfortunately I've been away from code most of the afternoon so I wasn't able to look deeper immediately, but I'm glad to see the confirmation that the deprecation warning logic does indeed still live inside the provider, since I thought that was true but wasn't sure if I'd forgotten a change of responsibilities at some point.

With this explanation, it seems like what we have here is really a bug in the Terraform SDK, rather than in Terraform Core: the usual rule for validation is that you skip validation if you receive an unknown value, and assume that you'll get an opportunity to validate some more in a later phase when Terraform Core has more information. The SDK here seems to be checking the deprecation warning even if the value is unknown, which causes it to report a false positive. If it had instead skipped that validation as expected, then the provider would've had an opportunity to catch it during the planning phase instead, once that input variable becomes a known null.

Unfortunately at first glance the SDK code seems to disprove that theory: https://github.com/hashicorp/terraform-plugin-sdk/blob/fa8d313665945816f6eb6c79182e4abdc1540504/helper/schema/schema.go#L1737-L1747

We can see it there first checking whether the value is wholly known, exactly as I would've expected. Seems like something more subtle is going on. Hopefully we can dig a little deeper to confirm it's the SDK doing this and then, if so, understand a bit more about why before forwarding this bug on to the SDK team. 😖

@kmoe
Copy link
Member

kmoe commented Sep 2, 2022

There are a couple of things going on here in the SDK.

Firstly, in the code posted by @apparentlymart, the conditional is the wrong way around. Currently it says that if the value is not wholly known, then the deprecation warning should be shown. This mistake seems to be due to a series of refactors over the years that changed the purpose of the conditional.

Secondly, even when this is corrected, later code at https://github.com/hashicorp/terraform-plugin-sdk/blob/fa8d313665945816f6eb6c79182e4abdc1540504/helper/schema/schema.go#L2352 generates the same deprecation warning diag without the unknown check.

The deprecation warning is therefore printed whether or not the value is unknown at time of validate.

hashicorp/terraform-plugin-sdk#1047 fixes this in the SDK. What does give me some pause here is that the test case in the SDK for "Unknown + Deprecation" explicitly specifies that the warning diag be generated. I don't know whether this was intentional, but if so, then it's possible the SDK does not intend to abide by the the usual rule for validation, that you skip validation if you receive an unknown value.


Minimal repro example

Version 3.3.0 of terraform-provider-random uses the legacy SDK and has a deprecated attribute.

Config

terraform {
  required_providers {
    random = {
      version = "3.3.0"
    }
  }
}

variable "password_num" {
  type = bool
  default = null
}

resource "random_password" "foo" {
  number = var.password_num
  length = 10
}

bflad added a commit to hashicorp/terraform-plugin-framework that referenced this issue Sep 2, 2022
…own values

Reference: hashicorp/terraform-plugin-sdk#1047
Reference: hashicorp/terraform#31730

This change is made to keep terraform-plugin-sdk and terraform-plugin-framework behaviors for deprecation warning handling in sync.
@jbardin
Copy link
Member

jbardin commented Sep 2, 2022

The idea of a deprecation warning in this context is not only to communicate that the attribute won't be used in the future by the provider, but also to indicate that having that attribute in the configuration will be invalid after a future release. Without the deprecation warning a user is more likely to be surprised after an update that the configuration is no longer valid at all.

If we are going to remove this check from the SDKs, perhaps it's a good time to more completely move the validation into core, where we can actually detect assignment vs usage.

@bryantbiggs
Copy link

The idea of a deprecation warning in this context is not only to communicate that the attribute won't be used in the future by the provider, but also to indicate that having that attribute in the configuration will be invalid after a future release.

Wouldn't this fall under a breaking change in the provider when the attribute is fully removed? Its not common for breaking changes in the providers to show up as warnings, but maybe I don't fully understand when something is slated to show a warning and when something is just intended to be handled during a breaking change docs notice.

@ewbankkit
Copy link
Contributor

The Terraform AWS Provider follows the documented best practices for attribute removal, so the removal of enable_classiclink will take place, as a documented breaking change, in a future (most likely the next) major version of the provider.

@apparentlymart
Copy link
Member

apparentlymart commented Sep 6, 2022

I think an important subtlety here is that this particular argument should only be unknown during validation. Once we get to the planning phase it will be known and so the provider will get a chance to validated it again and can return the warning there. That means that terraform validate alone would not raise it, but terraform plan (which includes an implicit validation call with more information) would raise it.

Although it isn't true in this case, in some cases a value isn't even known during planning. In that case, the provider gets its final chance to validate the object during the apply phase, and at that point all values are guaranteed to be known and so the warning definitely will get issued at that point.

So with that said: the rule is about returning warnings at the earliest stage when we have enough information to decide that they are applicable, not about skipping showing warnings altogether. Terraform will always try to show an error or warning at as early a stage as possible, but will return them during the apply phase as a last resort if it wasn't possible to catch them earlier.

It is of course true that if the argument is finally removed from the provider in a later release then stating it at all will become invalid at that point, even if set dynamically to null. It's an interesting tradeoff whether it's better to be noisy about it early so that shared module authors can remove the possibility to even set this argument; I can see pros and cons on both sides here. It's particularly annoying though when the module in question is a third-party module that the user doesn't directly control... in this case we'd presumably like to tell the module author that their module will be incompatible with a future version of the provider, but it's less helpful to bother every user of the module about it because they typically aren't empowered to respond to the warning anyway. It also means that the shared module must make a breaking change in advance of the provider making the breaking change in order to quiet the warning, which is... not ideal, I think. 🤔

@bryantbiggs
Copy link

bryantbiggs commented Sep 6, 2022

but it's less helpful to bother every user of the module about it because they typically aren't empowered to respond to the warning anyway. It also means that the shared module must make a breaking change in advance of the provider making the breaking change in order to quiet the warning

This is essential the crux of the issue. Users do not full understand what is controlled at the module level vs the provider level vs Terraform core, but the module is typically what they interact with directly. Users come to the module to voice their displeasure with the warning which then forces the author into a situation which you accurately described

  • Do we make a (technically) breaking change now and another later, so two breaking changes for the same, innocuous change (which, mind you, at the end of the day is just a warning in output - you cannot create a new resources under EC2-Classic)
  • Do we wait for the provider to make the change (will be some time till AWS fully removes) and force users into a larger breaking change (new major provider version which most likely brings in other breaking changes)
  • Do we knowingly break with convention to save the masses and keep users happy?

For this round, I went with the last option because that was the lesser of the evils this time - but next time the choice might not be so easy.

@apparentlymart
Copy link
Member

This is a tricky situation where we have a specific example of a problem at hand but any change we make to address it must necessarily be a general change that would affect all providers, because the deprecation mechanism belongs to the SDK rather than to the provider's own logic.

My personal take here is that the appropriate compromise is to change this validation rule to behave the same as all other validation rules (skipping emitting any errors or warnings for the unknown value) even though we acknowledge that this means that an author of a module like the one we're discussing here would not see the warning unless they explicitly test using a non-null enable_classiclink.

Module authors are already burdened with the expectation that they will test all of the different permutations of input variables that their module considers to be valid, and over time we will hopefully improve that situation with tooling to help automate such testing. If an author of this module tests with enable_classiclink set to either true or false, rather than to null, then they will see the warning as intended. If they don't test it, then they will ultimately find out when a subsequent major release of the provider finally removes the argument, but might then reasonably conclude that the input variable wasn't being widely used anyway and therefore isn't a big deal to remove.

End-users of a module do not need (or want) to see warnings or errors related to parts of a module they are not currently using, and more generally I think do not want to see warnings that they cannot directly act upon. In this specific case, that means that only users who have explicitly used the enable_classiclink variable would see the warnings related to it, and they can manage their own migration away from this deprecated feature by treating it as if it were a deprecated feature of the module itself rather than of the underlying provider, and change their module block to no longer set the argument.

However, this is only my perspective on it and I'd in particular be interested to hear from any authors of shared modules who feel concerned that they would not have seen the subsequent removal of the ClassicLink features coming without this warning and would therefore be upset if Terraform had not warned about it in this way. I do acknowledge that right now the automatic tooling for testing different settings of a module are lacking and that in particular "EC2-Classic" has not been available for some AWS accounts for some time and so some developers may not even be able to test this mechanism if they don't have a sufficiently old AWS account.

bflad added a commit to hashicorp/terraform-plugin-framework that referenced this issue Sep 6, 2022
…own values (#465)

Reference: hashicorp/terraform-plugin-sdk#1047
Reference: hashicorp/terraform#31730

This change is made to keep terraform-plugin-sdk and terraform-plugin-framework behaviors for deprecation warning handling in sync.
@lorengordon
Copy link
Contributor Author

End-users of a module do not need (or want) to see warnings or errors related to parts of a module they are not currently using, and more generally I think do not want to see warnings that they cannot directly act upon. In this specific case, that means that only users who have explicitly used the enable_classiclink variable would see the warnings related to it, and they can manage their own migration away from this deprecated feature by treating it as if it were a deprecated feature of the module itself rather than of the underlying provider, and change their module block to no longer set the argument.

That reasoning makes a lot of sense to me. If I'm not using that part of the module, then as a user I don't need to see the deprecation.

@bflad
Copy link
Member

bflad commented Sep 6, 2022

terraform-plugin-sdk v2.22.0 and terraform-plugin-framework v0.12.0, when released, will skip the deprecation warning diagnostic on unknown values. Practitioners will receive this behavior when providers with deprecated attributes update to those sdk/framework releases, cut their own release, and practitioners upgrade to that provider release with the sdk/framework dependency update.

@bryantbiggs
Copy link

thank you all - I really appreciate it!

@apparentlymart apparentlymart added the v1.2 Issues (primarily bugs) reported against v1.2 releases label Sep 16, 2022
@apparentlymart
Copy link
Member

From @bflad's comment it seems that there isn't really anything left to do in this repository in response to this issue.

Because each individual provider team will need to separately upgrade to a newer SDK version and make a release before this will take effect I cannot promise any particular time this will be addressed in general, but I can at least see that the latest release of hashicorp/aws depends on SDK v2.22.0 and should therefore have the aforementioned fix, and thus the specific argument we were discussing in this issue should at least be fixed after upgrading to latest.

Since we use issues in this repository only to track changes to the code in this repository, I'm going to close this issue. If you are seeing something similar to this problem in another provider then I suggest opening an issue in the repository for that provider. If you do so, please include a link to this issue so that the provider development team can see the potential solution of upgrading the SDK or framework dependency. Thanks!

@apparentlymart apparentlymart closed this as not planned Won't fix, can't repro, duplicate, stale Sep 28, 2022
@github-actions
Copy link

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 28, 2022
maiconrocha added a commit to maiconrocha/terraform-aws-mwaa-1 that referenced this issue Sep 25, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug v1.2 Issues (primarily bugs) reported against v1.2 releases
Projects
None yet
Development

Successfully merging a pull request may close this issue.

8 participants