r/aws_iam_role: parallel IAM requests on timeout #15967
Labels
enhancement
service/iam
stale
Community Note
Terraform CLI and Terraform AWS Provider Version
2.66.0
Affected Resource(s)
Terraform Configuration Files
Any iam role resource
Debug Output
[DEBUG] [aws-sdk-go] DEBUG: Request iam/CreateRole Details:
[DEBUG] [aws-sdk-go] DEBUG: Send Request iam/CreateRole failed, attempt 0/25, error RequestError: send request failed
[DEBUG] [aws-sdk-go] DEBUG: Retrying Request iam/CreateRole, attempt 1
[DEBUG] [aws-sdk-go] DEBUG: Request iam/CreateRole Details:
[WARN] WaitForState timeout after 30s
[WARN] WaitForState starting 30s refresh grace period
[DEBUG] [aws-sdk-go] DEBUG: Send Request iam/CreateRole failed, attempt 1/25, error RequestError: send request failed
[DEBUG] [aws-sdk-go] DEBUG: Retrying Request iam/CreateRole, attempt 2
[DEBUG] [aws-sdk-go] DEBUG: Request iam/CreateRole Details:
[ERROR] WaitForState exceeded refresh grace period
[DEBUG] [aws-sdk-go] DEBUG: Request iam/CreateRole Details:
Panic Output
Expected Behavior
IAM role creation should succeed despite temporary IAM timeouts.
Actual Behavior
The previous iamconn.CreateRole() call is still running when the resource.Retry() timeout fires. In many cases this results in a second, overlapping creation attempt and, eventually, a failure in the plugin:
Error: Error creating IAM Role hello-world-ssm_role: EntityAlreadyExists: Role with name hello-world-ssm_role already exists.
status code: 409, request id: removed
on main.tf line 18, in resource "aws_iam_role" "ssm_role":
18: resource "aws_iam_role" "ssm_role"
Steps to Reproduce
terraform apply
Important Factoids
An issue requesting better timeout handling has already been filed against the Terraform Plugin SDK, but it has not received any attention.
We have been running a patched version of the Terraform Plugin SDK in production for several months with great success. However, the patch may be too crude to upstream, as it simply removes the parts of the timeout handling that we found to behave oddly.
References
terraform-plugin-sdk issue: hashicorp/terraform-plugin-sdk#530
terraform-plugin-sdk patch: hashicorp/terraform-plugin-sdk#529