New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(custom-resources): inactive user function throws error that breaks provider framework #21095
Conversation
AWS CodeBuild CI Report
Powered by github-codebuild-logs, available on the AWS Serverless Application Repository |
@@ -112,6 +120,23 @@ async function invokeUserFunction<A extends { ResponseURL: '...' }>(functionArnE | |||
|
|||
const jsonPayload = parseJsonPayload(resp.Payload); | |||
if (resp.FunctionError) { | |||
if (resp.FunctionError.includes('Lambda is initializing your function')) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we know that this part of the error message isn't used in other errors? It wouldn't be a great experience if this caught other errors as well and then the customer would have to wait ~10 minutes to find out that their deployment failed. I think it would be better, on an error, to check the function state per Lambda Function States. If the function is inactive or pending, we can retry and/or look at StateReasonCode
to further narrow the retry criteria. If it's active or failed, we return the terminal result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 This is a much better solution to this than what I currently have, I will add this logic.
@@ -13,6 +13,14 @@ export = { | |||
[consts.FRAMEWORK_ON_TIMEOUT_HANDLER_NAME]: onTimeout, | |||
}; | |||
|
|||
const DEFAULT_DELAY = 10_000; | |||
const MAX_TOTAL_DELAY = 620_000; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious how you came to a little over ten minutes a the desired total delay.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lambdas run for 15, and I figured the function could conceivably do anything after this code runs, so it makes sense to have some headroom; 5 minutes seemed to be a decent amount of time. Not the most rigorous process, this can be changed if needed.
This was mistakenly opened, #20922 is the correct one. |
Lambda functions can go inactive after some time. Once inactive, lambda must reprovision resources for it, causing invocations to error out until the resources are available. This adds logic to retry the inactive function until it is ready to execute again.
Not merging until confirmed via manual testing that this fixes the issue (have to wait for a lambda to become inactive).
Closes #20123.
All Submissions:
Adding new Unconventional Dependencies:
New Features
yarn integ
to deploy the infrastructure and generate the snapshot (i.e.yarn integ
without--dry-run
)?By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license