Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(custom-resources): inactive user function throws error that breaks provider framework #21095

Closed
wants to merge 1 commit into from

Conversation

comcalvi
Copy link
Contributor

@comcalvi comcalvi commented Jul 11, 2022

Lambda functions can go inactive after some time. Once inactive, lambda must reprovision resources for it, causing invocations to error out until the resources are available. This adds logic to retry the inactive function until it is ready to execute again.

Not merging until confirmed via manual testing that this fixes the issue (have to wait for a lambda to become inactive).

Closes #20123.


All Submissions:

Adding new Unconventional Dependencies:

  • This PR adds new unconventional dependencies following the process described here

New Features

  • Have you added the new feature to an integration test?
    • Did you use yarn integ to deploy the infrastructure and generate the snapshot (i.e. yarn integ without --dry-run)?

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license

@gitpod-io
Copy link

gitpod-io bot commented Jul 11, 2022

@mergify mergify bot added the contribution/core This is a PR that came from AWS. label Jul 11, 2022
@github-actions github-actions bot added bug This issue is a bug. effort/small Small work item – less than a day of effort p1 labels Jul 11, 2022
@comcalvi comcalvi added the pr/do-not-merge This PR should not be merged at this time. label Jul 11, 2022
@aws-cdk-automation aws-cdk-automation requested a review from a team July 11, 2022 20:48
@comcalvi comcalvi added pr-linter/exempt-test The PR linter will not require test changes pr-linter/exempt-integ-test The PR linter will not require integ test changes labels Jul 11, 2022
@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildv2Project1C6BFA3F-wQm2hXv2jqQv
  • Commit ID: 85bf5ae
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@@ -112,6 +120,23 @@ async function invokeUserFunction<A extends { ResponseURL: '...' }>(functionArnE

const jsonPayload = parseJsonPayload(resp.Payload);
if (resp.FunctionError) {
if (resp.FunctionError.includes('Lambda is initializing your function')) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we know that this part of the error message isn't used in other errors? It wouldn't be a great experience if this caught other errors as well and then the customer would have to wait ~10 minutes to find out that their deployment failed. I think it would be better, on an error, to check the function state per Lambda Function States. If the function is inactive or pending, we can retry and/or look at StateReasonCode to further narrow the retry criteria. If it's active or failed, we return the terminal result.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 This is a much better solution to this than what I currently have, I will add this logic.

@@ -13,6 +13,14 @@ export = {
[consts.FRAMEWORK_ON_TIMEOUT_HANDLER_NAME]: onTimeout,
};

const DEFAULT_DELAY = 10_000;
const MAX_TOTAL_DELAY = 620_000;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious how you came to a little over ten minutes a the desired total delay.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lambdas run for 15, and I figured the function could conceivably do anything after this code runs, so it makes sense to have some headroom; 5 minutes seemed to be a decent amount of time. Not the most rigorous process, this can be changed if needed.

@comcalvi
Copy link
Contributor Author

This was mistakenly opened, #20922 is the correct one.

@comcalvi comcalvi closed this Jul 14, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug This issue is a bug. contribution/core This is a PR that came from AWS. effort/small Small work item – less than a day of effort p1 pr/do-not-merge This PR should not be merged at this time. pr-linter/exempt-integ-test The PR linter will not require integ test changes pr-linter/exempt-test The PR linter will not require test changes
Projects
None yet
3 participants