feat(servicecatalog): Add Product Stack Asset Support #21508

imanolympic · 2022-08-08T16:49:58Z

Description

Draft pull request for the product stack asset support feature, as referenced by issue "(servicecatalog): ProductStack does not support Assets" #20690.
This PR depends on changes to the S3 deployment module. Those changes are included in this PR for now but will be moved to a separate PR soon. The related issue is "aws-s3-deployment: Transfer asset without extracting contents of .zip files" #8065.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license

gitpod-io · 2022-08-08T16:50:06Z

TheRealAmazonKendra

Before we can provide a meaningful review, there are a few things we need from this PR.

Firstly, I would like to see these split out. You can reference each PR in the other one to let us know that they're linked, but these are two issues that need their own review, not in an intertwined way.

Secondly, we need a passing build. The feedback essentially becomes void if it's against code that doesn't work. This is failing on a small issue, but it fails before ever getting to either of the two modules it covers, so we fundamentally don't know if this works.

Lastly, when updating integration tests, please update them to use the IntegTest construct.

imanolympic · 2022-08-08T21:26:34Z

Great, I will replace the unit test accounts with the dummy account value.

The reason we need an actual account id for the integration tests is that one of our constructs (product-stack-asset-bucket) creates an s3 bucket and the name of this bucket relies on the untokenized account id. Because the account id can't be a token, it is necessary for the account id to be defined in the stack environment. The user can do this in one of two ways:

explicitly - env: { account: 'xxxxxxxxxxxxx', region: 'us-east-1' }
implicitly - env: { account: process.env.CDK_DEFAULT_ACCOUNT, region: process.env.CDK_DEFAULT_REGION }

This being said, I did try referencing the account id implicitly. Even though I set up my local credentials, the integration test stack didn't pick up the local credentials (I am not sure why) and our new construct threw the following error: CDK Account ID must be defined in the application environment. This forced me to explicitly include the account id.

I imagine that even a dummy account id for the integration tests wouldn't help, since the names of our buckets in the CFN stack template change depending on the account defined in the environment. Moreover, CDK complained when my local credentials didn't match the account id defined in the environment, which leads me to believe that, in order to run integ tests, the local account credentials must match the environment account.

For the reasons above, it seems it may not be feasible to test this new feature in integration tests. What are your thoughts?

Pull request has been modified.

aws-cdk-automation · 2022-08-09T03:52:25Z

AWS CodeBuild CI Report

CodeBuild project: AutoBuildv2Project1C6BFA3F-wQm2hXv2jqQv
Commit ID: c31fad1
Result: FAILED
Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

mrgrain

This is a complex change and tricky to follow without significant domain knowledge. I've given the PR a first pass, but can you please share your design and considerations with us? Thank you.

mrgrain · 2022-08-09T11:17:18Z

packages/@aws-cdk/aws-s3-deployment/lib/bucket-deployment.ts

+   * If false, the file will remain zipped in the destination bucket.
+   * @default true
+   */
+  readonly unzipFile?: boolean;


#8065 discusses alternatives for deploying a zipped asset. Specifically #8065 (comment)

Have you considered alternatives and why have you chosen this approach?

Yes, a separate PR will be drafted for the s3-deployment changes that clarifies design considerations and alternatives. The feature addressed by this PR relies on these s3 changes and given that we are working against a deadline, we submitted this PR as a draft to get early feedback solely on the changes made to aws-servicecatalog. Hope this helps to clarify.

@mrgrain Just to clarify, are you asking for design considerations and alternatives for the entire PR or just for the S3 changes?

I would say the entire PR. This is a pretty big deviation from how we handle assets in the CDK. It looks like you are essentially creating a shadow assets mechanism that we will have to support going forward.

mrgrain · 2022-08-09T11:29:49Z

packages/@aws-cdk/aws-servicecatalog/lib/portfolio.ts

+      for (const accountId of this.sharedAccounts) {
+        bucket.grantRead(new iam.AccountPrincipal(accountId));
+      }


I'm not exactly sure what the purpose of this method is, but it might be easier/better to use a CompositePrincipal instead for the policy.

The goal is for the portfolio to collect a set of S3 Asset buckets and a list of shared accounts. Given that the portfolio does not have an onSynthesize method, we created an aspect that will take each bucket and grant read to all accounts. Essentially, in the scenario where the portfolio contains three asset buckets and two accounts, we would want the portfolio to iterate through each bucket and grant read permissions to the two accounts. If CompositePrincipal can eliminate that second for loop that would be great, will definitely look into it

How does this look? Hopefully it's more readable.

    for (const bucket of this.assetBuckets) {
      const compositeAccountPrincipals = new iam.CompositePrincipal(
        ...this.sharedAccounts.map((accountId) => new iam.AccountPrincipal(accountId)),
      );
      bucket.grantRead(compositeAccountPrincipals);
    }

I have a question here. We have the following scenario: our stack creates multiple portfolios, each containing one product. Each portfolio is shared with other accounts, but sometimes the products have exactly the same asset (a reusable Lambda in our case). Would that mean that a separate bucket is created for each portfolio?

Great question, this was actually the initial design goal - one product stack per portfolio. I considered working with the cdk bootstrap bucket but the idea of giving cross account access to a bucket containing all of the user's assets did not seem to adhere to best security practices.

Ideally, we would like to reduce the share scope as much as possible. Portfolios seemed like a great way to accomplish this until we realized the timeline of events would not allow for us to create an asset bucket in the portfolio and pass it down to the product stack. Why? The product stack needs the S3 bucket containing the assets at the time of creation, so that the product stack synthesizer is able to pass back the expected location of the asset in its public addFileAsset(_asset: cdk.FileAssetSource): cdk.FileAssetLocation method.

Portfolio constructor
ProductStack constructor
ProductStackSynthesizer::bind()
ProductStackSynthesizer::addFileAsset()
ProductStack::addFileAssetToParentSynthesizer() (this is where we need the product stack to know the name of the bucket)
CloudFormationProductStackTemplate constructor
CloudFormationProduct constructor
CloudFormationProductStackTemplate::bind()
Portfolio::addProduct()
Portfolio::shareWithAccount()
ProductStackSynthesizer::synthesize()

We could technically make the product stack asset bucket external facing and have the user create the bucket in their application stack. This would require allowing product stack props so that we can take the asset bucket as a prop and proceed from there. This approach was less favored by our team given that it would force users to interact with their product stack differently from the way they already do (if they want to use assets). Here is what this user experience would look like:

    export class HelloServerlessStack extends cdk.Stack {
      constructor(scope: Construct, id: string, props?: cdk.StackProps) {
        super(scope, id, props);

        // Create development bucket
        const devBucket = new ProductStackAssetBucket(this, "ProductStackAssetBucket");

        // Create a portfolio
        const portfolio = new sc.Portfolio(this, 'DevToolsPortfolio', {
          displayName: 'DevTools',
          providerName: 'IT',
          assetBucket: devBucket,
        });

        // Create a Lambda product from a Product Stack
        const product = new sc.CloudFormationProduct(this, 'HelloServerless', {
          productName: "HelloServerless",
          owner: "IT",
          productVersions: [
            {
              cloudFormationTemplate: sc.CloudFormationTemplate.fromProductStack(
                new HelloServerlessProduct(this, 'HelloServerlessProduct', { assetBucket: devBucket })),
              productVersionName: "v1",
              description: "A simple REST API backed by a hello Lambda function"
            },
          ],
        });
      }
    }

Actually, I like this approach. Would it be possible to combine your approach with the one you introduced here, but with the assetBucket property being optional? This would be a good solution for the use case my team and I are facing.
The other one would simply build over 100 buckets (soft quota).
I imagine we could create this AssetsBucket in a separate stack that would be deployed right before the portfolio stack, and in our case this would not be a security concern. Our current approach actually uses the cdk assets bucket, so I understand your considerations about security, because we have already had these discussions too.

With even more trickery, you could also return a Lazy.string() as bucketName in the Synthesizer, and copy in the real bucket name when the ProductStack is being synthesized in the context of a Portfolio.

(Btw. Lazy.string() will lead to every ProductStack instance only being usable in the context of a single Portfolio. Not sure how desirable or how bad that is, but you might want to add validation to make sure users don't get it wrong. Alternatively, Lazy.uncachedString() might help, but needs a test to confirm that it works if that's what you want to support)
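The caching concern raised above can be illustrated with a minimal sketch. It does not use the CDK: `DeferredString` is a hypothetical model of a lazily resolved value, and `synthesizeIn` plus the bucket names are made up for illustration. It only models the difference between a cached and an uncached producer, and says nothing about whether the real `Lazy.uncachedString()` behaves as desired inside a synthesizer.

```typescript
// Hypothetical model of a lazily resolved string; this is NOT the CDK's Lazy class.
class DeferredString {
  private cached: string | undefined;

  constructor(
    private readonly produce: () => string,
    private readonly cache: boolean,
  ) {}

  resolve(): string {
    if (!this.cache) {
      // Uncached: ask the producer again on every resolution.
      return this.produce();
    }
    if (this.cached === undefined) {
      // Cached: the first resolution wins and is reused afterwards.
      this.cached = this.produce();
    }
    return this.cached;
  }
}

// Bucket of the portfolio in whose context a product stack is currently synthesized.
let currentPortfolioBucket = 'unset';

// Hypothetical helper: resolve the deferred name while a given portfolio is "current".
function synthesizeIn(portfolioBucket: string, name: DeferredString): string {
  currentPortfolioBucket = portfolioBucket;
  return name.resolve();
}

const cachedName = new DeferredString(() => currentPortfolioBucket, true);
const uncachedName = new DeferredString(() => currentPortfolioBucket, false);

const cachedA = synthesizeIn('portfolio-a-assets', cachedName);
const cachedB = synthesizeIn('portfolio-b-assets', cachedName); // still 'portfolio-a-assets'
const uncachedA = synthesizeIn('portfolio-a-assets', uncachedName);
const uncachedB = synthesizeIn('portfolio-b-assets', uncachedName); // 'portfolio-b-assets'
```

In this model the cached variant silently keeps the first portfolio's bucket, which is the situation the suggested validation would need to catch.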

I considered working with the cdk bootstrap bucket but the idea of giving cross account access to a bucket containing all of the user's assets did not seem to adhere to best security practices.

Good call 😉. I like the approach you're taking here.

Portfolios seemed like a great way to accomplish this until we realized the timeline of events would not allow for us to create an asset bucket in the portfolio and pass it down to the product stack

In software, things are rarely as cut-and-dried as "it's not possible". Most things are possible (given an appropriate amount of work). You've identified the ideal customer experience already; now is the time to fight to see if you can achieve it.

Actually, I like this approach. Would it be possible to combine your approach with the one you introduced here, but with the assetBucket property being optional? This would be a good solution for the use case my team and I are facing. The other one would simply build over 100 buckets (soft quota). I imagine we could create this AssetsBucket in a separate stack that would be deployed right before the portfolio stack, and in our case this would not be a security concern. Our current approach actually uses the cdk assets bucket, so I understand your considerations about security, because we have already had these discussions too.

Great, thanks for the feedback. We changed our implementation to support the user passing in an optional product stack asset bucket. The user will also have the ability to pass in a name for the assets bucket (otherwise, we'll generate the name for you 🙂).

mrgrain · 2022-08-09T11:46:27Z

packages/@aws-cdk/aws-servicecatalog/lib/product-stack.ts

+  public _getAssetBucket(): IBucket | undefined {
+    return this.assetBucket;
+  }


Why not make the property public readonly instead?

If we make it public readonly we get the following error hint in the _setAssetbucket method:
Cannot assign to 'assetBucket' because it is a read-only property

Making this property public readonly would force us to initialize the property in the constructor, which we are unable to do directly. We are unable to do this because, by design, we want to conditionally define the assetBucket property only in the case where the user has a product stack that uses assets. If the user has a product stack without assets, we would like for the assetBucket property to stay as undefined.

Happy to provide further clarification

Why have this property here at all? It only seems to exist here to copy the value from the synthesizer up to the portfolio.

Can we skip the middle man in some way?

Correct, the ultimate goal here is to get the asset bucket to the portfolio, so that the portfolio is able to share its buckets with any account the portfolio is shared with. For context on what we can do, take a look at: #21508 (comment)

Under the scenario where the user creates the product stack asset bucket in the application stack, we can skip the middle man by having the user pass in the asset bucket as a prop to each product stack AND the portfolio. However, would this provide users with an ideal experience? In the case where the user passes in the asset bucket to the portfolio and not the product stacks (or vice versa), how could we detect this?

rix0rrr · 2022-08-10T13:39:41Z

packages/@aws-cdk/aws-servicecatalog/lib/private/product-stack-synthesizer.ts

+
+    if (!this.assetBucket) {
+      const parentStack = (this.stack as ProductStack)._getParentStack();
+      this.assetBucket = new ProductStackAssetBucket(parentStack, `ProductStackAssetBucket${hashValues(this.stack.stackName)}`);


Hashing this doesn't seem necessary. Why not have the product name here?

You've picked the scope of asset buckets to be individual ProductStacks. What's the reason for picking that over, say, an asset bucket per Portfolio? Given that the entire portfolio is shared with the same groups of accounts anyway, a bucket per portfolio seems more desirable?

You've picked the scope of asset buckets to be individual ProductStacks. What's the reason for picking that over, say, an asset bucket per Portfolio? Given that the entire portfolio is shared with the same groups of accounts anyway, a bucket per portfolio seems more desirable?

Check out this comment: #21508 (comment)

Hashing this doesn't seem necessary. Why not have the product name here?

Let's take a look at the following scenario:

    const product = new sc.CloudFormationProduct(this, 'HelloServerless', {
      productName: "HelloServerless",
      owner: "IT",
      productVersions: [
        {
          cloudFormationTemplate: sc.CloudFormationTemplate.fromProductStack(
            new ProductStack1(this, 'ProductStack1')),
          productVersionName: "v1",
          description: "A simple REST API backed by a hello Lambda function"
        },
        {
          cloudFormationTemplate: sc.CloudFormationTemplate.fromProductStack(
            new ProductStack2(this, 'ProductStack2')),
          productVersionName: "v2",
          description: "A simple REST API backed by a goodbye Lambda function"
        },
      ],
    });

If we hashed by product name, the product stack asset buckets for both ProductStack1 and ProductStack2 would share the same id and thus error out. To avoid this, we can hash by parent stack + product stack, which is what this.stack.stackName will give us.
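The collision described above can be reproduced with a small model that does not depend on the CDK. `ScopeModel` is a hypothetical stand-in for a construct scope that rejects duplicate child ids, and the stack names are invented for illustration. The sketch uses the raw key rather than a hash of it, since hashing does not change whether two keys collide.

```typescript
// Hypothetical stand-in for a construct scope: duplicate child ids are rejected.
class ScopeModel {
  private readonly childIds: string[] = [];

  addChild(id: string): void {
    if (this.childIds.indexOf(id) !== -1) {
      throw new Error(`There is already a construct with name '${id}'`);
    }
    this.childIds.push(id);
  }
}

interface ProductStackInfo {
  productName: string; // shared by every version of the same product
  stackName: string;   // assumed unique per product stack (parent stack + product stack)
}

// Two versions of one product, each backed by its own product stack.
const versions: ProductStackInfo[] = [
  { productName: 'HelloServerless', stackName: 'App-ProductStack1' },
  { productName: 'HelloServerless', stackName: 'App-ProductStack2' },
];

// Registers one asset bucket id per product stack; returns false on an id collision.
function tryRegister(keyOf: (info: ProductStackInfo) => string): boolean {
  const scope = new ScopeModel();
  try {
    for (let i = 0; i < versions.length; i++) {
      scope.addChild(`ProductStackAssetBucket${keyOf(versions[i])}`);
    }
    return true;
  } catch (e) {
    return false;
  }
}

const byProductName = tryRegister((info) => info.productName); // false: ids collide
const byStackName = tryRegister((info) => info.stackName);     // true: ids are unique
```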

rix0rrr · 2022-08-10T13:43:09Z

packages/@aws-cdk/aws-servicecatalog/lib/private/product-stack-asset-bucket.ts

+   * @internal
+   */
+  public addAsset(asset: cdk.FileAssetSource): cdk.FileAssetLocation {
+    const assetPath = './cdk.out/' + asset.fileName;


This is making too many assumptions.

Instead, I would hand the asset to the underlying stack's StackSynthesizer, and hand the result of that to the BucketDeployment (as a Source.s3()).
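A rough sketch of that delegation, written without the CDK so it stays self-contained: the interfaces below are simplified, hypothetical stand-ins for `cdk.FileAssetSource` and `cdk.FileAssetLocation`, and `AssetBucketModel`, `parentSynthesizer` and the bucket names are invented. In the aws-s3-deployment module, the source factory that takes a bucket and an object key is `Source.bucket(bucket, zipObjectKey)`, which is presumably what `Source.s3()` refers to.

```typescript
// Simplified stand-ins for the CDK's file asset shapes (the real types have more fields).
interface AssetSourceModel {
  fileName: string;
  sourceHash: string;
}

interface AssetLocationModel {
  bucketName: string;
  objectKey: string;
}

// Anything that can stage a file asset and report where it will live.
interface SynthesizerModel {
  addFileAsset(asset: AssetSourceModel): AssetLocationModel;
}

// Hypothetical parent-stack synthesizer: it alone decides the bucket and key layout.
const parentSynthesizer: SynthesizerModel = {
  addFileAsset(asset: AssetSourceModel): AssetLocationModel {
    return { bucketName: 'parent-stack-staging-bucket', objectKey: `${asset.sourceHash}.zip` };
  },
};

// Hypothetical asset bucket that copies from wherever the parent synthesizer staged the file.
class AssetBucketModel {
  // Locations a deployment step would later copy from (the role Source.bucket() would play).
  readonly pendingCopies: AssetLocationModel[] = [];

  constructor(
    readonly bucketName: string,
    private readonly parent: SynthesizerModel,
  ) {}

  addAsset(asset: AssetSourceModel): AssetLocationModel {
    // No './cdk.out/' assumption: the parent synthesizer reports the staged location.
    const staged = this.parent.addFileAsset(asset);
    this.pendingCopies.push(staged);
    // The product stack template references the copy in the shared asset bucket.
    return { bucketName: this.bucketName, objectKey: staged.objectKey };
  }
}

const assetBucketModel = new AssetBucketModel('product-stack-asset-bucket-example', parentSynthesizer);
const assetLocation = assetBucketModel.addAsset({ fileName: 'asset.abc123.zip', sourceHash: 'abc123' });
```

The point of the design is that the asset bucket never needs to know the output directory or file layout; it only consumes the location the parent synthesizer returns.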

rix0rrr · 2022-08-10T13:46:51Z

packages/@aws-cdk/aws-servicecatalog/lib/cloudformation-template.ts

+
+  /**
+  * The S3 bucket containing product stack assets.
+  * @default -


Yes please? :)

rix0rrr · 2022-08-10T13:57:47Z

packages/@aws-cdk/aws-servicecatalog/lib/product-stack.ts

+  public _getAssetBucket(): IBucket | undefined {
+    return this.assetBucket;
+  }


Why have this property here at all? It only seems to exist here to copy the value from the synthesizer up to the portfolio.

Can we skip the middle man in some way?

corymhall

Couple of questions:

Should there be a bucket per region? Like we do with cdk-assets?
What about encryption on buckets?
Should we think about garbage collection of old assets? If we eventually solve this for cdk-assets it seems like it will have to be solved here separately.

corymhall · 2022-08-10T14:52:43Z

packages/@aws-cdk/aws-servicecatalog/lib/private/product-stack-asset-bucket.ts

+    if (cdk.Token.isUnresolved(accountId)) {
+      throw new Error('CDK Account ID must be defined in the application environment');
+    }
+    return `product-stack-asset-bucket-${accountId}-${hashValues(id)}`;


Can we use the cloudformation intrinsic ${AWS::AccountId} instead? Do we need to know the bucketname at synth time?

Do you mind clarifying your first question? My understanding is that buckets share a global namespace and thus, the concept of regions doesn't necessarily apply.

As for garbage collection, the great thing about having a deterministic name for the bucket is that the user will be forced to delete the buckets if they destroy and then try to redeploy their stacks. This way, users avoid having S3 buckets pile up. That said, the only garbage collection concern is the case where a user destroys their stack (with the intention of never deploying it again) and forgets to delete the asset buckets on S3.

Can we use the cloudformation intrinsic ${AWS::AccountId} instead? Do we need to know the bucketname at synth time?

Yes, we are required to know the bucket name at synth time given that product stack synthesizer is required to return an expected location for the asset, and the entire appeal of product stacks is that they altogether skip deployment (they simply synthesize a template that gets uploaded to the CDK bootstrap bucket of the user).

This being said, we tried using ${AWS::AccountId} but the issue with doing so is that it contains a token rather than the explicit account id string. At deploy time, the token is resolved and the bucket is created. The issue comes before that, when the product stack synthesizer tries to return the expected asset location. Instead of returning product-stack-asset-bucket-1234 for the bucket name, it will return product-stack-asset-bucket-${TOKEN[Bucket.Name.1234]}
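The synth-time constraint can be illustrated with a minimal sketch that does not use the CDK itself. `looksUnresolved` is a hypothetical stand-in for `cdk.Token.isUnresolved`, `shortHash` is a hypothetical stand-in for the PR's `hashValues` helper, and the token string format is assumed purely for illustration.

```typescript
// Hypothetical stand-in for cdk.Token.isUnresolved(): any string carrying a
// "${Token[...]}" marker is treated as a value only known at deploy time.
function looksUnresolved(value: string): boolean {
  return /\$\{Token\[[^\]]*\]\}/.test(value);
}

// Hypothetical stand-in for the PR's hashValues() helper: a small deterministic
// string hash rendered as 8 hex characters.
function shortHash(input: string): string {
  let hash = 5381;
  for (let i = 0; i < input.length; i++) {
    hash = (hash * 33 + input.charCodeAt(i)) >>> 0;
  }
  return ('00000000' + hash.toString(16)).slice(-8);
}

// The name must be a concrete string at synth time, because the product stack
// template has to embed the final asset location.
function assetBucketName(accountId: string, id: string): string {
  if (looksUnresolved(accountId)) {
    throw new Error('CDK Account ID must be defined in the application environment');
  }
  return `product-stack-asset-bucket-${accountId}-${shortHash(id)}`;
}

// A concrete account id yields a usable, deterministic name.
const concreteName = assetBucketName('123456789012', 'MyProductStack');

// A tokenized account id cannot be turned into a name during synthesis.
let tokenRejected = false;
try {
  assetBucketName('${Token[AWS.AccountId.4]}', 'MyProductStack');
} catch (e) {
  tokenRejected = true;
}
```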

Do you mind clarifying your first question? My understanding is that buckets share a global namespace and thus, the concept of regions doesn't necessarily apply.

A global namespace does not mean that it is a global service. If I publish the assets to a bucket in us-east-1, but I consume the assets in a stack deployment in us-east-2 I've created a cross region dependency. What if us-east-1 is experiencing an outage and I need to deploy my failover stack in us-east-2? I won't be able to deploy because I have a dependency on us-east-1.
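One way to avoid the cross-region dependency described here is to include the region in the bucket name, so that each deployment region reads from its own bucket. The sketch below only shows the naming side of that idea; the function name, the name format and the suffix are assumptions for illustration, not what the PR implements.

```typescript
// Hypothetical naming helper: one asset bucket per account AND region.
function regionalAssetBucketName(accountId: string, region: string, suffix: string): string {
  return `product-stack-asset-bucket-${accountId}-${region}-${suffix}`;
}

// A stack deployed in us-east-2 would consume assets from the us-east-2 bucket,
// so an outage in us-east-1 would not block a failover deployment.
const primaryBucket = regionalAssetBucketName('123456789012', 'us-east-1', 'abcd1234');
const failoverBucket = regionalAssetBucketName('123456789012', 'us-east-2', 'abcd1234');
```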

As for garbage collection...

Maybe I misunderstand how this feature works. I'm talking more about garbage collection for the assets themselves. When I change the lambda code and deploy, does the new asset go to the same bucket? Over time this might lead to a lot of outdated assets.

padaszewski · 2022-08-27T13:11:25Z

Hi @imanolympic
Is there any progress on this one?

imanolympic · 2022-08-27T15:53:43Z

Hi @imanolympic Is there any progress on this one?

Yes, we have split this PR into two. The S3 PR will be out this week, and the SC PR will follow promptly. On a good note, we managed to find a way to support both the user passing in a product stack asset bucket and SC creating it in the background.

aws-cdk-automation · 2022-08-30T17:27:18Z

This PR has been in the BUILD FAILING state for 3 weeks, and looks abandoned. To keep this PR from being closed, please continue work on it. If not, it will automatically be closed in a week.

aws-cdk-automation · 2022-09-06T20:14:06Z

This PR has been deemed to be abandoned, and will be automatically closed. Please create a new PR for these changes if you think this decision has been made in error.

feat(servicecatalog): Add Product Stack Asset Support

dcc5500

github-actions bot added the p2 label Aug 8, 2022

aws-cdk-automation requested a review from a team August 8, 2022 16:50

TheRealAmazonKendra previously requested changes Aug 8, 2022

Update integ.product.ts

434fed8

aws deleted a comment from imanolympic Aug 8, 2022

aws deleted a comment from aws-cdk-automation Aug 8, 2022

Theron Mansilla added 2 commits August 8, 2022 23:42

feat(servicecatalog): Add Product Stack Asset Support

2bbd126

Merge branch 'AssetSupport' of github.com:imanolympic/aws-cdk into AssetSupport

c31fad1

mrgrain requested changes Aug 9, 2022

rix0rrr requested changes Aug 10, 2022

corymhall suggested changes Aug 10, 2022

wanjacki mentioned this pull request Sep 5, 2022

aws-s3-deployment: Transfer asset without extracting contents of .zip files #8065

Closed

2 tasks

aws-cdk-automation added the closed-for-staleness This issue was automatically closed because it hadn't received any attention in a while. label Sep 6, 2022

aws-cdk-automation closed this Sep 6, 2022
