Reduce File Size of Providers SDKs #11668

Open
7 tasks
RobbieMcKinstry opened this issue Dec 15, 2022 · 2 comments

Labels
area/codegen (SDK-gen, program-gen, convert); impact/performance (something is slower than expected); kind/engineering (work that is not visible to an external user); language/dotnet; language/go; language/java; language/python; language/yaml; size/L (estimated effort to complete: up to 10 days)

Comments

@RobbieMcKinstry
Contributor

RobbieMcKinstry commented Dec 15, 2022

Hello!

  • Vote on this issue by adding a 👍 reaction
  • If you want to implement this feature, comment to let us know (we'll work with you on design, scheduling, etc.)

Issue details

In several of our target languages, codegen can produce very large files. Large files can overwhelm IDEs: when IntelliJ or another editor opens a file, it usually buffers the entire file and starts a language server (LSP) to report possible errors. This can consume a lot of memory and make both editing and auto-completion slow.

There are two ways to think about this issue:

  1. Can we target the worst offenders directly?
  2. Can we put a max file size on generated code?

Lastly, we need to consider which languages are the biggest offenders and why.

  • It's well known that NodeJS has a problem with the types directory, which we're addressing. See the original issue and the tracking issue for the fix.
  • There are user reports that Go has a similar problem.
  • It's unclear if this is a problem for Dotnet, Python, Java, or YAML.

Each language has a unique directory structure, so the reason for large files is different for each language.

We should set a target SLO of 20MB maximum file size (based on customer discussion).

Discussion

Targeting the Worst Offenders: IMO we should carry on with the approach we're using for NodeJS. In Node, we found a case where the generated files are massive: in #8613 we identified that the types directory is 77MB, with types/input.ts at 30MB and types/output.ts at 37MB. That's unacceptably large, so we decided to split these files into smaller ones. A targeted approach lets us be more strategic about how we fix these performance bugs.

This is probably easy in Go, since we can split files arbitrarily within a package without affecting scope. Java and Dotnet might be the same, although we likely need to add extra import statements, which could be a challenge. Node may very well not be an issue.

Max File Size: I'm not too fond of this design. It would be challenging to generate meaningful names automatically. If we split along an arbitrary boundary, then we need some way to name both files. I'm not sure this is really a meaningful split. Splitting arbitrarily makes it harder to inspect the source since there's no rhyme or reason to the filesystem layout.
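To make the naming concern concrete, here is a minimal Python sketch of a size-capped split. It is an illustration only, not how Pulumi's codegen works: `split_by_max_size`, the `stem` and `ext` parameters, and the numbered file names are all hypothetical. It packs definitions greedily, in order, into files of at most `max_bytes`.

```python
from typing import Dict, List


def split_by_max_size(
    definitions: List[str],
    max_bytes: int,
    stem: str = "pulumiTypes",  # hypothetical base name for the output files
    ext: str = ".go",
) -> Dict[str, List[str]]:
    """Greedily pack definitions, in order, into files of at most max_bytes.

    Only the bytes of the definitions themselves are counted. A single
    definition larger than max_bytes still gets a file of its own.
    """
    files: Dict[str, List[str]] = {}
    current: List[str] = []
    current_size = 0
    for definition in definitions:
        size = len(definition.encode("utf-8"))
        if current and current_size + size > max_bytes:
            # The only name available here is a counter, which says
            # nothing about what the file contains.
            files[f"{stem}_{len(files) + 1:03d}{ext}"] = current
            current, current_size = [], 0
        current.append(definition)
        current_size += size
    if current:
        files[f"{stem}_{len(files) + 1:03d}{ext}"] = current
    return files
```

With three 10-byte definitions and a 20-byte cap, this produces `pulumiTypes_001.go` and `pulumiTypes_002.go`. Neither name tells a reader which types live in which file, and adding one definition near the start can shift every later definition into a different file.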

Action Items

We need to gather intel about which files are the biggest and why. Then, we can decide if we can be strategic or not.

Progress Tracker

  • Identify the biggest files...
    • ...in Go
    • ...in Python
    • ...in Dotnet
    • ...in Java
    • ...in YAML
  • Determine why those files are so big. This lets us decide if we can strategically target the worst offenders or if we need to do something more complex and general purpose.
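One possible way to start on the first tracker item is a small script that walks a checked-out SDK tree and reports the largest file for each file extension, as a rough proxy for "per language". This is a sketch under assumptions: the function name `largest_file_per_extension` is made up for illustration, and grouping by extension assumes each language's generated code uses its usual suffix (`.go`, `.py`, `.cs`, `.java`, `.ts`).

```python
import os
from typing import Dict, Tuple


def largest_file_per_extension(root: str) -> Dict[str, Tuple[int, str]]:
    """Map each file extension under root to its largest file as (bytes, path)."""
    largest: Dict[str, Tuple[int, str]] = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # skip symlinks so no file is counted twice
            ext = os.path.splitext(name)[1] or "(none)"
            size = os.path.getsize(path)  # exact size in bytes
            if ext not in largest or size > largest[ext][0]:
                largest[ext] = (size, path)
    return largest
```

Calling `largest_file_per_extension("sdk")` from a provider repository root (the `sdk` path is an assumption about the layout) gives one candidate per language to inspect when working out why a file is so big.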

Affected area/feature

Codegen for NodeJS, Dotnet, Go, Python, Java, and YAML.

RobbieMcKinstry added the impact/performance, language/python, language/go, language/dotnet, area/codegen, kind/engineering, size/L, language/yaml, and language/java labels on Dec 15, 2022
@RobbieMcKinstry
Contributor Author

Looking at Azure Native locally, no files in the Go SDK are larger than 20MB, which does not align with the customer report.

@lblackstone
Member

lblackstone commented Dec 15, 2022

I also checked the pulumi-kubernetes repo, and all of the Go SDK files were < 5 MB.

(First column is size in MB)

$ find sdk/go -type f -exec du -am {} + | sort -n -r | head -n 20
4	sdk/go/kubernetes/core/v1/pulumiTypes.go
1	sdk/go/kubernetes/yaml/yaml.go
1	sdk/go/kubernetes/yaml/transformation.go
1	sdk/go/kubernetes/yaml/configGroup.go
1	sdk/go/kubernetes/yaml/configFile.go
1	sdk/go/kubernetes/storage/v1beta1/volumeAttachmentPatch.go
1	sdk/go/kubernetes/storage/v1beta1/volumeAttachmentList.go
1	sdk/go/kubernetes/storage/v1beta1/volumeAttachment.go
1	sdk/go/kubernetes/storage/v1beta1/storageClassPatch.go
1	sdk/go/kubernetes/storage/v1beta1/storageClassList.go
1	sdk/go/kubernetes/storage/v1beta1/storageClass.go
1	sdk/go/kubernetes/storage/v1beta1/pulumiTypes.go
1	sdk/go/kubernetes/storage/v1beta1/init.go
1	sdk/go/kubernetes/storage/v1beta1/csistorageCapacityPatch.go
1	sdk/go/kubernetes/storage/v1beta1/csistorageCapacityList.go
1	sdk/go/kubernetes/storage/v1beta1/csistorageCapacity.go
1	sdk/go/kubernetes/storage/v1beta1/csinodePatch.go
1	sdk/go/kubernetes/storage/v1beta1/csinodeList.go
1	sdk/go/kubernetes/storage/v1beta1/csinode.go
1	sdk/go/kubernetes/storage/v1beta1/csidriverPatch.go
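
Note that `du -m` reports whole-megabyte blocks and rounds up, which is why every small file above shows as 1. For comparing against the 20MB target, exact byte sizes may be easier to reason about. A minimal Python sketch follows; the helper names `files_by_size` and `over_budget` are hypothetical, and reading "20MB" as 20 MiB is an assumption.

```python
import os
from typing import List, Tuple

# The 20MB target from the issue, interpreted as MiB (assumption).
TARGET_MAX_BYTES = 20 * 1024 * 1024


def files_by_size(root: str) -> List[Tuple[int, str]]:
    """Return (size_in_bytes, path) for every regular file under root, largest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                sizes.append((os.path.getsize(path), path))
    return sorted(sizes, reverse=True)


def over_budget(root: str, max_bytes: int = TARGET_MAX_BYTES) -> List[Tuple[int, str]]:
    """Return only the files strictly larger than max_bytes, largest first."""
    return [(size, path) for size, path in files_by_size(root) if size > max_bytes]
```

For example, `files_by_size("sdk/go")[:20]` mirrors the listing above with byte precision, and `over_budget("sdk/go")` returns an empty list when nothing exceeds the target.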
