docs(install): instructions to reduce pydantic package size #1077

Merged
merged 2 commits into from Apr 12, 2022


Dunedan
Contributor

@Dunedan Dunedan commented Mar 16, 2022

Issue, if available: #1078

Pydantic can be installed without binary files. This significantly
reduces the additional size it adds to the compressed Lambda packages
(down to 2MB from 25MB) at the expense of 30%-50% of its performance.
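
For anyone scripting this, here is a minimal Python sketch of the install described above. It only assembles the pip invocation and its environment. The `--no-binary pydantic` flag and the `SKIP_CYTHON=1` variable are the ones used in the commands further down in this thread; the function name `pip_install_command` and its parameters are illustrative assumptions, not part of Powertools or pydantic.

```python
import os
import sys


def pip_install_command(target_dir, without_binaries=False):
    """Return (argv, env) for installing Powertools with Pydantic into target_dir.

    Hypothetical helper for illustration; nothing is installed until the
    returned command is actually executed.
    """
    argv = [sys.executable, "-m", "pip", "install", "--quiet", "--target", target_dir]
    env = dict(os.environ)
    if without_binaries:
        # --no-binary makes pip build pydantic from its source distribution
        # instead of downloading a prebuilt wheel.
        argv += ["--no-binary", "pydantic"]
        # Assumption taken from this thread: SKIP_CYTHON=1 keeps the pydantic
        # source build from compiling its modules.
        env["SKIP_CYTHON"] = "1"
    argv.append("aws-lambda-powertools[pydantic]")
    return argv, env
```

Passing the result to `subprocess.run(argv, env=env, check=True)` would perform the install; that step needs network access.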

While at it, I also updated the number of megabytes Pydantic adds to the
package, as it wasn't clear whether the stated number referred to compressed
or uncompressed package size, and I couldn't reproduce the 75MB either
way.

Checklist

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.


@pull-request-size pull-request-size bot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label Mar 16, 2022
@boring-cyborg boring-cyborg bot added the documentation Improvements or additions to documentation label Mar 16, 2022
@boring-cyborg

boring-cyborg bot commented Mar 16, 2022

Thanks a lot for your first contribution! Please check out our contributing guidelines and don't hesitate to ask whatever you need.

@codecov-commenter

codecov-commenter commented Mar 17, 2022

Codecov Report

Merging #1077 (fe98327) into develop (1394d00) will increase coverage by 0.00%.
The diff coverage is n/a.

@@           Coverage Diff            @@
##           develop    #1077   +/-   ##
========================================
  Coverage    99.96%   99.96%           
========================================
  Files          119      119           
  Lines         5373     5376    +3     
  Branches       613      613           
========================================
+ Hits          5371     5374    +3     
  Partials         2        2           
Impacted Files Coverage Δ
aws_lambda_powertools/logging/utils.py 100.00% <0.00%> (ø)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@heitorlessa
Contributor

Hey @Dunedan, thanks a lot for this! I've just edited your description to refer to an existing issue about the new option. The MB size was the uncompressed size we measured back when the Parser feature was added (over a year ago).

We're currently prioritizing bugs only given our bandwidth, so we might have delays to review, reproduce, and get this merged.

@am29d it'd be great to have this tested and automatically included in our Lambda Layer and SAR App so customers don't have to.

@am29d
Contributor

am29d commented Mar 24, 2022

Hey @Dunedan, thanks for the contribution! I have compared both installations uncompressed, and the difference is 9.8MB, which is more than a 10% package size improvement (88MB vs 79MB).

Overall, I was surprised that our package size for extras went down to 88MB. The docs still say 22MB (compressed) and 155MB (uncompressed). Can we also update this detail to 12MB and 79MB?

@Dunedan
Contributor Author

Dunedan commented Mar 24, 2022

@am29d Which "installations" did you compare? I believe I'm doing something different, as the sizes I get are completely different:

$ python3.10 -mvenv empty-venv
$ zip -qr empty-venv.zip empty-venv/
$
$ python3.10 -mvenv powertools-wo-pydantic
$ ./powertools-wo-pydantic/bin/pip install -q aws-lambda-powertools
$ zip -qr powertools-wo-pydantic.zip powertools-wo-pydantic/
$
$ python3.10 -mvenv powertools-with-pydantic
$ ./powertools-with-pydantic/bin/pip install -q aws-lambda-powertools[pydantic]
$ zip -qr powertools-with-pydantic.zip powertools-with-pydantic/
$
$ python3.10 -mvenv powertools-with-pydantic-wo-binary
$ SKIP_CYTHON=1 ./powertools-with-pydantic-wo-binary/bin/pip install -q --no-binary pydantic aws-lambda-powertools[pydantic]
$ zip -qr powertools-with-pydantic-wo-binary.zip powertools-with-pydantic-wo-binary/
$
$ du -h --max-depth=1 .
19M     ./empty-venv
98M     ./powertools-wo-pydantic
143M    ./powertools-with-pydantic
102M    ./powertools-with-pydantic-wo-binary
532M    .
$
$ du -h *.zip
19M     empty-venv.zip
43M     powertools-wo-pydantic.zip
68M     powertools-with-pydantic.zip
45M     powertools-with-pydantic-wo-binary.zip

@am29d
Contributor

am29d commented Mar 24, 2022

Interesting, I use -t target param to install the package and the dependencies into the dedicated folder. We use the same method to package the layer in SAR, public layer, and the cdk construct. It looks something like:

pip install -t powertools aws-lambda-powertools

Here is the result of the folders and zip files:

➜ du -h -d=1 powertools*
 88M	powertools-with-pydantic
 79M	powertools-with-pydantic-wo-binaries
 75M	powertools-wo-pydantic
 16M	powertools-with-pydantic.zip
 13M	powertools-with-pydantic-wo-binaries.zip
 12M	powertools-wo-pydantic.zip

In your case, even if we subtract the venv size from the module installation, there still remains something that is much bigger. Now I am curious whether we are missing anything in our layer that we have not discovered yet 🤔.

EDIT:

I wanted to rule out OSX dependency and ran the same test on AL2, and the results are weirdly different:

120M    ./powertools-with-pydantic
83M     ./powertools-with-pydantic-wo-binaries
79M     ./powertools-wo-pydantic
24M     powertools-with-pydantic.zip
12M     powertools-wo-pydantic.zip
13M     powertools-with-pydantic-wo-binaries.zip

@Dunedan
Contributor Author

Dunedan commented Mar 24, 2022

Interesting, I use -t target param to install the package and the dependencies into the dedicated folder.

Yes, of course, that makes sense.

I wanted to rule out OSX dependency and ran the same test on AL2, and the results are weirdly different:

Ah, so you ran it once on macOS and once on arm64, while I ran it on Linux running on amd64. That explains the differences, as the size of binary files (in this case the ones shipped with pydantic) differs between operating systems and CPU architectures.
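
To see how much of an install directory is made up of compiled binaries on a given platform, something like the following Python sketch could help. It sums the size of files that look like extension modules. The helper name `compiled_extension_bytes` and the suffix list (`.so`, `.pyd`) are assumptions for illustration; other platforms may use further suffixes.

```python
import os


def compiled_extension_bytes(root):
    """Sum the size in bytes of files under root that look like compiled extensions."""
    # Assumed suffixes: ".so" on Linux/macOS builds, ".pyd" on Windows builds.
    suffixes = (".so", ".pyd")
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffixes):
                total += os.path.getsize(os.path.join(dirpath, name))
    return total
```

Running it against the same `pip install -t` directory built on two platforms should show whether the binaries account for the size gap.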

Here are my sizes using pip install -t:

$ pip install -t powertools-wo-pydantic -q aws-lambda-powertools
$ zip -qr powertools-wo-pydantic.zip powertools-wo-pydantic/

$ pip install -t powertools-with-pydantic -q aws-lambda-powertools[pydantic]
$ zip -qr powertools-with-pydantic.zip powertools-with-pydantic/

$ SKIP_CYTHON=1 pip install -t powertools-with-pydantic-wo-binary -q --no-binary pydantic aws-lambda-powertools[pydantic]
$ zip -qr powertools-with-pydantic-wo-binary.zip powertools-with-pydantic-wo-binary/

$ du -h --max-depth=1 .
79M     ./powertools-wo-pydantic
124M    ./powertools-with-pydantic
83M     ./powertools-with-pydantic-wo-binary
334M    .

$ du -h --max-depth=1 *.zip
12M     powertools-wo-pydantic.zip
25M     powertools-with-pydantic.zip
13M     powertools-with-pydantic-wo-binary.zip
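
The same comparison can be approximated from Python, which may be handy on systems where `du` and `zip` behave differently. This is a rough sketch under assumptions: the function names are made up for illustration, `du` reports disk usage in blocks rather than exact file sizes, and the `zip` tool may compress slightly differently from Python's `zipfile`, so the numbers will not match the listings above exactly.

```python
import os
import zipfile


def directory_bytes(root):
    """Return the summed size in bytes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def zipped_bytes(root, zip_path):
    """Zip the contents of root with DEFLATE and return the archive size in bytes.

    zip_path should live outside root so the archive does not include itself.
    """
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                archive.write(path, arcname=os.path.relpath(path, root))
    return os.path.getsize(zip_path)
```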

So how should we proceed here? As the actual Lambda package size also depends on other factors, like the compression ratio or size differences between versions of AWS Lambda Powertools and pydantic, I'm not sure absolute values in the documentation will ever be accurate. What about a more fuzzy statement like:

This will increase the compressed package size by >10MB due to the Pydantic dependency.
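
As a quick check of that figure, the compressed sizes from the `pip install -t` run above (Linux on amd64) can be subtracted directly. The dictionary keys and the helper `added_mb` below are illustrative names only, and the result applies to this one platform and set of versions.

```python
# Compressed sizes in MB, copied from the `pip install -t` listing in this thread.
ZIP_MB = {
    "without_pydantic": 12,
    "with_pydantic": 25,
    "with_pydantic_no_binary": 13,
}


def added_mb(variant, sizes=ZIP_MB):
    """Return how many MB a variant adds on top of Powertools without pydantic."""
    return sizes[variant] - sizes["without_pydantic"]


print(added_mb("with_pydantic"))            # 13
print(added_mb("with_pydantic_no_binary"))  # 1
```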

@Dunedan
Contributor Author

Dunedan commented Mar 24, 2022

Ah, so you ran it once on macOS and once on arm64, while I ran it on Linux running on amd64.

Sorry, I implied you used AL2 with arm64, which of course doesn't have to be the case.

@am29d
Contributor

am29d commented Mar 24, 2022

What about a more fuzzy statement like:

This will increase the compressed package size by >10MB due to the Pydantic dependency.

This is good 👍.

@heitorlessa
Contributor

heitorlessa commented Mar 24, 2022 via email

@am29d
Contributor

am29d commented Mar 24, 2022

Alex, could we update SAR Extras App as part of this change so everyone benefits from the smaller package size?

Yep. The SAR pipeline is not public, so the new package size will be rolled out with the next release.

@Dunedan
Contributor Author

Dunedan commented Mar 25, 2022

I've now updated the statement about the package size. The documentation still lists outdated sizes for the Lambda layer; however, as its build scripts aren't public, I'm not sure what the sizes look like now. Feel free to update those numbers as well before merging.

@heitorlessa heitorlessa changed the title docs: Add how to install Pydantic without binaries docs(install): Add how to install Pydantic without binaries Apr 12, 2022
@heitorlessa heitorlessa changed the title docs(install): Add how to install Pydantic without binaries docs(install): instructions to reduce pydantic package size Apr 12, 2022
@heitorlessa heitorlessa merged commit 1cf630b into aws-powertools:develop Apr 12, 2022
@boring-cyborg

boring-cyborg bot commented Apr 12, 2022

Awesome work, congrats on your first merged pull request and thank you for helping improve everyone's experience!

@heitorlessa
Contributor

Thank you so much for this, @Dunedan!

@bevand10

I managed to shrink an overly large pydantic Lambda layer from 91MB to 6.8MB with one small adjustment: removing debug symbols from all shared-object binaries using strip.

$ cat requirements.txt
pydantic<1.9,>=1.7
$
$ mkdir -p src
$
$ python3 -m pip install -r requirements.txt -t ./src
...
$
$ du -sh src/pydantic
91M     src/pydantic
$
$ strip src/pydantic/*so
$
$ du -sh src/pydantic
6.8M    src/pydantic
$

Context:

  • Ubuntu 20.04.5
  • Python 3.8
  • strip GNU Binutils for Ubuntu 2.34
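
For build scripts written in Python, the same step could be sketched roughly as below. This is an illustration under assumptions, not a tested recipe: the helper name `strip_shared_objects` and its `dry_run` parameter are hypothetical, it searches subdirectories too (the shell command above only touches `src/pydantic/*so`), and it relies on a `strip` executable being on `PATH`.

```python
import pathlib
import shutil
import subprocess


def strip_shared_objects(root, dry_run=False):
    """Run `strip` on every *.so file under root and return the paths found.

    With dry_run=True the files are only listed, nothing is modified.
    """
    targets = sorted(pathlib.Path(root).rglob("*.so"))
    if dry_run:
        return targets
    if shutil.which("strip") is None:
        raise RuntimeError("the 'strip' tool was not found on PATH")
    for path in targets:
        # Modifies the file in place, like `strip <file>` in the shell.
        subprocess.run(["strip", str(path)], check=True)
    return targets
```

Calling `strip_shared_objects("src/pydantic", dry_run=True)` first shows which files would be touched before anything is changed.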


5 participants