fix: memory issue when push large bentos #4207

xianml · 2023-09-25T10:24:15Z

What does this PR address?

This PR adds an option to cap the maximum memory used when pushing models:

bentoml push facebook--opt-2.7b-service:905a4b602cda5c501f1b3a2650a4152680238254 --maxmemory 2

Test case 1:
pushing bento google--flan-t5-large-service, model size 2.92 GiB

no limit

time consumed: 3min 58s
memory usage: ~3 GB

maxmemory = 1

time consumed: 4min 25s
memory usage: <1 GB

Test case 2:
pushing bento google--flan-t5-large-service, model size 12.55 GiB

maxmemory = 3

time consumed: 4min 48s
memory usage: max ~4 GB

Fixes #(issue)

Before submitting:

Does the Pull Request follow Conventional Commits specification naming? Here is GitHub's
guide on how to create a pull request.
Does the code follow BentoML's code style, pre-commit run -a script has passed (instructions)?
Did you read through contribution guidelines and follow development guidelines?
Did your changes require updates to the documentation? Have you updated
those accordingly? Here are documentation guidelines and tips on writing docs.
Did you write tests to cover your changes?

pep8speaks · 2023-09-25T10:24:22Z

Hello @xianml! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2023-09-26 02:14:35 UTC

jianshen92 · 2023-09-29T06:56:35Z

What are the bugs within the current implementation?

xianml · 2023-10-18T09:38:19Z

What are the bugs within the current implementation?

context in https://bentoml-team.slack.com/archives/C02QLC8RB5W/p1695088745009929

In summary: bentoml push currently uses too much memory because it buffers the upload in io.BytesIO. This fix uses SpooledTemporaryFile instead, to cap the memory usage.
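For readers without access to the Slack thread, here is a minimal sketch of the idea, not the code in this PR. It relies only on the standard library's tempfile.SpooledTemporaryFile, which keeps data in memory until it exceeds max_size and then rolls over to a temporary file on disk. The helper name spool_chunks and the byte-based max_memory_bytes parameter are hypothetical, chosen for illustration.

```python
import tempfile


def spool_chunks(chunks, max_memory_bytes):
    """Buffer chunks in memory, spilling to disk past max_memory_bytes.

    Hypothetical helper for illustration only; it is not part of BentoML.
    """
    # Unlike io.BytesIO, which always grows in RAM, the spooled file
    # rolls over to a real temporary file once its size exceeds max_size.
    spool = tempfile.SpooledTemporaryFile(max_size=max_memory_bytes)
    for chunk in chunks:
        spool.write(chunk)
    # Rewind so the caller can stream the contents from the start.
    spool.seek(0)
    return spool


if __name__ == "__main__":
    data = [b"a" * 10, b"b" * 10]
    with spool_chunks(data, max_memory_bytes=16) as buffered:
        print(len(buffered.read()))
```

The returned object is file-like, so code that previously read from a BytesIO buffer should be able to read from it the same way, with the in-memory portion bounded by max_size.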

sauyon · 2023-10-20T14:08:47Z

Looks like unit tests are failing, maybe because of the requests change...?

Should we just use a SpooledTemporaryFile for everything?

aarnphm · 2023-10-20T17:51:58Z

@sauyon the tests need to patch httpx instead of requests. Let's open a separate PR to fix the tests?

xianml · 2023-10-24T02:49:44Z

Looks like unit tests are failing, maybe because of the requests change...?

Should we just use a SpooledTemporaryFile for everything?

I checked the logs of the failed unit tests; it seems our test cases are out of date.
For now, I only use SpooledTemporaryFile for pushing models. Shall we replace the rest one by one? I am not very confident about making a big code change.

src/bentoml_cli/bentos.py

aarnphm

LGTM. @Haivilo it is worth also testing this, and adding some test cases, once the deployment CLI/SDK is more mature.

xianml force-pushed the fix/push_oom branch 2 times, most recently from 7b94cf7 to e8490ef Compare September 26, 2023 02:14

xianml force-pushed the fix/push_oom branch 6 times, most recently from 2a5e36f to e9745a5 Compare October 20, 2023 07:26

xianml marked this pull request as ready for review October 20, 2023 08:36

xianml requested a review from a team as a code owner October 20, 2023 08:36

xianml requested review from ssheng and removed request for a team October 20, 2023 08:36

xianml force-pushed the fix/push_oom branch 2 times, most recently from 024c804 to ccb008b Compare October 20, 2023 08:39

xianml removed the request for review from ssheng October 25, 2023 10:14

aarnphm reviewed Nov 7, 2023

aarnphm mentioned this pull request Nov 8, 2023

infra: update to use Ruff formatter #4269

Merged

aarnphm approved these changes Nov 9, 2023

xianml force-pushed the fix/push_oom branch from 2cbde38 to e323f6e Compare November 13, 2023 09:48

aarnphm approved these changes Nov 13, 2023

xianml force-pushed the fix/push_oom branch from e323f6e to 83999de Compare November 27, 2023 07:38

xianml added 3 commits December 7, 2023 14:34

fix: memory issue when push large bentos

2108201

fix: add memory option for push

351a3f8

fix: del unused var

8985a30

fix: rename maxmemory to max_memory

f612788

xianml force-pushed the fix/push_oom branch from 83999de to f612788 Compare December 7, 2023 06:34
