fix: memory issue when push large bentos #4207

Open: wants to merge 4 commits into base: main

xianml
Contributor

@xianml xianml commented Sep 25, 2023

What does this PR address?

Adds support for limiting the maximum memory used when pushing models, via a --maxmemory option:

bentoml push facebook--opt-2.7b-service:905a4b602cda5c501f1b3a2650a4152680238254 --maxmemory 2

Test case 1:
pushing bento google--flan-t5-large-service, model size 2.92 GiB

  • no limit
    1. time consumed: 3 min 58 s
    2. memory usage: ~3 GB
  • maxmemory = 1
    1. time consumed: 4 min 25 s
    2. memory usage: < 1 GB

Test case 2:
pushing bento google--flan-t5-large-service, model size 12.55 GiB

  • maxmemory = 3
    1. time consumed: 4 min 48 s
    2. memory usage: peak ~4 GB
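
The numbers above come from real pushes. As a rough local illustration only (not the measurement method used for this PR), the sketch below compares the peak Python-level allocations of io.BytesIO and a capped SpooledTemporaryFile when the same 8 MiB is written to each. The helper name peak_bytes and the sizes are hypothetical, and tracemalloc reports interpreter allocations rather than process RSS, so absolute values will differ from the figures above.

```python
import io
import tempfile
import tracemalloc

CHUNK = b"\0" * (64 * 1024)   # 64 KiB per write
N_CHUNKS = 128                # 8 MiB in total
CAP = 1024 * 1024             # hypothetical 1 MiB in-memory cap


def peak_bytes(make_buffer):
    """Write the payload into a fresh buffer and return peak traced bytes."""
    tracemalloc.start()
    buf = make_buffer()
    for _ in range(N_CHUNKS):
        buf.write(CHUNK)
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    buf.close()
    return peak


# Everything stays in memory.
bytesio_peak = peak_bytes(io.BytesIO)

# Spills to a temporary file on disk once CAP bytes have been written.
spooled_peak = peak_bytes(lambda: tempfile.SpooledTemporaryFile(max_size=CAP))

print(f"BytesIO peak: {bytesio_peak / 2**20:.1f} MiB")
print(f"Spooled peak: {spooled_peak / 2**20:.1f} MiB")
```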

Fixes #(issue)

@pep8speaks

pep8speaks commented Sep 25, 2023

Hello @xianml! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2023-09-26 02:14:35 UTC

@xianml xianml force-pushed the fix/push_oom branch 2 times, most recently from 7b94cf7 to e8490ef Compare September 26, 2023 02:14
@jianshen92
Contributor

What are the bugs within the current implementation?

@xianml
Contributor Author

xianml commented Oct 18, 2023

> What are the bugs within the current implementation?

Context: https://bentoml-team.slack.com/archives/C02QLC8RB5W/p1695088745009929

In summary, bentoml push currently uses too much memory because it buffers the data in io.BytesIO. This fix uses SpooledTemporaryFile instead to cap memory usage.
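
A minimal sketch of the idea, not the PR's actual code: tempfile.SpooledTemporaryFile keeps data in memory until max_size bytes have been written, then transparently rolls over to a temporary file on disk, whereas io.BytesIO always holds everything in memory. The function name buffer_payload and the parameter max_memory_bytes are hypothetical, and _rolled is a private CPython attribute read here only to show when the rollover happens.

```python
import tempfile


def buffer_payload(chunks, max_memory_bytes):
    """Collect chunks in a buffer that uses at most ~max_memory_bytes of RAM.

    Hypothetical helper for illustration; returns the buffer rewound to 0.
    """
    spool = tempfile.SpooledTemporaryFile(max_size=max_memory_bytes)
    for chunk in chunks:
        spool.write(chunk)  # rolls over to disk once max_size is exceeded
    spool.seek(0)
    return spool


# 512 bytes with a 1024-byte cap: stays in memory.
small = buffer_payload([b"x" * 512], max_memory_bytes=1024)

# 2048 bytes with a 1024-byte cap: spills to a temporary file on disk.
large = buffer_payload([b"x" * 512] * 4, max_memory_bytes=1024)

# _rolled is private to CPython's implementation; demonstration only.
small_rolled = small._rolled
large_rolled = large._rolled
large_data = large.read()
```

Either way, the caller gets a file-like object it can read from, so code that previously consumed an io.BytesIO can usually consume the spooled file without other changes.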

@xianml xianml force-pushed the fix/push_oom branch 6 times, most recently from 2a5e36f to e9745a5 Compare October 20, 2023 07:26
@xianml xianml marked this pull request as ready for review October 20, 2023 08:36
@xianml xianml requested a review from a team as a code owner October 20, 2023 08:36
@xianml xianml requested review from ssheng and removed request for a team October 20, 2023 08:36
@xianml xianml force-pushed the fix/push_oom branch 2 times, most recently from 024c804 to ccb008b Compare October 20, 2023 08:39
@sauyon
Contributor

sauyon commented Oct 20, 2023

Looks like unit tests are failing, maybe because of the requests change...?

Should we just use a SpooledTemporaryFile for everything?

@aarnphm
Member

aarnphm commented Oct 20, 2023

@sauyon the tests need to patch httpx instead of requests. Let's open a separate PR to fix the tests?

@xianml
Contributor Author

xianml commented Oct 24, 2023

> Looks like unit tests are failing, maybe because of the requests change...?
>
> Should we just use a SpooledTemporaryFile for everything?

  • I checked the logs of the failing unit tests; it seems our test cases are out of date.
  • For now, I only use SpooledTemporaryFile for pushing models. Shall we replace the other usages one by one? I am not very confident about making a big code change.

@xianml xianml removed the request for review from ssheng October 25, 2023 10:14
Review thread on src/bentoml_cli/bentos.py (outdated, resolved)
Member

@aarnphm aarnphm left a comment

LGTM. @Haivilo it would be worth also testing this once the deployment CLI/SDK is more mature, and adding some test cases.
