avoid sending content past the last token during batching / batched streaming #57

Open
aniketmaurya opened this issue Apr 23, 2024 · 0 comments

aniketmaurya (Collaborator) commented Apr 23, 2024

Find a way to avoid sending tokens past the last token for each item in the batch. In practice this means trimming everything after the EOS token in `encode_response`. We should also create an example that demonstrates this.

> as a more immediate improvement, we need to find a way to avoid sending a lot of tokens past the last token for a particular item in the batch (i.e. we need to trim past the EOS in `encode_response`, let's open an issue and create an example about it)

Originally posted by @lantiga in #55 (comment)
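
A minimal sketch of what such an example might look like, assuming each batch item is a list of integer token ids and that EOS is a single known id. The names `EOS_ID`, `trim_past_eos`, `trim_batch`, and `stream_batch` are hypothetical helpers made up for illustration; they are not part of any existing API, and the real fix would live wherever `encode_response` is implemented.

```python
from typing import Iterable, Iterator, List, Optional, Sequence

EOS_ID = 2  # hypothetical end-of-sequence token id; depends on the tokenizer


def trim_past_eos(token_ids: Sequence[int], eos_id: int = EOS_ID) -> List[int]:
    """Return the tokens before the first EOS (the EOS itself is dropped)."""
    trimmed: List[int] = []
    for tok in token_ids:
        if tok == eos_id:
            break
        trimmed.append(tok)
    return trimmed


def trim_batch(batch: Sequence[Sequence[int]], eos_id: int = EOS_ID) -> List[List[int]]:
    """Non-streaming case: trim every item of a batch independently."""
    return [trim_past_eos(seq, eos_id) for seq in batch]


def stream_batch(
    steps: Iterable[Sequence[int]], eos_id: int = EOS_ID
) -> Iterator[List[Optional[int]]]:
    """Streaming case: `steps` yields one token per batch item per decoding step.

    Items that have already produced EOS get None instead of a token, and the
    stream stops as soon as every item in the batch has finished.
    """
    finished: Optional[List[bool]] = None
    for step in steps:
        if finished is None:
            finished = [False] * len(step)
        out: List[Optional[int]] = []
        for i, tok in enumerate(step):
            if finished[i] or tok == eos_id:
                finished[i] = True
                out.append(None)
            else:
                out.append(tok)
        if all(finished):
            return  # nothing left to send for any item
        yield out
```

In the streaming case the consumer would skip `None` entries rather than forward them to the client, so a short item in a batch no longer receives padding or garbage tokens while longer items keep generating.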

aniketmaurya changed the title from "find a way to avoid sending a lot of tokens past the last token for a particular item in the batch (i.e. we need to trim past the EOS in encode_response, let's open an issue and create an example about it)" to "avoid sending too much content during batched streaming" on Apr 23, 2024
aniketmaurya mentioned this issue on Apr 23, 2024 (5 tasks)
lantiga changed the title from "avoid sending too much content during batched streaming" to "avoid sending content past the last token during batching / batched streaming" on Apr 29, 2024