add openai spec v0 #76
Conversation
Thanks for assembling this, @aniketmaurya. I'm taking it for a spin.
I'm making progress here, slowly but surely :-) I'll be working more on this tomorrow and pass it on; there are still a few kinks I need to fully flesh out (feel free to explore, of course, @aniketmaurya).
@aniketmaurya, if you can look at the test_readme.py failures, that would be great.
Sure @lantiga, taking a look.
Codecov Report

Attention: Patch coverage is

Additional details and impacted files:

```
@@           Coverage Diff            @@
##            main     #76     +/-   ##
========================================
- Coverage     81%     81%     -0%
========================================
  Files          8      13      +5
  Lines        537     750    +213
========================================
+ Hits         434     606    +172
- Misses       103     144     +41
```
This now works:

```python
from transformers import pipeline

import litserve as ls


class HuggingFaceLitAPI(ls.LitAPI):
    def setup(self, device):
        self.generator = pipeline('text-generation', model='gpt2', device=device)

    def predict(self, prompt):
        out = self.generator(prompt)
        return out[0]["generated_text"]


if __name__ == '__main__':
    api = HuggingFaceLitAPI()
    server = ls.LitServer(api, accelerator='auto', spec=ls.specs.openai.OpenAISpec())
    server.run(port=8000)
```

```python
import requests

response = requests.post(
    "http://127.0.0.1:8000/v1/chat/completions",
    json={
        "model": "No models available",
        "messages": [
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Hello!"},
        ],
    },
)
print(f"Status: {response.status_code}\nResponse:\n {response.text}")
```
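The request and response shapes above can be exercised without a running server. Here is a sketch (the helper names `build_chat_request` and `extract_reply` are hypothetical, and the sample response is illustrative, following the standard OpenAI chat-completions schema) that builds the same payload and pulls the assistant text out of a response:

```python
def build_chat_request(user_content, system_content="You are a helpful assistant."):
    """Build a chat-completions payload like the one POSTed above."""
    return {
        "model": "No models available",
        "messages": [
            {"role": "system", "content": system_content},
            {"role": "user", "content": user_content},
        ],
    }


def extract_reply(response_json):
    """Pull the assistant text out of an OpenAI-style chat-completions response."""
    return response_json["choices"][0]["message"]["content"]


payload = build_chat_request("Hello!")

# Illustrative response in the OpenAI chat-completions shape (not captured
# from this server).
sample_response = {
    "choices": [{"message": {"role": "assistant", "content": "Hi there!"}}]
}
print(extract_reply(sample_response))  # -> Hi there!
```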
This PR allows exposing models through the OpenAI API spec.

Here's an example server.py and the corresponding client.py.
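The OpenAI chat-completions spec also defines a streaming mode (`"stream": true`); whether this PR wires that up isn't shown in the conversation, but for reference a client would consume it by parsing server-sent events. A minimal sketch, assuming the standard OpenAI streaming format (the sample chunk lines are illustrative, not captured from this server):

```python
import json


def iter_stream_content(sse_lines):
    """Yield text deltas from OpenAI-style chat-completion stream chunks.

    Each server-sent event line looks like 'data: {...}', and the
    stream ends with a 'data: [DONE]' sentinel.
    """
    for line in sse_lines:
        if not line.startswith("data: "):
            continue
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            yield delta["content"]


# Illustrative chunks in the OpenAI streaming format.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    'data: {"choices": [{"delta": {"content": "Hello"}}]}',
    'data: {"choices": [{"delta": {"content": " world"}}]}',
    "data: [DONE]",
]
print("".join(iter_stream_content(sample)))  # -> Hello world
```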