Support KFServing API V2 predict protocol #899
Comments
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Is this feature already supported?
@yonil7 Not yet. Is this feature a blocker for you? Would love to learn more.
I was just looking for a standard inference REST API and I was surprised to see there is no such standard. On the other hand, GCP Vertex AI
@yonil7 BentoML is trying to establish such a standard, and it takes a very different approach from KServe's protocol. Essentially, BentoML defines how an HTTP request/response is converted to and from the Python object that a data scientist's code takes as input to its inference function.
The KServe predict protocol is supported in KServe, Seldon Core, and Triton Inference Server. I think it makes sense to support this protocol so that users can switch between frameworks. We currently use Triton Inference Server to serve our own models for historical reasons, and we also want to offer BentoML to our collaborators to simplify serving and deployment. But our product relies on this predict protocol, so we can't offer BentoML for now.
About KFServing API V2 predict protocol: https://github.com/kubeflow/kfserving/tree/master/docs/predict-api/v2
The proposal is to add an option to the BentoML API server that enables a set of special endpoints compatible with the KFServing API V2 protocol. This would make the BentoML API server work much better with other tools in the Kubeflow ecosystem.
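For anyone unfamiliar with the protocol, here is a rough sketch of what a client would send to a V2-compatible inference endpoint. The `/v2/models/{model_name}/infer` path and the `inputs` fields (`name`, `shape`, `datatype`, `data`) follow the V2 predict protocol as I understand it. The helper names, the host, the model name `my_model`, and the tensor name `input-0` are all made up for illustration, and this is not an existing BentoML API.

```python
import json


def infer_shape(data):
    """Return the shape of a (rectangular) nested list, e.g. [[1, 2], [3, 4]] -> [2, 2]."""
    shape = []
    while isinstance(data, list):
        shape.append(len(data))
        if not data:
            break
        data = data[0]
    return shape


def flatten(data):
    """Flatten a nested list into a single list in row-major order."""
    if not isinstance(data, list):
        return [data]
    flat = []
    for item in data:
        flat.extend(flatten(item))
    return flat


def build_infer_request(tensor_name, data, datatype="FP32"):
    """Build a V2-style inference request body with a single input tensor."""
    return {
        "inputs": [
            {
                "name": tensor_name,
                "shape": infer_shape(data),
                "datatype": datatype,
                "data": flatten(data),
            }
        ]
    }


def infer_url(base_url, model_name):
    """Build the V2 inference URL for a model (base_url is a hypothetical server address)."""
    return f"{base_url.rstrip('/')}/v2/models/{model_name}/infer"


if __name__ == "__main__":
    # Hypothetical host, model name, and tensor name, for illustration only.
    body = build_infer_request("input-0", [[1.0, 2.0], [3.0, 4.0]])
    print("POST", infer_url("http://localhost:5000", "my_model"))
    print(json.dumps(body, indent=2))
```

Because KServe, Seldon Core, and Triton accept this same request shape, a client written this way would only need a different base URL to switch between servers, which is the main appeal of the feature.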