
Support KFServing API V2 predict protocol #899

Closed
parano opened this issue Jul 15, 2020 · 9 comments
Labels
feature — Feature requests or pull requests implementing a new feature
help-wanted — An issue currently lacks a contributor
Comments

@parano
Member

parano commented Jul 15, 2020

About KFServing API V2 predict protocol: https://github.com/kubeflow/kfserving/tree/master/docs/predict-api/v2

The Predict Protocol, version 2 is a set of HTTP/REST and GRPC APIs for inference / prediction servers. By implementing this protocol both inference clients and servers will increase their utility and portability by being able to operate seamlessly on platforms that have standardized around this protocol.

The protocol is composed of a required set of APIs that must be implemented by a compliant server. This required set of APIs is described in required_api.md. The GRPC proto specification for the required APIs is available.

This issue is to add an option to the BentoML API server that enables a set of endpoints compatible with the KFServing API V2 protocol. It would make the BentoML API server work much more smoothly with other tools in the Kubeflow ecosystem.
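For reference, a rough sketch of what a V2-style infer call against such an endpoint could look like (the model name, tensor name, shape, and host/port below are made up for illustration; the exact route and payload layout follow the V2 spec linked above):

```python
import requests

# Hypothetical request body: the V2 protocol describes inputs as named,
# typed, shaped tensors sent to POST /v2/models/<model_name>/infer.
payload = {
    "id": "example-request-1",
    "inputs": [
        {
            "name": "input__0",              # tensor name (model specific)
            "shape": [1, 4],                 # one instance with four features
            "datatype": "FP32",              # V2 datatype string
            "data": [5.1, 3.5, 1.4, 0.2],
        }
    ],
}

resp = requests.post(
    "http://localhost:5000/v2/models/my_model/infer",  # hypothetical BentoML server
    json=payload,
)
# The response is expected to carry an "outputs" list of named result tensors.
print(resp.json())
```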

@parano parano added the help-wanted and feature labels Jul 15, 2020
@yubozhao yubozhao added the MLH label Sep 25, 2020
@pncnmnp
Contributor

pncnmnp commented Oct 27, 2020

@yubozhao
Kishore and I are interested in contributing to this issue.

@parano
Member Author

parano commented Oct 27, 2020

Hi @pncnmnp @Kishore - it probably makes more sense to first implement gRPC support in BentoML (#703) before supporting KFServing V2's predict protocol.

@parano parano added this to Next major release in Roadmap via automation Nov 26, 2020
@parano parano moved this from Next major release to Mid-Long Term in Roadmap Nov 26, 2020
@stale

stale bot commented Feb 2, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Feb 2, 2021
@yubozhao yubozhao removed MLH labels Feb 2, 2021
@stale

stale bot commented Jun 2, 2021

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Jun 2, 2021
@parano parano removed the stale label Jun 14, 2021
@parano parano closed this as completed Jul 22, 2021
Roadmap automation moved this from Mid-Long Term to Done Jul 22, 2021
@yonil7

yonil7 commented Oct 23, 2021

Is this feature already supported?

@parano
Member Author

parano commented Oct 23, 2021

@yonil7 not yet, is this feature a blocker for you? would love to learn more

@yonil7

yonil7 commented Oct 24, 2021

I was just looking for a standard inference REST API and was surprised to see there is no such standard.
The closest I could find was the KServe predict protocol v2, but I think its infer API is unnecessarily complex and verbose.

On the other hand, the GCP Vertex AI predict API (which is almost identical to the GCP AI Platform predict API and the TensorFlow Serving predict API) offers the same capabilities with a much more elegant API: each instance/prediction, as well as the request parameters, is just a JSON value (number, null, bool, string, (nested) list, or (nested) object).
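To make the comparison concrete, here is a hedged side-by-side sketch of the two request shapes (all field values are illustrative only, and the "threshold" parameter is hypothetical):

```python
# KServe/KFServing V2 infer request: named, typed, shaped tensors.
kserve_v2_request = {
    "inputs": [
        {
            "name": "input__0",
            "shape": [2, 2],
            "datatype": "FP32",
            "data": [1.0, 2.0, 3.0, 4.0],
        }
    ],
}

# Vertex AI / TensorFlow Serving style request: a plain list of JSON
# instances plus optional free-form parameters.
vertex_style_request = {
    "instances": [[1.0, 2.0], [3.0, 4.0]],
    "parameters": {"threshold": 0.5},  # hypothetical parameter
}
```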

@parano
Member Author

parano commented Oct 25, 2021

@yonil7 BentoML is trying to establish such a standard, and it takes a very different approach compared to KServe's protocol. Essentially, BentoML defines how an HTTP request/response is converted to and from the Python object that a data scientist's code takes as input to its inference function.
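A minimal sketch of that approach, using the BentoML 1.x style service definition (the model tag, service name, and input shape are hypothetical; the IO descriptors are what map the HTTP request body to the Python object the function receives, and the return value back to the response):

```python
import bentoml
import numpy as np
from bentoml.io import NumpyNdarray

# Hypothetical saved model tag; to_runner() wraps it for serving.
runner = bentoml.sklearn.get("my_model:latest").to_runner()
svc = bentoml.Service("iris_classifier", runners=[runner])

# The NumpyNdarray descriptors declare how the JSON request body is decoded
# into the ndarray passed to predict(), and how its result is encoded back.
@svc.api(input=NumpyNdarray(), output=NumpyNdarray())
def predict(input_array: np.ndarray) -> np.ndarray:
    return runner.predict.run(input_array)
```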

@wolvever

wolvever commented Aug 9, 2023

The KServe predict protocol is supported in KServe, Seldon Core, and Triton Inference Server. I think it makes sense to support this protocol to allow users to switch between different frameworks. We are currently using Triton Inference Server to serve our own models for historical reasons, and we also want to provide BentoML to our collaborators to simplify serving and deployment. But our product relies on this predict protocol, so we can't offer BentoML right now.
