EXPERIMENTAL: gRPC support #2808
Conversation
Codecov Report
```
@@           Coverage Diff           @@
##             main    #2808   +/-   ##
=======================================
  Coverage   69.00%   69.00%
=======================================
  Files         122      122
  Lines       10162    10162
=======================================
  Hits         7012     7012
  Misses       3150     3150
```
A few minor things we can fix in follow-up PRs to gRPC.
Hello @aarnphm, Thanks for updating this PR. There are currently no PEP 8 issues detected in this PR. Cheers! Comment last updated at 2022-09-13 06:38:30 UTC
force-pushed from 3d0fffe to c0b5009
force-pushed from bd4ac4c to 7a9f276
force-pushed from 4bdb054 to 8dba459
:)
Commits:
- We will generate gRPC stubs via a separate script instead of setuptools
- update codespaces and devcontainers configuration
- ignore pyvenv
- chore: ignore virtualenv
- lock protobuf to 3.19.4
- interceptor: access logs, prometheus, opentelemetry (#2825)
- add options to assign alias for commands
- include gRPC options and dependencies
- enable alias to be parsed in docker container

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
:) x2
# BentoML gRPC
gRPC support is currently an experimental feature and is prone to bugs. We would love to hear more feedback from the community. Feel free to join our Slack for support, as well as file a bug report if you encounter any issues.
## `serve-grpc` CLI entrypoint

To use gRPC, use `serve-grpc` as an alternative to `bentoml serve`:

```bash
bentoml serve-grpc

bentoml serve-grpc --production
```
By default, `serve-grpc` doesn't enable reflection. To use reflection and take advantage of tools such as https://github.com/fullstorydev/grpcui or https://github.com/fullstorydev/grpcurl, pass `--enable-reflection`:
## Configuration
Under the `bentoml_configuration.yaml` file, the following fields are introduced under `api_server.grpc`:
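A minimal sketch of what this section of `bentoml_configuration.yaml` can look like, using only the fields discussed below (values are illustrative defaults):

```yaml
api_server:
  grpc:
    max_concurrent_streams: 100   # per-connection HTTP/2 stream limit
    maximum_concurrent_rpcs: ~    # null/None means no limit
```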
Some notable configuration fields:
- `max_concurrent_streams`: Maximum number of concurrent incoming streams to allow on an HTTP/2 connection. Defaults to 100; see https://httpwg.org/specs/rfc7540.html#rfc.section.5.1.2 for more details.

  A gRPC channel uses a single HTTP/2 connection, and concurrent calls are multiplexed on that connection. When the number of active calls reaches the connection stream limit, additional calls are queued in the client. Queued calls wait for active calls to complete before they are sent.

- `maximum_concurrent_rpcs`: The maximum number of concurrent RPCs this server will service before returning a `RESOURCE_EXHAUSTED` status, or `None` to indicate no limit.
## Improvement
This PR also introduces a refactor of the configuration. All HTTP-only fields are now available under `api_server.http`. These options include `cors` and `max_request_size`:
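For illustration, they could be set like so (the nested `enabled` key and the size value are assumptions, not the exact schema):

```yaml
api_server:
  http:
    cors:
      enabled: true
    max_request_size: 20971520   # bytes, illustrative
```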
Prometheus will run as a sidecar when using gRPC. The default port is set to `50052`, and the host to `0.0.0.0`. To change the port and host, customize `bentoml_configuration.yaml`:
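A sketch of such an override (the exact key names for the metrics sidecar are an assumption):

```yaml
api_server:
  grpc:
    metrics:
      host: 127.0.0.1
      port: 50052
```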
## CLI aliases

Given that we now have `serve-grpc` and `serve`, this PR introduces the alias `serve-http`, which maps to `serve`:
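For example:

```bash
bentoml serve-http                 # same as `bentoml serve`
bentoml serve-http --production    # same as `bentoml serve --production`
```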
Help message:
## Custom gRPC server support

Mount a custom gRPC server to `bentoml.Service`:
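A sketch of what this can look like, assuming a `svc.mount_grpc_servicer(...)` API; the `route_guide_*` generated stubs and servicer are hypothetical:

```python
import bentoml

# hypothetical stubs generated from a route_guide.proto
import route_guide_pb2
import route_guide_pb2_grpc
from route_guide_servicer import RouteGuideServicer

svc = bentoml.Service("iris_classifier")

# method name and keyword arguments are assumptions based on this description;
# service_names is what the health-checking probe (see below) uses.
svc.mount_grpc_servicer(
    RouteGuideServicer,
    add_servicer_fn=route_guide_pb2_grpc.add_RouteGuideServicer_to_server,
    service_names=[route_guide_pb2.DESCRIPTOR.services_by_name["RouteGuide"].full_name],
)
```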
Note that `service_names` is used by our health-checking probe to ensure liveness.

## Custom gRPC interceptor
BentoML comes with some default interceptors that provide support for access logging, OpenTelemetry, and Prometheus.
Note that the order of the interceptors is important here. The following graph demonstrates how the interceptor flow is added to `bentoml`'s gRPC server:

To add your own interceptor, simply use `svc.add_grpc_interceptor`:
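A sketch under the assumption that `add_grpc_interceptor` accepts the interceptor class itself (`svc` is the `bentoml.Service` instance):

```python
from grpc import aio


class AppendMetadataInterceptor(aio.ServerInterceptor):
    """A pass-through async interceptor (illustrative)."""

    def __init__(self, *, metadata=None):
        self.metadata = metadata

    async def intercept_service(self, continuation, handler_call_details):
        # inspect or modify handler_call_details here, then continue the RPC
        return await continuation(handler_call_details)


svc.add_grpc_interceptor(AppendMetadataInterceptor)
```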
If your interceptor requires additional arguments, you can do the following:
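For instance, with `functools.partial` (whether a partially-bound class is accepted is an assumption):

```python
from functools import partial

svc.add_grpc_interceptor(partial(AppendMetadataInterceptor, metadata=("x-usage", "demo")))
```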
### For `grpc.ServerInterceptor` (NOT STABLE YET)

All BentoML interceptors are async interceptors and inherit from `grpc.aio.ServerInterceptor`. If your interceptor is a sync interceptor (`grpc.ServerInterceptor`), you can do something like the following:
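For illustration, a minimal sync interceptor using the standard `grpc` API looks like this (the original snippet showing how BentoML wraps it is elided above, so only the class itself is sketched):

```python
import grpc


class MySyncInterceptor(grpc.ServerInterceptor):
    """A pass-through sync interceptor (illustrative)."""

    def intercept_service(self, continuation, handler_call_details):
        return continuation(handler_call_details)
```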
Then add it to the BentoService:
## `BentoService` Protobuf representation

### Request

### Response

## An example of `gRPCurl` request

### MacOS/Windows

### Linux

## A toy client implementation in `go`
## Containerize your gRPC BentoService

To add additional BentoML components, such as gRPC or tracing (zipkin, jaeger, etc.), a YAML dictionary field `python.components` is available to customize in your `bentofile.yaml`:
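A sketch of what this may look like (the exact shape of the `components` mapping is an assumption):

```yaml
service: "service:svc"
python:
  components:
    grpc: true
    tracing.jaeger: true
```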
This field currently follows BentoML's `extras_require`. It can be one of [`grpc`, `tracing`, `tracing.zipkin`, `tracing.jaeger`, `tracing.otlp`].

To run your docker container with gRPC, either provide the environment variable `BENTOML_USE_GRPC=true` to docker or use `serve-grpc` directly in the container:
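For example (`iris_classifier:latest` is a hypothetical image tag, and port 3000 is illustrative):

```bash
# option 1: toggle gRPC via the environment variable
docker run -it --rm -p 3000:3000 -e BENTOML_USE_GRPC=true iris_classifier:latest

# option 2: pass serve-grpc as the container command
docker run -it --rm -p 3000:3000 iris_classifier:latest serve-grpc --production
```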
### `--enable-<components>` flag

This PR also introduces the ability to containerize a previously built Bento with additional components via `bentoml containerize`:
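For example (the concrete flag spelling follows the `--enable-<components>` pattern above and is an assumption):

```bash
bentoml containerize iris_classifier:latest --enable-grpc
```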
## Known limitation

### `SO_REUSEPORT`
gRPC supports multiple workers out of the box. However, this depends on the socket implementation of `SO_REUSEPORT`, which is known to behave differently across systems (because of its hashing algorithm), and thus we can only guarantee gRPC to be functional on Linux-based systems.
This wouldn't affect a Bento container since it is running as a Linux container.
However, if users try to run `bentoml serve --production` locally on MacOS or any BSD system, the behaviour will not be the same.

### Windows support
We have enabled Windows support in development mode. Since the limitation lies with `SO_REUSEPORT` in production settings, Windows will not be supported with `bentoml serve --production --grpc` (as gRPC itself doesn't have good support for Windows).

Therefore, we advise our Windows users to use WSL instead. This gives Windows users access to Linux, where BentoML's gRPC integration is fully supported.
## Miscellaneous

- `@experimental` decorator for given functions that are not yet stable.

### `bentoml.testing`
Currently there are three `deployment_mode`s: [`standalone`, `docker`, `distributed`]. Note that on GitHub CI, BentoML currently runs the following matrix:

- [`standalone`, `distributed`, `docker`]
- [`standalone`, `distributed`]
- [`standalone`]

Note that for MacOS and Windows running locally, `docker` will also be included. `docker` is disabled on CI due to licensing restrictions.

To run each of the servers separately, one can use `run_bento_server_<deployment_mode>`, for example:
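A sketch of how such a helper might be used in a test; the module path, signature, and context-manager behaviour are assumptions:

```python
import pytest

from bentoml.testing.server import run_bento_server_standalone


@pytest.mark.asyncio
async def test_api_server(bento_path: str):
    # assumed to be an async context manager yielding the server address
    async with run_bento_server_standalone(bento_path) as host:
        ...  # send test requests against `host`
```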
### TODOs

- `bentoml.testing.grpc` is currently not thread-safe, hence forking is disabled. (FIXME)

## Course of actions

- Merge this PR into `main` as an experimental feature. (CI will fail since I cherry-picked the tests from this branch.)
- Follow up on `main` with the tests, which will fix CI; configuration will be added there.
## Kudos
Many thanks to our MLH Intern Sadab Hafiz for contributing to this feature.