docs: tracing and configuration
depends on bentoml#3052

Signed-off-by: Aaron Pham <29749331+aarnphm@users.noreply.github.com>
aarnphm committed Oct 5, 2022
1 parent 1533da5 commit c4c69a8
Showing 5 changed files with 227 additions and 35 deletions.
150 changes: 115 additions & 35 deletions docs/source/guides/configuration.rst
@@ -2,36 +2,60 @@
Configuring BentoML
===================

BentoML provides a configuration interface that allows you to customize the runtime
behaviour of your BentoService. This article highlights and consolidates the configuration
field definitions, as well as some recommendations for best practices when configuring
BentoML.

Configuration is best used for scenarios where the customizations can be specified once
and applied anywhere across your organization using BentoML.

BentoML comes with an out-of-the-box configuration that should work for most use cases.

However, advanced users who want to fine-tune the features BentoML has to offer can configure
such runtime variables and settings via a configuration file, commonly referred to as
``bentoml_configuration.yaml``.

.. note::

    This is not to be **confused** with ``bentofile.yaml``, which is used to define and
    package your :ref:`Bento 🍱 <concepts/bento:What is a Bento?>`.

    This configuration file is for BentoML runtime configuration.
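For contrast, a minimal ``bentofile.yaml`` might look like the sketch below; it describes how a Bento is
*built*, while ``bentoml_configuration.yaml`` controls how the server *runs*. The service path and packages
here are hypothetical:

.. code-block:: yaml
    :caption: `bentofile.yaml`

    # Build-time definition of a Bento (hypothetical example)
    service: "service:svc"
    include:
      - "*.py"
    python:
      packages:
        - scikit-learn
        - pandas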

Providing configuration during serve runtime
--------------------------------------------

BentoML configuration is a :wiki:`YAML` file whose location is specified via the ``BENTOML_CONFIG`` environment variable.

For example, given the following ``bentoml_configuration.yaml``, which specifies that the
server should use only 4 workers:

.. code-block:: yaml
    :caption: `~/bentoml_configuration.yaml`

    version: 2
    api_server:
      workers: 4

This configuration can then be passed to :ref:`bentoml serve <reference/cli:serve>` as shown
below:

.. code-block:: bash

    » BENTOML_CONFIG=~/bentoml_configuration.yaml bentoml serve iris_classifier:latest --production

.. note::

    Users only have to specify a partial configuration with the properties they wish to customize.
    BentoML will then fill in the rest of the configuration with the default values.

    In the example above, the number of API workers is overridden to 4.
    The remaining properties will take their default values.

.. seealso::

    :ref:`guides/configuration:Configuration fields`
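A partial configuration is not limited to a single field. For instance, a sketch that also raises the
request timeout and disables access logging could look like the following; the values are illustrative
and the field names follow the version 2 defaults shown later in this guide:

.. code-block:: yaml
    :caption: `~/bentoml_configuration.yaml`

    version: 2
    api_server:
      workers: 4
      timeout: 120        # allow slower requests (illustrative value)
      logging:
        access:
          enabled: false  # turn off access logging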


Overriding configuration with environment variables
@@ -63,25 +87,81 @@ Which the override configuration will be interpreted as:
:alt: Configuration override environment variable


Mounting configuration to containerized Bento
---------------------------------------------

To mount a configuration file to a containerized BentoService, you can use the
|volume_mount|_ option to mount the configuration file into the container and the
|env_flag|_ option to set the ``BENTOML_CONFIG`` environment variable:

.. code-block:: bash

    $ docker run --rm -v /path/to/configuration.yml:/home/bentoml/configuration.yml \
                 -e BENTOML_CONFIG=/home/bentoml/configuration.yml \
                 iris_classifier:6otbsmxzq6lwbgxi serve --production

Voila! You have successfully mounted a configuration file to your containerized BentoService.
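If you prefer Docker Compose, the same volume mount and environment variable can be expressed
declaratively. The sketch below makes a few assumptions: the service name, host port mapping, and the
local path to the configuration file are placeholders, while the image tag and command follow the
``docker run`` example above:

.. code-block:: yaml
    :caption: `docker-compose.yaml`

    version: "3.8"
    services:
      iris_classifier:
        image: iris_classifier:6otbsmxzq6lwbgxi
        command: serve --production
        ports:
          - "3000:3000"   # default HTTP port of the BentoML API server
        volumes:
          - ./bentoml_configuration.yaml:/home/bentoml/configuration.yml:ro
        environment:
          BENTOML_CONFIG: /home/bentoml/configuration.yml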

.. _env_flag: https://docs.docker.com/engine/reference/commandline/run/#set-environment-variables--e---env---env-file

.. |env_flag| replace:: ``-e``

.. _volume_mount: https://docs.docker.com/storage/volumes/#choose-the--v-or---mount-flag

.. |volume_mount| replace:: ``-v``


Configuration fields
--------------------

This section defines the configuration specs for BentoML.

BentoML configuration provides a versioned specification, which enables users to easily specify
and upgrade their configuration file as BentoML evolves. One can specify the version of
the configuration file by adding a top-level ``version`` field to ``bentoml_configuration.yaml``:

.. code-block:: yaml
    :caption: `~/bentoml_configuration.yaml`

    version: 2

.. epigraph::

    Note that ``version`` is not a required field, and BentoML will default to version 1 if
    it is not specified. This is mainly for backward compatibility with older configurations.
    However, we encourage users to always use the latest version of BentoML to ensure the best experience.

At the top level, BentoML configuration is split into two sections (a combined skeleton is sketched after this list):

* ``api_server``: Configuration for the BentoML API server.

* ``runners``: Configuration for BentoService runners.
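A minimal skeleton combining both sections follows. The ``batching`` field is shown only to illustrate
where runner options live and is an assumption on our part; refer to the defaults below for the
authoritative list of fields:

.. code-block:: yaml
    :caption: `~/bentoml_configuration.yaml`

    version: 2
    api_server:          # options for the API server go under this key
      workers: 4
    runners:             # options for runners go under this key
      batching:
        enabled: true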

.. tab-set::

    .. tab-item:: version 2
        :sync: v2

        .. include:: ./snippets/configuration/v2.rst

    .. tab-item:: version 1
        :sync: v1

        .. include:: ./snippets/configuration/v1.rst

.. dropdown:: `Expand for default configuration`
    :icon: code

    .. tab-set::

        .. tab-item:: version 2
            :sync: v2

            .. literalinclude:: ../../../bentoml/_internal/configuration/v2/defaults.yaml
                :language: yaml

        .. tab-item:: version 1
            :sync: v1

            .. literalinclude:: ../../../bentoml/_internal/configuration/v1/defaults.yaml
                :language: yaml
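The defaults above also cover the tracing exporters (Zipkin, Jaeger, and OTLP) documented in this change.
As a sketch, a partial configuration enabling the OTLP exporter could look like the following; the
endpoint and sample rate are placeholder values, and the nesting assumes the ``api_server.tracing``
layout from the version 2 defaults:

.. code-block:: yaml
    :caption: `~/bentoml_configuration.yaml`

    version: 2
    api_server:
      tracing:
        exporter_type: otlp       # one of the exporters listed in the defaults
        sample_rate: 1.0          # trace every request (placeholder value)
        otlp:
          protocol: grpc
          endpoint: http://localhost:4317  # assumed local OTLP collector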
3 changes: 3 additions & 0 deletions docs/source/guides/grpc.rst
@@ -1342,6 +1342,7 @@ A quick overview of the available configuration for gRPC:
``max_concurrent_streams``
^^^^^^^^^^^^^^^^^^^^^^^^^^

.. epigraph::

    :bdg-info:`Definition:` Maximum number of concurrent incoming streams to allow on an HTTP/2 connection.

By default we don't set a limit. HTTP/2 connections typically have a limit of `maximum concurrent streams <httpwg.org/specs/rfc7540.html#rfc.section.5.1.2>`_
@@ -1370,6 +1371,7 @@ on a connection at one time.
``maximum_concurrent_rpcs``
^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. epigraph::

    :bdg-info:`Definition:` The maximum number of concurrent RPCs this server will service before returning a ``RESOURCE_EXHAUSTED`` status.

By default we set this to ``None`` to indicate no limit, and let gRPC decide the limit.
@@ -1379,6 +1381,7 @@
``max_message_length``
^^^^^^^^^^^^^^^^^^^^^^

.. epigraph::

    :bdg-info:`Definition:` The maximum message length in bytes that can be received by or sent to the server.

By default we set this to ``-1`` to indicate no limit.
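Taken together, these limits can be tuned from the gRPC block of the configuration file. The sketch
below uses illustrative values only, not recommendations:

.. code-block:: yaml
    :caption: `~/bentoml_configuration.yaml`

    api_server:
      grpc:
        max_concurrent_streams: 128      # cap concurrent HTTP/2 streams
        maximum_concurrent_rpcs: 32      # cap in-flight RPCs
        max_message_length: 10485760     # ~10 MiB per message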
Empty file.
11 changes: 11 additions & 0 deletions docs/source/guides/snippets/configuration/v2.rst
@@ -0,0 +1,11 @@
``api_server``
^^^^^^^^^^^^^^

The following options are available for the ``api_server`` section:

+-------------+--------------------------------+----------------------------------------+
| Option      | Description                    | Default                                |
+-------------+--------------------------------+----------------------------------------+
| ``workers`` | Number of API workers to spawn | ``None`` (determined automatically by  |
|             |                                | BentoML)                               |
+-------------+--------------------------------+----------------------------------------+

``timeout``
98 changes: 98 additions & 0 deletions docs/source/guides/snippets/configuration/v2/api_server.yaml
@@ -0,0 +1,98 @@
api_server:
  workers: ~ # cpu_count() will be used when null
  timeout: 60
  backlog: 2048
  metrics:
    enabled: true
    namespace: bentoml_api_server
    duration:
      # https://github.com/prometheus/client_python/blob/f17a8361ad3ed5bc47f193ac03b00911120a8d81/prometheus_client/metrics.py#L544
      buckets:
        [
          0.005,
          0.01,
          0.025,
          0.05,
          0.075,
          0.1,
          0.25,
          0.5,
          0.75,
          1.0,
          2.5,
          5.0,
          7.5,
          10.0,
        ]
      min: ~
      max: ~
      factor: ~
  logging:
    access:
      enabled: true
      request_content_length: true
      request_content_type: true
      response_content_length: true
      response_content_type: true
      format:
        trace_id: 032x
        span_id: 016x
  ssl:
    enabled: false
    certfile: ~
    keyfile: ~
    keyfile_password: ~
    ca_certs: ~
    version: 17 # ssl.PROTOCOL_TLS_SERVER
    cert_reqs: 0 # ssl.CERT_NONE
    ciphers: TLSv1 # default ciphers
  http:
    host: 0.0.0.0
    port: 3000
    cors:
      enabled: false
      allow_origin: ~
      allow_credentials: ~
      allow_methods: ~
      allow_headers: ~
      allow_origin_regex: ~
      max_age: ~
      expose_headers: ~
  grpc:
    host: 0.0.0.0
    port: 3000
    max_concurrent_streams: ~
    maximum_concurrent_rpcs: ~
    max_message_length: -1
    reflection:
      enabled: false
    metrics:
      host: 0.0.0.0
      port: 3001
  tracing:
    exporter_type: ~
    sample_rate: ~
    excluded_urls: ~
    timeout: ~
    max_tag_value_length: ~
    zipkin:
      endpoint: ~
    jaeger:
      protocol: thrift
      collector_endpoint: ~
      thrift:
        agent_host_name: ~
        agent_port: ~
        udp_split_oversized_batches: ~
      grpc:
        insecure: ~
    otlp:
      protocol: ~
      endpoint: ~
      compression: ~
      http:
        certificate_file: ~
        headers: ~
      grpc:
        headers: ~
        insecure: ~
