Wrapping multiple services into a convenience endpoint #1355

gregd33 · 2020-12-22T16:14:49Z

Suppose I have services X, Y and Z. They each operate on the same type of input (e.g. a DataFrame or an Image) and they each produce the same type of output (e.g. JSON).

Now, suppose that these services are often (but not always) called together - i.e. most times, we want the results of X, Y and Z together. Of course, I could simply ask clients of the service to call all three separately and be done with it. However, if I wanted to provide a service that did this for them, what would be the best way to do this?

The options I've thought of are:

1. A new bento service XYZ that basically just reproduces X, Y and Z in one service. This is obviously a bad idea for a lot of reasons, such as code duplication and resource duplication (i.e. the same model running in many places). It has the advantage of being simple to implement, and it also reduces duplicated preprocessing (e.g. if each service preprocesses the image, we can do this just once).
2. A bento service XYZ that just calls X, Y and Z. In this case, I would essentially be implementing the same code the client would: take the input, send it to each service, and combine the outputs to send back. The main downside is that its configuration requires some hard-coding of where it expects X, Y and Z to be available, e.g. there is no way to specify at run time the ports I expect to see X, Y and Z on. I'm also not sure how things like batching would be handled.
3. Create the XYZ service and within it create separate endpoints for X, Y and Z (e.g. predict_X, predict_Y). This would eliminate code duplication and potentially resource duplication (if two endpoints use the same artifact, I'm hoping it is only loaded once). The main downside would be coupling X, Y and Z.
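
One possible way around the hard-coding concern in the second option is to read the service locations from environment variables when the wrapper starts. The sketch below is plain Python, not BentoML API; the variable names (`XYZ_X_URL` and so on), the default ports and the `/predict` paths are all assumptions made for illustration.

```python
import os

# Hypothetical defaults: ports and paths are placeholders, not values from the thread.
DEFAULT_URLS = {
    "X": "http://localhost:5001/predict",
    "Y": "http://localhost:5002/predict",
    "Z": "http://localhost:5003/predict",
}


def resolve_endpoints(env=None):
    """Return {service name: URL}, preferring XYZ_<NAME>_URL over the default."""
    env = os.environ if env is None else env
    return {
        name: env.get(f"XYZ_{name}_URL", default)
        for name, default in DEFAULT_URLS.items()
    }
```

Calling `resolve_endpoints()` with no argument reads the real process environment, so the wrapper can be pointed at different hosts or ports per deployment without a code change.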

Any thoughts/suggestions on best practice for such a problem?

parano · 2020-12-22T22:48:50Z

Maintainer

Have you looked at @withsmilo's post here? #1260

I think the idea is to provide a Router concept in BentoML that will allow users to compose results from multiple bento services. It is similar to your 2nd option, but we will provide some APIs to make it easier to build.

Another very common way to do this is to implement a simple "backend-for-frontend" layer, essentially a standalone API server that calls X, Y, Z under the hood. GraphQL is one of the popular ways to do it these days. If your team already has something like that, you can try to re-use it as well, assuming the number of models and the overall graph is not being changed frequently.
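
As a rough illustration of the "backend-for-frontend" idea, the sketch below sends one payload to several services concurrently and collects the results by name. It uses only the Python standard library; `post_json` and `fan_out` are hypothetical helper names, and it assumes each service accepts and returns JSON, which may not match a given deployment.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def post_json(url, payload, timeout=10.0):
    """POST payload as JSON and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def fan_out(payload, endpoints, post=post_json):
    """Send the same payload to every endpoint in parallel; return {name: result}."""
    with ThreadPoolExecutor(max_workers=max(1, len(endpoints))) as pool:
        futures = {
            name: pool.submit(post, url, payload)
            for name, url in endpoints.items()
        }
        # result() re-raises any exception from the worker, so one failing
        # service fails the whole call; a real layer might handle that per service.
        return {name: future.result() for name, future in futures.items()}
```

The `post` parameter is injectable so the combining logic can be exercised without running X, Y and Z, for example `fan_out(data, {"X": url_x, "Y": url_y}, post=fake_post)`.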

3 replies

parano Dec 22, 2020
Maintainer

In case you haven't seen it, you can also create a BentoService that contains multiple models and multiple endpoints if you are always going to deploy them together - see pseudo code below for your case:

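from bentoml import BentoService, api, artifacts
from bentoml.adapters import DataframeInput, JsonInput

# Pseudocode: Model_X, Model_Y and Model_Z stand in for the artifact
# declarations, exposed on self.artifacts as MODEL_X, MODEL_Y and MODEL_Z.
@artifacts([Model_X, Model_Y, Model_Z])
class MyService(BentoService):
    # One endpoint per model, for clients that only need a single result.
    @api(input=DataframeInput())
    def api_x(self, df):
        return self.artifacts.MODEL_X.predict(df)

    @api(input=DataframeInput())
    def api_y(self, df):
        return self.artifacts.MODEL_Y.predict(df)

    @api(input=DataframeInput())
    def api_z(self, df):
        return self.artifacts.MODEL_Z.predict(...)

    # Convenience endpoint: one request returns all three results.
    @api(input=JsonInput())
    def api_xyz(self, parsed_json):
        x_result = self.artifacts.MODEL_X.predict(parsed_json['x_input'])
        y_result = self.artifacts.MODEL_Y.predict(parsed_json['y_input'])
        z_result = self.artifacts.MODEL_Z.predict(parsed_json['z_input'])
        return [x_result, y_result, z_result]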

Also, see discussions here for more examples like this #928 (comment)
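
If X, Y and Z share a preprocessing step, a combined endpoint in this single-service layout can run it once and hand the result to each model. The sketch below shows only that logic in plain Python: `preprocess`, `StubModel` and `predict_xyz` are hypothetical names, and the stub models stand in for whatever the loaded artifacts would be.

```python
def preprocess(record):
    """Hypothetical shared step: scale raw 0-255 values into the range [0, 1]."""
    return [value / 255.0 for value in record]


class StubModel:
    """Stand-in for a loaded artifact that exposes predict()."""

    def __init__(self, weight):
        self.weight = weight

    def predict(self, features):
        return self.weight * sum(features)


def predict_xyz(record, models):
    """Preprocess once, then collect each model's prediction by name."""
    features = preprocess(record)  # shared work happens a single time
    return {name: model.predict(features) for name, model in models.items()}
```

Returning a dictionary keyed by model name, rather than a positional list, lets clients pick out a result without relying on ordering; that is a design choice of this sketch, not something the thread prescribes.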

gregd33 Dec 23, 2020
Author

For the other use cases, we actually do have something running using Dagster to do this backend processing. However, this infrastructure is a bit more complex/has some overhead to set up. Since my use case (4 in his list) is a bit simpler, I was wondering about the simpler possibilities.

And that code is sort of what I had in mind with my 3rd option. I think that seems most sensible for now, at least until the API functions to make the "composing" a bit easier are in place. The comments in #928 are perfect and exactly what I had in mind. Your examples of calling the localhost deployment are helpful - it didn't occur to me that passing the batch processing on would be that simple.

Thanks for the quick reply (as usual!).

gregd33 Jan 4, 2021
Author

@parano I have a follow-up question. I'm not sure whether it would be better to create a new thread, but for now I'll ask here. It is related to what is shown in #928, in particular calling the other service via requests. There, you show taking in an ImageInput and passing it directly to another endpoint. That doesn't seem to work: I pass the image_list directly as data (having changed the URL to be correct). I don't necessarily expect that it would, since ImageInput expects a file or bytes-like object, not a list of arrays. So what is the way to do this, if it's possible?
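
One direction that may work, given that ImageInput expects file-like bytes, is to re-encode each array into an image format first (for example with Pillow's `Image.fromarray(...).save(buffer, format="PNG")`) and send one encoded image per request. The sketch below covers only the request-building half, using the standard library. `build_image_request` is a hypothetical helper, and it is an assumption that the downstream endpoint accepts a raw image body with an `image/*` content type.

```python
import urllib.request


def build_image_request(url, image_bytes, content_type="image/png"):
    """Build a POST request whose body is one already-encoded image."""
    return urllib.request.Request(
        url,
        data=image_bytes,
        headers={"Content-Type": content_type},
        method="POST",
    )


def build_image_requests(url, encoded_images):
    """One request per encoded image, since each body carries a single file."""
    return [build_image_request(url, image) for image in encoded_images]
```

Each request can then be sent with `urllib.request.urlopen(request)`, and the responses combined in the same order as the input list.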
