```python
# https://github.com/bentoml/BentoML/blob/main/src/bentoml/_internal/frameworks/mlflow.py#L246

class MLflowPyfuncRunnable(bentoml.Runnable):
    # The only case where multi-threading may not be supported is when the user
    # defines a custom python_function MLflow model with pure Python code, but
    # there's no way of telling that from the MLflow model metadata. It should
    # be a very rare case, because most custom python_function models are likely
    # numpy code or model inference with pre/post-processing code.
    SUPPORTED_RESOURCES = ("cpu",)
    SUPPORTS_CPU_MULTI_THREADING = True
    ...
```
Have you tried this?
```python
class TestSentenceBert(_sbert_runnable):
    # Override the class attributes: "gpu" must be listed here to force
    # BentoML to schedule this runnable on a GPU.
    SUPPORTED_RESOURCES = ("gpu",)
    SUPPORTS_CPU_MULTI_THREADING = True

    def __init__(self):
        super().__init__()

    @bentoml.Runnable.method(batchable=True, batch_dim=0)
    def predict(self, sentences: List[str]):
        output = super().predict(sentences)
        return output
```
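The mechanism above is a plain class-attribute override: the subclass replaces the base runnable's `SUPPORTED_RESOURCES` so the scheduler sees `("gpu",)` instead of the CPU-only default. A minimal sketch without BentoML installed, using a hypothetical `MockRunnable` stand-in for `bentoml.Runnable` (the real API differs in detail):

```python
from typing import List


class MockRunnable:
    """Stand-in for bentoml.Runnable (assumption: illustrative only)."""
    SUPPORTED_RESOURCES = ("cpu",)
    SUPPORTS_CPU_MULTI_THREADING = True

    def predict(self, sentences: List[str]) -> List[str]:
        # Dummy "inference" so the sketch is runnable end to end.
        return [s.lower() for s in sentences]


class GpuSentenceBert(MockRunnable):
    # Overriding the class attribute is all that is needed: the scheduler
    # reads it from the subclass, while the base class stays unchanged.
    SUPPORTED_RESOURCES = ("gpu",)

    def predict(self, sentences: List[str]) -> List[str]:
        return super().predict(sentences)


print(GpuSentenceBert.SUPPORTED_RESOURCES)  # subclass value wins
print(MockRunnable.SUPPORTED_RESOURCES)     # base default is untouched
```

The same pattern applies to any other class-level runnable setting, such as `SUPPORTS_CPU_MULTI_THREADING`.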
Describe the bug
I saved a Bento with MLflow (sentence-transformers), and below is my service.py. How can I use the GPU? The model is not found, and I want to call model.to("cuda:0").
To reproduce
No response
Expected behavior
No response
Environment
Newest version