Serverless Model Serving with DJL

Overview

Hosting a deep learning (DL) model is often complicated and costly. AWS Lambda offers a low-cost, low-maintenance alternative, but deploying DL models on Lambda is challenging:

DL framework binaries are large, which makes them hard to package into a standalone zip file for AWS Lambda.
A Python DL framework usually pulls in multiple dependencies, so managing them is non-trivial.
DL model files are usually large, which makes packaging them difficult.

In this demo, we show how Deep Java Library (DJL) resolves these issues.

The Lambda function we create is an image classification application that predicts labels and their probabilities using a pre-trained PyTorch model.

Preparation

Install the AWS CLI on your system
Configure the AWS CLI with your credentials and region
Set up a Java environment

Build and deploy to AWS

Run the following command to deploy to AWS:

cd lambda-model-serving

# for Linux/macOS:
./gradlew deploy

# for Windows:
..\..\gradlew deploy

The command above creates:

an S3 bucket, whose name is stored in the bucket-name.txt file
a CloudFormation stack named djl-lambda, along with a template file named out.yml
a Lambda function named DJL-Lambda

Invoke Lambda Function

aws lambda invoke --function-name DJL-Lambda --payload '{"inputImageUrl":"https://djl-ai.s3.amazonaws.com/resources/images/kitten.jpg"}' build/output.json

cat build/output.json

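With AWS CLI version 2, you may also need to pass --cli-binary-format raw-in-base64-out to the invoke command so that the JSON payload is read as raw text.

The output is stored in build/output.json: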

[
  {
    "className": "n02123045 tabby, tabby cat",
    "probability": 0.48384541273117065
  },
  {
    "className": "n02123159 tiger cat",
    "probability": 0.20599405467510223
  },
  {
    "className": "n02124075 Egyptian cat",
    "probability": 0.18810519576072693
  },
  {
    "className": "n02123394 Persian cat",
    "probability": 0.06411759555339813
  },
  {
    "className": "n02127052 lynx, catamount",
    "probability": 0.01021555159240961
  }
]
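A ranked list like the one above is typically produced by converting the model's raw scores into probabilities with softmax and keeping the highest-scoring classes. The following is a minimal sketch of that idea using only the Java standard library. It is not the demo's or DJL's actual code; the class name TopKSketch, the Entry type, and the sample labels and scores are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical helper (not part of DJL): turns raw scores into a top-K list. */
final class TopKSketch {

    /** One result entry, mirroring the "className" and "probability" fields above. */
    static final class Entry {
        final String className;
        final double probability;

        Entry(String className, double probability) {
            this.className = className;
            this.probability = probability;
        }
    }

    /** Numerically stable softmax: subtract the maximum before exponentiating. */
    static double[] softmax(double[] logits) {
        double max = Double.NEGATIVE_INFINITY;
        for (double v : logits) {
            max = Math.max(max, v);
        }
        double sum = 0.0;
        double[] out = new double[logits.length];
        for (int i = 0; i < logits.length; i++) {
            out[i] = Math.exp(logits[i] - max);
            sum += out[i];
        }
        for (int i = 0; i < out.length; i++) {
            out[i] /= sum;
        }
        return out;
    }

    /** Returns the k most probable classes, highest probability first. */
    static List<Entry> topK(String[] labels, double[] logits, int k) {
        if (labels.length != logits.length) {
            throw new IllegalArgumentException("labels and logits must have the same length");
        }
        double[] probs = softmax(logits);
        List<Entry> entries = new ArrayList<>();
        for (int i = 0; i < labels.length; i++) {
            entries.add(new Entry(labels[i], probs[i]));
        }
        entries.sort(Comparator.comparingDouble((Entry e) -> e.probability).reversed());
        return new ArrayList<>(entries.subList(0, Math.min(k, entries.size())));
    }

    public static void main(String[] args) {
        // Made-up labels and scores, for illustration only.
        String[] labels = {"tabby", "tiger cat", "lynx"};
        double[] logits = {2.0, 1.0, -1.0};
        for (Entry e : topK(labels, logits, 2)) {
            System.out.println(e.className + " " + e.probability);
        }
    }
}
```

In a real handler, the labels and scores would come from the model's output rather than from hard-coded arrays.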

Clean up

Use the following command to clean up resources created in your AWS account:

./cleanup.sh

Design choices

Minimize package size

DJL can download the deep learning framework at runtime through an auto-detection dependency, so the framework binaries do not need to be bundled.

As a result, the final .zip file is smaller than 3 MB. The extracted native library files are stored in the /tmp folder.
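The download-at-runtime behavior described above can be pictured as a cache-on-first-use pattern: fetch the file once, store it under a cache directory such as /tmp, and reuse it afterwards. The sketch below illustrates only that pattern with the Java standard library. It is not DJL's implementation; NativeCacheSketch, Source, fetchOnce, and the file name libexample.so are hypothetical names, and an in-memory byte array stands in for a real download.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical illustration of download-once caching; not DJL's actual code. */
final class NativeCacheSketch {

    /** Where the bytes come from; in practice this could open a remote URL. */
    @FunctionalInterface
    interface Source {
        InputStream open() throws IOException;
    }

    /** Returns the cached file, fetching it from the source only if it is missing. */
    static Path fetchOnce(Path cacheDir, String fileName, Source source) {
        Path target = cacheDir.resolve(fileName);
        if (Files.exists(target)) {
            return target; // already cached: no download needed
        }
        try {
            Files.createDirectories(cacheDir);
            // Write to a temporary file first so a failed download never
            // leaves a half-written file under the final name.
            Path partial = Files.createTempFile(cacheDir, fileName, ".part");
            try {
                try (InputStream in = source.open()) {
                    Files.copy(in, partial, StandardCopyOption.REPLACE_EXISTING);
                }
                Files.move(partial, target, StandardCopyOption.REPLACE_EXISTING);
            } finally {
                Files.deleteIfExists(partial);
            }
            return target;
        } catch (IOException e) {
            throw new UncheckedIOException("Failed to cache " + fileName, e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path cacheDir = Files.createTempDirectory("native-cache-sketch");
        byte[] fakeLibrary = {1, 2, 3}; // stand-in for real library bytes

        Path first = fetchOnce(cacheDir, "libexample.so",
                () -> new ByteArrayInputStream(fakeLibrary));
        // The second call finds the cached file, so this source is never opened.
        Path second = fetchOnce(cacheDir, "libexample.so",
                () -> { throw new IOException("source should not be reopened"); });

        System.out.println(first.equals(second));
    }
}
```

Because the cache lives in /tmp, its contents count against the /tmp storage limit of the Lambda function.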

Model loading

The DJL ModelZoo design allows you to deploy a model in three ways:

Bundle the model in the .zip file
Load models from your own model zoo
Load models from an S3 bucket. DJL supports the SageMaker trained model (.tar.gz) format.

In this demo, we use DJL's built-in PyTorch model zoo. By default, it uses the resnet18 model, which the following invocation exercises:

aws lambda invoke --function-name DJL-Lambda --payload '{"inputImageUrl":"https://djl-ai.s3.amazonaws.com/resources/images/kitten.jpg"}' build/output.json

Limitations

AWS Lambda has the following limitations:

GPU instances are not yet available
/tmp storage is limited to 512 MB by default
Startup is slow (cold start) if the function is not invoked frequently
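One common way to soften the slow-startup limitation is to load expensive resources once per Lambda execution environment and reuse them across warm invocations, since static state survives between calls to the same environment. The sketch below shows that lazy-initialization pattern with the Java standard library only. It is an assumption about how a handler could be structured, not the demo's actual handler; WarmCacheSketch is a hypothetical name and the string "model" stands in for a real loaded model.

```java
import java.util.function.Supplier;

/** Hypothetical lazy holder: runs the loader once, then reuses the result. */
final class WarmCacheSketch<T> {

    private final Supplier<T> loader;
    private volatile T value;

    WarmCacheSketch(Supplier<T> loader) {
        this.loader = loader;
    }

    /** Returns the cached value, running the (slow) loader only on first use. */
    T get() {
        T local = value;
        if (local == null) {
            synchronized (this) {
                local = value;
                if (local == null) {
                    local = loader.get();
                    value = local;
                }
            }
        }
        return local;
    }

    // A static field like this would live for as long as the execution
    // environment stays warm. The loader here is a placeholder.
    private static final WarmCacheSketch<String> MODEL =
            new WarmCacheSketch<>(() -> {
                System.out.println("loading model (slow, happens once)");
                return "model";
            });

    public static void main(String[] args) {
        // Simulate two invocations hitting the same warm environment.
        System.out.println(MODEL.get());
        System.out.println(MODEL.get());
    }
}
```

This does not remove the cost of the first (cold) invocation; it only avoids paying it again on later requests served by the same environment.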
