Serving PyTorch Models with Kaptain SDK
Kaptain supports PyTorch model serving with TorchServe. This requires a specific project file organization so that a previously exported model can be loaded and served. This page describes the TorchServe-specific configuration in detail.
Defining PyTorch Models
TorchServe requires a model definition and the state_dict in order to load and initialize a model for serving.
From the project organization perspective this means that the trainer code and the model class should be defined
in different files: for example, model.py for the model class, and trainer.py for the trainer code.
The trainer code imports the model Python module so that the same model definition is used for training, tuning, and serving. Check the PyTorch notebook tutorial for examples of the model and the trainer files.
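For illustration, a minimal sketch of such a split is shown below. The class name, layer sizes, and elided training logic are hypothetical and stand in for whatever your project actually uses:

# model.py - model definition only, importable by both the trainer and TorchServe
import torch.nn as nn

class MnistClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Flatten(),
            nn.Linear(28 * 28, 128),
            nn.ReLU(),
            nn.Linear(128, 10),
        )

    def forward(self, x):
        return self.layers(x)

# trainer.py - training code that reuses the shared model definition
from model import MnistClassifier

model = MnistClassifier()
# ... training/tuning logic goes here ...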
It is important to include the model file as a dependency via Model.extra_files so that it is present in the
target Docker image and is available for import. For example:
model = Model(
    # <other properties>
    main_file="trainer.py",
    extra_files=["model.py"],
)
Transfer Learning
For cases when a model extends another pre-trained model (for example,
from torchvision.models), the model class should
extend an existing model and override its __init__
function to introduce desired customizations. It is
also important to make sure that all the required imports are in place. For example:
import torch.nn as nn
from torchvision.models.resnet import ResNet, BasicBlock

class ImageClassifier(ResNet):
    def __init__(self):
        super(ImageClassifier, self).__init__(BasicBlock, [2, 2, 2, 2], num_classes=10)
        self.fc = nn.Sequential(
            nn.Linear(512 * BasicBlock.expansion, 128),
            nn.ReLU(),
            nn.Dropout(0.2),
            nn.Linear(128, 10),
            nn.LogSoftmax(dim=1),
        )
Configuration options for model serving
Kaptain Model expects serving options to be specified as a dictionary of string keys and values.
Example:
model = Model(
    id="dev/mnist",
    name="MNIST",
    description="MNIST Model",
    version="0.0.1",
    framework="pytorch",
    framework_version="1.7.1",
    main_file="trainer.py",
    extra_files=["model.py"],
    base_image="base_repository/image:tag",
    image_name="target_repository/image:tag",
    serving_config={
        "model_file": "path/to/model.py",
        "handler": "image_classifier",
        "serialized_file": "path/to/model_state_dict.pt",
        "extra_files": "path/to/file_a.py,path/to/file_b.py",
        "requirements_file": "path/to/serving_requirements.txt",
        "config_properties": "path/to/config.properties",
    },
    requirements="requirements.txt",
)
See the detailed explanation of the properties below.
MAR model packaging
TorchServe requires models to be packaged in a MAR (Model Archive) format. Kaptain SDK produces a MAR artifact and supports the same properties as the MAR Archiver. Visit model-archiver#artifact-details for more details about the contents of the archive.
The following serving_config properties are used to provide paths to the artifacts:
- model_file - (mandatory) a path to the file with the model definition (for example, model.py).
- handler - (mandatory) either the name of one of TorchServe's built-in handlers, or a path to a file with custom inference logic (see the sketch after this list). Examples: image_classifier (built-in), handler.py (custom). See the TorchServe documentation for details.
- serialized_file - (optional) a path to the state_dict saved locally (for example, model.pt). If not provided, the SDK will download the model state_dict exported at the train or tune phase.
- extra_files - (optional) auxiliary files required for inference (for example, utils.py). If not set explicitly, the SDK will use the Model.extra_files value.
- requirements_file - (optional) a path to a file with the Python dependencies required for inference (for example, serving_requirements.txt). If not set explicitly, the SDK will use the Model.requirements value.
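A custom handler referenced via the handler property is an ordinary Python file packaged into the MAR archive. The following is a minimal, hypothetical sketch assuming TorchServe's standard BaseHandler interface; the class name and the preprocessing/postprocessing logic are purely illustrative:

import torch
from ts.torch_handler.base_handler import BaseHandler

class CustomClassifierHandler(BaseHandler):
    def preprocess(self, data):
        # Each request in the batch carries its payload under "data" or "body".
        rows = [row.get("data") or row.get("body") for row in data]
        return torch.as_tensor(rows, dtype=torch.float32)

    def postprocess(self, inference_output):
        # TorchServe expects one response entry per request in the batch.
        return inference_output.argmax(dim=1).tolist()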
TorchServe configuration options
TorchServe requires a config.properties file to be provided alongside the model archive in order to load
the model and start serving it. An example config.properties file looks as follows:
inference_address=http://0.0.0.0:8085
management_address=http://0.0.0.0:8081
number_of_netty_threads=4
service_envelope=kfserving
install_py_dep_per_model=true
job_queue_size=10
model_store=/mnt/models/model-store
model_snapshot={"name": "startup.cfg", "modelCount": 1, "models": {"<YOUR_MODEL_NAME>": {"1.0": {"defaultVersion": true, "marName": "<YOUR_MODEL_NAME>.mar", "minWorkers": 1, "maxWorkers": 5, "batchSize": 1, "maxBatchDelay": 5000, "responseTimeout": 120}}}}
Consult the TorchServe documentation for the full list of supported properties.
Kaptain SDK generates the config.properties file with a model name and MAR name unique to the model during the
deploy step. To customize the server configuration, provide your own config.properties file as described below.
It is possible to generate a default config.properties specific to your Model by running the
following commands:
import os
from kaptain.model.models import Model
from kaptain.platform.serving.packaging.artifact_builder import TorchServingArtifactBuilder
import kaptain.utilities as util
...
# Kaptain Model instantiation
model = Model(...)
...
model_name = util.sanitize_string(model.id)
TorchServingArtifactBuilder.write_config_properties(model_name, f"{model_name}.mar", os.getcwd())
The code above will generate a valid config.properties
file and write it to the current working
directory so it can be modified and provided to the model server.
To provide a path to a custom config.properties
file, add config_properties
to the
serving_config
dictionary:
serving_config={
    "config_properties": "path/to/config.properties",
}