
We followed these steps:

  1. Trained 5 TensorFlow models on a local machine using 5 different training sets.
  2. Saved them in .h5 format.
  3. Packaged them as tar.gz archives (Model1.tar.gz, ..., Model5.tar.gz) and uploaded them to an S3 bucket.
  4. Successfully deployed a single model to an endpoint using the following code:
from sagemaker.tensorflow import TensorFlowModel
sagemaker_model = TensorFlowModel(model_data = tarS3Path + 'model{}.tar.gz'.format(1),
                                  role = role, framework_version='1.13',
                                  sagemaker_session = sagemaker_session)
predictor = sagemaker_model.deploy(initial_instance_count=1,
                                   instance_type='ml.m4.xlarge')
predictor.predict(data.values[:,0:])

The output was: {'predictions': [[153.55], [79.8196], [45.2843]]}
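For reference, the response dict above can be unpacked into flat values like this (a minimal sketch; the sample numbers are taken from the output shown):

```python
# Unpack the TensorFlow Serving response shown above into flat floats.
response = {'predictions': [[153.55], [79.8196], [45.2843]]}

flat = [row[0] for row in response['predictions']]
print(flat)  # [153.55, 79.8196, 45.2843]
```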

Now the problem is that we cannot issue 5 separate deploy statements and maintain 5 separate endpoints, one per model. To avoid this, we tried two approaches:

i) Used SageMaker's MultiDataModel

from sagemaker.multidatamodel import MultiDataModel
sagemaker_model1 = MultiDataModel(name = "laneMultiModels", model_data_prefix = tarS3Path,
                                  model = sagemaker_model, # the same sagemaker_model trained above
                                  #role = role, #framework_version='1.13',
                                  sagemaker_session = sagemaker_session)
predictor = sagemaker_model1.deploy(initial_instance_count=1,
                                   instance_type='ml.m4.xlarge')
predictor.predict(data.values[:,0:], target_model='model{}.tar.gz'.format(1))

This failed at the deploy stage with the following error: An error occurred (ValidationException) when calling the CreateModel operation: Your Ecr Image 763104351884.dkr.ecr.us-east-2.amazonaws.com/tensorflow-inference:1.13-cpu does not contain required com.amazonaws.sagemaker.capabilities.multi-models=true Docker label(s).
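The label named in that error is what SageMaker checks before allowing `Mode: MultiModel`; the TF 1.13 serving image predates multi-model support, while newer TensorFlow inference images (2.x) carry the label. A minimal sketch of a container definition that uses such an image (the image tag, region, and S3 prefix here are assumptions — verify the tag exists in your region's registry):

```python
# Container definition for a multi-model endpoint backed by a TF 2.x
# serving image. The image tag and S3 prefix below are assumptions;
# check that the tag exists for your region before using it.
tar_s3_prefix = "s3://my-bucket/models/"  # hypothetical prefix holding model{1..5}.tar.gz

container = {
    "Image": "763104351884.dkr.ecr.us-east-2.amazonaws.com/tensorflow-inference:2.2.0-cpu",
    "ModelDataUrl": tar_s3_prefix,   # a prefix, not a single archive
    "Mode": "MultiModel",            # requires the multi-models Docker label
}
print(container["Mode"])  # MultiModel
```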

ii) Created the endpoint manually

import boto3
import botocore
import sagemaker
sm_client = boto3.client('sagemaker')
image = sagemaker.image_uris.retrieve('knn','us-east-2')
container = {
    "Image": image,
    "ModelDataUrl": tarS3Path,
    "Mode": "MultiModel"
}
# Note: replacing 'knn' with 'tensorflow' raises an error at this stage itself
response = sm_client.create_model(
              ModelName        = 'multiple-tar-models',
              ExecutionRoleArn = role,
              Containers       = [container])
response = sm_client.create_endpoint_config(
    EndpointConfigName = 'multiple-tar-models-endpointconfig',
    ProductionVariants=[{
        'InstanceType':        'ml.t2.medium',
        'InitialInstanceCount': 1,
        'InitialVariantWeight': 1,
        'ModelName':            'multiple-tar-models',
        'VariantName':          'AllTraffic'}])
response = sm_client.create_endpoint(
              EndpointName       = 'tarmodels-endpoint',
              EndpointConfigName = 'multiple-tar-models-endpointconfig')

The endpoint could not be created with this approach either.


2 Answers


I was also looking for an answer to this, and after several days of trying, my friend and I managed to get it working. I attach a code snippet that we used; you may need to modify it for your use case:

image = '763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-inference:2.2.0-cpu'
container = { 
    'Image': image,
    'ModelDataUrl': model_data_location,
    'Mode': 'MultiModel'
}

sagemaker_client = boto3.client('sagemaker')

# Create Model
response = sagemaker_client.create_model(
              ModelName = model_name,
              ExecutionRoleArn = role,
              Containers = [container])

# Create Endpoint Configuration
response = sagemaker_client.create_endpoint_config(
    EndpointConfigName = endpoint_configuration_name,
    ProductionVariants=[{
        'InstanceType': 'ml.t2.medium',
        'InitialInstanceCount': 1,
        'InitialVariantWeight': 1,
        'ModelName': model_name,
        'VariantName': 'AllTraffic'}])

# Create Endpoint
response = sagemaker_client.create_endpoint(
              EndpointName = endpoint_name,
              EndpointConfigName = endpoint_configuration_name)

# Invoke Endpoint
sagemaker_runtime_client = boto3.client('sagemaker-runtime')

content_type = "application/json" # The MIME type of the input data in the request body.
accept = "application/json" # The desired MIME type of the inference in the response.
payload = json.dumps({"instances": [1.0, 2.0, 5.0]}) # Payload for inference.
target_model = 'model1.tar.gz'


response = sagemaker_runtime_client.invoke_endpoint(
    EndpointName=endpoint_name, 
    ContentType=content_type,
    Accept=accept,
    Body=payload,
    TargetModel=target_model,
)

response
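The `invoke_endpoint` response body is a streaming JSON payload, so displaying `response` alone does not show the predictions. A small helper to unpack it (the function name is ours, and the sample bytes mimic the TensorFlow Serving response format shown in the question):

```python
import json

def parse_predictions(body_bytes):
    """Decode a TensorFlow Serving JSON response body into a list of rows."""
    return json.loads(body_bytes)["predictions"]

# In real use: parse_predictions(response["Body"].read())
sample = b'{"predictions": [[153.55], [79.8196], [45.2843]]}'
print(parse_predictions(sample))  # [[153.55], [79.8196], [45.2843]]
```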

Also, make sure your model tar.gz files have this structure:

└── model1.tar.gz
     └── <version number>
         ├── saved_model.pb
         └── variables
            └── ...
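One way to build an archive with that layout (a sketch; the paths and the version directory `1` are illustrative, and the `touch` calls merely stand in for a real SavedModel export):

```shell
# Build model1.tar.gz with the layout SageMaker's TF serving container
# expects: a numeric version directory at the archive root.
mkdir -p 1/variables
# Export your SavedModel here so that 1/saved_model.pb and 1/variables/*
# exist; the touch calls below only simulate that step.
touch 1/saved_model.pb 1/variables/variables.index
tar -czvf model1.tar.gz 1/
tar -tzf model1.tar.gz   # verify the version dir is at the top level
```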

more info regarding this

  • After using the above piece I am getting "ModelError". The error message states: "Failed to establish a new connection: [Errno 111] Connection refused". IAM policies for the lambda seem fine. Can you suggest what can be done? "{ "Sid": "VisualEditor0", "Effect": "Allow", "Action": "sagemaker:InvokeEndpoint", "Resource": "*" }" is added in the lambda policy JSON. Also, I can invoke endpoints when I deploy them individually, as mentioned in the question's code – Subh2608 Sep 18 '20 at 08:54
  • I can invoke endpoints while I deploy them individually using sagemaker built in tensorflow instead of docker. Need your suggestions.. – Subh2608 Sep 18 '20 at 09:01
  • I'm getting the same error. How did you fix this @Subh2608? – Khubaib Raza Jun 27 '21 at 19:16
  • I deployed the models separately on multiple endpoints as a workaround – Subh2608 Jun 29 '21 at 08:38

Simply deploy a multi-model endpoint.

https://docs.aws.amazon.com/sagemaker/latest/dg/multi-model-endpoints.html
