I'm using AWS SageMaker to build my model, and I want to store the model in S3 for later use. How do you save a model to S3 from Amazon SageMaker? I know this seems trivial, but I didn't understand the sources/documentation I've read.
2 Answers
You can use pickle (or any other serialization format) together with the boto3 library to save your model to S3.
To save your model as a pickle file you can use:
import pickle
import numpy as np
from sklearn.linear_model import LinearRegression
X = np.array([[1, 1], [1, 2], [2, 2], [2, 3]])
y = np.dot(X, np.array([1, 2])) + 3
model = LinearRegression().fit(X, y)
# save the model to disk
pkl_filename = 'pickle_model.pkl'
with open(pkl_filename, 'wb') as file:
    pickle.dump(model, file)
To write the pickled model directly to S3 instead of the SageMaker instance's local disk:
# to save the model to s3
import boto3
# If ~/.aws/credentials is missing, pass AWS credentials explicitly:
# access_key_id = '...'
# secret_access_key = '...'
# session = boto3.Session(
#     aws_access_key_id=access_key_id,
#     aws_secret_access_key=secret_access_key,
# )
# s3_resource = session.resource('s3')
s3_resource = boto3.resource('s3')
bucket = 'your_bucket'
key = 'pickle_model.pkl'
# serialize the model to bytes in memory and upload them as the object body
pickle_byte_obj = pickle.dumps(model)
s3_resource.Object(bucket, key).put(Body=pickle_byte_obj)
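Since the goal is to reuse the model later, it may help to see the upload paired with the matching download. The following is a minimal sketch, not a definitive implementation: the helper names `save_model_to_s3` and `load_model_from_s3` are hypothetical, and it assumes a boto3 S3 client (`boto3.client('s3')`) rather than the resource used above. The bucket and key are the same placeholders as before.

```python
import pickle


def save_model_to_s3(s3_client, model, bucket, key):
    """Pickle `model` in memory and upload the bytes to s3://bucket/key."""
    s3_client.put_object(Bucket=bucket, Key=key, Body=pickle.dumps(model))


def load_model_from_s3(s3_client, bucket, key):
    """Download s3://bucket/key and unpickle it back into a Python object."""
    response = s3_client.get_object(Bucket=bucket, Key=key)
    # response["Body"] is a streaming object; read() returns the raw bytes.
    return pickle.loads(response["Body"].read())


# Usage with a real client (requires boto3 and AWS credentials):
# import boto3
# s3_client = boto3.client("s3")
# save_model_to_s3(s3_client, model, "your_bucket", "pickle_model.pkl")
# restored = load_model_from_s3(s3_client, "your_bucket", "pickle_model.pkl")
```

Only unpickle objects from buckets you trust, because loading a pickle can execute arbitrary code.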
To expand on the other answer: I've run into this problem several times myself, so I've built an open source library, modelstore, that automates this step. It also does other things, such as versioning the model and storing it in S3 under structured paths.
The code to use it looks like this (there is a full example here):
from modelstore import ModelStore
from sklearn.linear_model import LinearRegression
# Train your model, as usual (X and y are your training data)
model = LinearRegression()
model.fit(X, y)
# Create a model store that points to your s3 bucket
bucket_name = "your-bucket-name"
modelstore = ModelStore.from_aws_s3(bucket_name)
# Upload your model
model_domain = "your-model-domain"
modelstore.sklearn.upload(model_domain, model=model)
This dumps your model to a file, creates a tar archive from it, and uploads that archive to S3 for you. The function returns metadata as a dictionary, which includes the version ID for your model.