
I have an ML model (trained with scikit-learn), and based on it I have created a Flask web service hosted on a Windows IIS server.

What is the best practice for loading the model? Should I load it once when the API starts, or should it be loaded each time a request comes in?

Case 1

from flask import Flask
import joblib

app = Flask(__name__)


# load the model once at startup
MODELS = joblib.load(model_file)


# endpoints
@app.route("/predictions", methods=["GET", "POST"])
def predictions():
    # some code

Case 2

from flask import Flask
import joblib

app = Flask(__name__)


# endpoints
@app.route("/predictions", methods=["GET", "POST"])
def predictions():
    # load the model on every request
    model = joblib.load(model_file)

   
Sociopath

1 Answer


If you load the model every time a request comes in, you increase the latency of every request. Usually you want to minimize request latency, so it is better to load the model once at startup and just use it when fulfilling each request. This approach also avoids repeating the same loading work unnecessarily.
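As a minimal sketch of the startup-loading approach (your Case 1), assuming the model is serialized to a file named model.joblib and the endpoint receives feature vectors as JSON (both the path and the payload format here are placeholders, not something from your setup):

from flask import Flask, request, jsonify
import joblib

app = Flask(__name__)

# Load the model once, when the worker process starts.
# "model.joblib" is a placeholder path for the serialized scikit-learn model.
MODEL = joblib.load("model.joblib")


@app.route("/predictions", methods=["POST"])
def predictions():
    # Expecting a JSON body like {"instances": [[...], [...]]}
    payload = request.get_json(force=True)
    preds = MODEL.predict(payload["instances"])
    return jsonify({"predictions": preds.tolist()})


if __name__ == "__main__":
    app.run()

With a typical IIS setup, each worker process would load the model once when it imports the module, so the loading cost is paid per process rather than per request.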

noe