Short Answer:
Both data drift and concept drift are important, and so is detecting them. If either is detected, you need to re-train. They are different things, and you cannot say that one matters more than the other.
Model performance drift = Data drift and/or Concept drift
Long Answer:
First I need to explain concept drift vs. data drift. Both are part of what we call model performance drift; see this link:
https://datatron.com/what-is-model-drift/#:~:text=Concept%20drift%20is%20a%20type,s)%20change(s)
Let us explain the two drifts with an example. Say you train a model to detect defects on some product, and the inputs are images captured under a specific lighting setup.
Data drift: anything that can change the data-generating distribution. If a data point is (X, Y), where X is the image and Y is the label, then P(X) is the distribution generating these images. Any of the following changes can shift this distribution (note that the actual labels do not change); a minimal detection sketch follows the list:
- changing the light settings
- an issue with the camera
- changing the material supplier
- new products / new patterns
- new defect signatures
- mechanical problems, etc.
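Here is a minimal sketch of what a data drift check could look like. It assumes we monitor a single scalar feature extracted from each image (e.g., mean brightness); the names and numbers are illustrative, not from the linked article:

```python
# Minimal sketch of data drift detection on one monitored feature.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)

# Reference: brightness values computed from the training images.
training_brightness = rng.normal(loc=0.55, scale=0.05, size=5000)

# Recent production batch: the lighting was changed, shifting the distribution.
recent_brightness = rng.normal(loc=0.48, scale=0.05, size=500)

# Two-sample Kolmogorov-Smirnov test: do both samples come from the same P(X)?
stat, p_value = ks_2samp(training_brightness, recent_brightness)

if p_value < 0.01:
    print(f"Data drift flagged (KS statistic={stat:.3f}, p={p_value:.2e})")
else:
    print("No significant drift detected in this batch")
```

In practice you would run a check like this per feature (or on an embedding), on every new batch of production data.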
Now let us think about concept drift.
In this case the change is in Y rather than X: for some reason, an image that we used to label as Y1 is now labeled Y2. Can this happen? Yes. In industry it is more common than in academia for the concept to change: a team can decide to aim for higher quality and start counting new things as defects.
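To put the two side by side, here is one common way to formalize them (the notation is my addition, not from the linked article):

$$
P(X, Y) = P(X)\,P(Y \mid X)
$$

Data drift is a change in P(X) while P(Y | X) stays the same; concept drift is a change in P(Y | X), i.e., the same image can now map to a different label.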
Drift detection is just a mechanism to tell you that something is wrong; think of it as a red flag that the current data distribution no longer matches your training data distribution.
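Another form of red flag, when some ground-truth labels come back (for example from manual review), is a rising error rate. A minimal sketch of my own, assuming a baseline error rate measured at deployment time and an illustrative 3-standard-deviation threshold:

```python
# Minimal sketch: flag drift when the recent error rate exceeds the baseline.
from collections import deque

class ErrorRateMonitor:
    def __init__(self, baseline_error, window=200, n_std=3.0):
        self.baseline = baseline_error          # error rate at deployment time
        self.window = deque(maxlen=window)      # recent 0/1 error outcomes
        self.n_std = n_std

    def update(self, prediction, true_label):
        self.window.append(int(prediction != true_label))
        if len(self.window) < self.window.maxlen:
            return False                        # not enough evidence yet
        rate = sum(self.window) / len(self.window)
        # Standard deviation of a binomial proportion at the baseline rate.
        std = (self.baseline * (1 - self.baseline) / len(self.window)) ** 0.5
        return rate > self.baseline + self.n_std * std  # red flag?

monitor = ErrorRateMonitor(baseline_error=0.02)
# For every reviewed prediction:
#   drift_flag = monitor.update(model_prediction, reviewer_label)
```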
Why do we care?
Drift is the enemy of machine learning models. There is no guarantee how your model will behave under drift (predictions are not guaranteed to come with low confidence); the behavior can be completely random!
What to do?
We need to protect our models in the production environment. We can do two things:
1- Auditing: a percentage of the decisions made by the model goes to manual review for double-checking. This reduces the automation percentage (the monetary gain from the model).
2- Protecting the model from out-of-distribution data: these are algorithms that run on the model's inputs. If a data point is identified as an outlier / anomaly / out of distribution, it is sent to manual review or is not processed by the model (see the sketch below).
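A minimal sketch of option 2, assuming each input can be summarized as a feature vector (an embedding or simple image statistics). IsolationForest is just one common outlier detector; the contamination value and the routing function are illustrative assumptions:

```python
# Minimal sketch: reject out-of-distribution inputs before they reach the model.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(1)
train_features = rng.normal(size=(5000, 16))      # features of the training data

detector = IsolationForest(contamination=0.01, random_state=0)
detector.fit(train_features)

def route(sample_features):
    """Send out-of-distribution samples to manual review instead of the model."""
    if detector.predict(sample_features.reshape(1, -1))[0] == -1:
        return "manual_review"
    return "model"

print(route(train_features[0]))       # in-distribution -> "model"
print(route(np.full(16, 8.0)))        # far from training data -> "manual_review"
```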
These two techniques work when you have the luxury of manual processing to back up your model's automation. If that is not the case, you will need to collect a lot of data up front to make sure your model is robust, and you can add some pre-processing to remove noise or sources of drift, but there is not much more you can do beyond that.
If you look at papers from CVPR and NIPS over the last three to four years, you will see great attention given to the idea of out-of-distribution detection. Today it is the main obstacle preventing industry from making greater use of machine learning.