Short Answer:
Both data drift and concept drift are important, and so is detecting them. If either is detected, you need to re-train. They are different things, and you cannot say that one matters more than the other.
Model performance drift = Data drift and/or Concept drift
Long Answer:
First I need to explain concept drift vs. data drift. Both are part of what we call model performance drift; see this link:
https://datatron.com/what-is-model-drift/#:~:text=Concept%20drift%20is%20a%20type,s)%20change(s)
Let us explain the two drifts with an example. Say you train a model to detect defects on some product, and the inputs are images captured under a specific lighting setup.
Data drift: anything that can change the data-generating distribution. If a data point is (X, Y), where X is the image and Y is the label, then P(X) is the distribution generating these images. Any of the following changes can shift this distribution (note that the actual labels do not change); a minimal detection sketch follows the list:
- changing the light settings
- an issue with the camera
- changing the material supplier
- new products / new patterns
- new defect signatures
- mechanical problems, etc.
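Here is a minimal sketch of what a data drift check could look like. It assumes we monitor a single scalar feature extracted from each image (e.g., mean brightness); the names and numbers are illustrative, not from the linked article:

```python
# Minimal sketch of data drift detection on one monitored feature.
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)

# Reference: brightness values computed from the training images.
training_brightness = rng.normal(loc=0.55, scale=0.05, size=5000)

# Recent production batch: the lighting was changed, shifting the distribution.
recent_brightness = rng.normal(loc=0.48, scale=0.05, size=500)

# Two-sample Kolmogorov-Smirnov test: do both samples come from the same P(X)?
stat, p_value = ks_2samp(training_brightness, recent_brightness)

if p_value < 0.01:
    print(f"Data drift flagged (KS statistic={stat:.3f}, p={p_value:.2e})")
else:
    print("No significant drift detected in this batch")
```

In practice you would run a check like this per feature (or on an embedding), on every new batch of production data.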
Now let us think about concept drift.
In this case the change is in Y rather than X: for some reason, an image that we used to label as Y1 is now labeled Y2. Can this happen? Yes. In industry it is more common than in academia for the concept to change: a team can decide to aim for higher quality and start counting new things as defects.
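To put the two side by side, here is one common way to formalize them (the notation is my addition, not from the linked article):

$$
P(X, Y) = P(X)\,P(Y \mid X)
$$

Data drift is a change in P(X) while P(Y | X) stays the same; concept drift is a change in P(Y | X), i.e., the same image can now map to a different label.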
Drift detection is just a mechanism to tell you that something is wrong; think of it as a red flag that the current data distribution no longer matches your training data distribution.
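Another form of red flag, when some ground-truth labels come back (for example from manual review), is a rising error rate. A minimal sketch of my own, assuming a baseline error rate measured at deployment time and an illustrative 3-standard-deviation threshold:

```python
# Minimal sketch: flag drift when the recent error rate exceeds the baseline.
from collections import deque

class ErrorRateMonitor:
    def __init__(self, baseline_error, window=200, n_std=3.0):
        self.baseline = baseline_error          # error rate at deployment time
        self.window = deque(maxlen=window)      # recent 0/1 error outcomes
        self.n_std = n_std

    def update(self, prediction, true_label):
        self.window.append(int(prediction != true_label))
        if len(self.window) < self.window.maxlen:
            return False                        # not enough evidence yet
        rate = sum(self.window) / len(self.window)
        # Standard deviation of a binomial proportion at the baseline rate.
        std = (self.baseline * (1 - self.baseline) / len(self.window)) ** 0.5
        return rate > self.baseline + self.n_std * std  # red flag?

monitor = ErrorRateMonitor(baseline_error=0.02)
# For every reviewed prediction:
#   drift_flag = monitor.update(model_prediction, reviewer_label)
```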
Why do we care?
Drift is the enemy of machine learning models. There is no guarantee how your model will behave under drift (predictions are not guaranteed to come with low confidence); the behavior can be completely random!
What to do?
We need to protect our models in the production environment. We can do two things:
1- Auditing: a percentage of the decisions made by the model goes to manual review for double-checking. This reduces the automation percentage (the monetary gain from the model).
2- Protecting the model from out-of-distribution data: these are algorithms that run on the model's inputs. If a data point is identified as an outlier / anomaly / out of distribution, it is sent to manual review or is not processed by the model (see the sketch below).
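A minimal sketch of option 2, assuming each input can be summarized as a feature vector (an embedding or simple image statistics). IsolationForest is just one common outlier detector; the contamination value and the routing function are illustrative assumptions:

```python
# Minimal sketch: reject out-of-distribution inputs before they reach the model.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(1)
train_features = rng.normal(size=(5000, 16))      # features of the training data

detector = IsolationForest(contamination=0.01, random_state=0)
detector.fit(train_features)

def route(sample_features):
    """Send out-of-distribution samples to manual review instead of the model."""
    if detector.predict(sample_features.reshape(1, -1))[0] == -1:
        return "manual_review"
    return "model"

print(route(train_features[0]))       # in-distribution -> "model"
print(route(np.full(16, 8.0)))        # far from training data -> "manual_review"
```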
These two techniques work when you have the luxury of manual processing to back up your model's automation. If that is not the case, you will need to collect a lot of data up front to make sure your model is robust, and you can add some pre-processing to remove noise or sources of drift, but there is not much more you can do beyond that.
If you look at papers from CVPR and NIPS over the last three to four years, you will see great attention given to the idea of out-of-distribution detection. Today it is the main obstacle preventing industry from making greater use of machine learning.