I'm trying to understand what prior probability shift (label drift) in data means.
If I understand it correctly, it means that the distribution of labels in the training dataset differs from the distribution of labels in the production environment, while the class-conditional distribution of the features stays the same. This difference causes an ML model trained on such data and deployed to production to make poor predictions.
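In symbols, my understanding of prior shift is (writing $P_{\text{train}}$ and $P_{\text{prod}}$ for the training and production distributions):

$$P_{\text{train}}(y) \neq P_{\text{prod}}(y), \qquad P_{\text{train}}(x \mid y) = P_{\text{prod}}(x \mid y)$$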
That makes sense.
But then I remembered that one of the techniques for training an ML model on an imbalanced dataset is oversampling the minority class or undersampling the majority class (i.e., changing the label distribution in the training dataset). But these techniques cause the distribution of labels in the training dataset to differ from the one in the production environment (which stays imbalanced). That sounds exactly like a label drift setting!
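To make the tension concrete, here is a minimal sketch (the 1:9 class ratio and the plain-Python resampling are just for illustration) of how oversampling moves the training label prior away from the production one:

```python
import random
from collections import Counter

random.seed(0)

# Hypothetical production-like labels: 10% positives (1), 90% negatives (0)
labels = [1] * 100 + [0] * 900
print(Counter(labels), "P(y=1) =", sum(labels) / len(labels))  # P(y=1) = 0.1

# Oversample the minority class until the two classes are balanced
minority = [y for y in labels if y == 1]
train = labels + random.choices(minority, k=800)
print(Counter(train), "P(y=1) =", sum(train) / len(train))  # P(y=1) = 0.5
```

So after oversampling, the model is trained with P(y=1) = 0.5 even though production still has P(y=1) = 0.1, which looks like exactly the mismatch I described above.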
So is my understanding of prior shift in data wrong (most probably yes)?
Are undersampling/oversampling techniques for imbalanced datasets flawed (I don't think so)?
Am I missing something else?
Thank you for the explanation.
Tomas