Proper loss function for regression so that predictions do not all lie on one side of the true values

Question

I'm working on a prediction task using machine learning. First I run a regression, then I use the predicted values to assign each sample a class.

I used MSE as the loss function. However, my predictions are generally smaller than the true values. MSE would be the same if the predictions were instead larger than the true values by the same amounts, and it would also be the same if some predictions were larger and some were smaller. I clearly prefer the last situation: because the true values are symmetric around y=0, it would yield the same labels as the true values.
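To make the symmetry concrete, here is a minimal sketch with made-up numbers (the arrays and the 0.5 offset are illustrative assumptions, not data from my task). It shows that MSE cannot tell the three situations apart, while the mean signed error can.

```python
import numpy as np

# Hypothetical true values, symmetric around 0 (illustrative only).
y_true = np.array([-2.0, -1.0, 1.0, 2.0])
offset = 0.5  # assumed size of each prediction error

# Three error patterns with identical magnitudes but different signs.
predictions = {
    "under": y_true - offset,                                 # all too small
    "over": y_true + offset,                                  # all too large
    "mixed": y_true + offset * np.array([1.0, -1.0, 1.0, -1.0]),
}

def mse(y, p):
    """Mean squared error: ignores the sign of each error."""
    return float(np.mean((p - y) ** 2))

def mean_signed_error(y, p):
    """Average of (prediction - truth): negative means under-prediction."""
    return float(np.mean(p - y))

results = {
    name: (mse(y_true, p), mean_signed_error(y_true, p))
    for name, p in predictions.items()
}

for name, (m, b) in results.items():
    print(f"{name:>5}: MSE={m:.3f}, mean signed error={b:+.3f}")
```

All three cases share the same MSE, so only a sign-aware quantity such as the mean signed error separates them.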

So how should I set up my loss function to achieve this?

score 0 · Answer 1 · answered Jul 06 '22 at 20:47

An ML model needs representative training data in order to match reality.

But if your training data differs systematically in scale from the real data, why not multiply the training data by a correction factor (e.g. 0.8) to bring it in line?

In this case, your model should work better in the real situation, but be careful to choose the right value.

Note that you can also draw this value at random (e.g. between 0.7 and 1.1) from a chosen distribution to simulate the real data.

In the end, your modified training data and your real data should have very similar summary statistics (standard deviation, mean, minimum, etc.).
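A minimal sketch of this idea, under stated assumptions: the data below is synthetic, the uniform distribution for the random factor is my own choice, and `summarize` is a hypothetical helper. Only the 0.8 factor and the 0.7 to 1.1 range come from the suggestion above.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins (assumption): real targets ~ N(0, 1),
# training targets with a spread that is 1.25 times too large.
y_real = rng.normal(loc=0.0, scale=1.0, size=10_000)
y_train = rng.normal(loc=0.0, scale=1.25, size=10_000)

# Option 1: one constant correction factor.
scale = 0.8
y_train_const = y_train * scale

# Option 2: a random factor per sample, drawn uniformly from [0.7, 1.1).
factors = rng.uniform(0.7, 1.1, size=y_train.shape)
y_train_random = y_train * factors

def summarize(values):
    """Return the summary statistics to compare between datasets."""
    return {
        "mean": float(np.mean(values)),
        "std": float(np.std(values)),
        "min": float(np.min(values)),
        "max": float(np.max(values)),
    }

stats = {
    "real": summarize(y_real),
    "train (raw)": summarize(y_train),
    "train (x0.8)": summarize(y_train_const),
    "train (random factor)": summarize(y_train_random),
}

for name, s in stats.items():
    print(name, {k: round(v, 3) for k, v in s.items()})
```

Comparing the printed rows shows whether a given factor brings the training statistics closer to the real ones. Note that a multiplicative factor changes the spread but cannot correct a constant additive offset.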
