
I get some metrics on validation data while training a model, and in my case they are:

(0.25, 0.31, 0.46, 0.57, 0.65, 0.75, 0.77, 0.78, 0.84, 0.84, 0.85, 0.84, 0.84, 0.84, 0.82, 0.8, 0.8, 0.79, 0.78, 0.77, 0.77, 0.77, 0.75, 0.74, 0.73, 0.73, 0.73, 0.73, 0.73, 0.73)

They can be described like this:

[plot: the validation metric rises to a peak (~0.85) and then steadily declines over further iterations]

In my view, the ideal result should look like this: [plot: the validation metric rises and then stays flat at its peak]

Is this a case of overfitting?

Unfortunately, I tried a few times to change the regularization coefficients to avoid overfitting, and to lower the learning rate to slow training down, but the curve was still "convex" (rising then falling).

How can I achieve the ideal result shown above?

I would much appreciate any constructive tips.

joe
  • What model are you training? – Armen Aghajanyan Dec 09 '16 at 08:34
  • A generalized linear model, such as logistic regression – joe Dec 09 '16 at 08:38
  • Size of dataset? Type of validation? Still see same effect in k-fold cross-validation? – Neil Slater Dec 09 '16 at 09:37
  • I used toy data to validate my recommendation system with Spark, so I simplified the whole process. The dataset is small (15K). I tried a few dozen times to adjust the coefficients, and almost every result came out "convex". – joe Dec 09 '16 at 11:04

2 Answers


Yes, what you are seeing is a classic case of overfitting.

You stated that you use a linear model such as logistic regression. To regularize these types of models, L1 and/or L2 regularization is usually applied: L1 regularization adds a penalty proportional to $||W||_1$ to the loss, and L2 adds one proportional to $||W||_2^2$.
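As a minimal sketch of the effect (pure Python, on hypothetical toy data, not the asker's Spark setup): the gradient of an L2 penalty $\lambda ||w||_2^2$ is $2\lambda w$, which is simply added to the data gradient each step and pulls the weight toward zero.

```python
import math

def fit_logistic(xs, ys, lam, lr=0.1, steps=2000):
    """1-D logistic regression (no bias) via gradient descent with an L2 penalty."""
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        # gradient of the average log-loss over the data
        grad = sum((1 / (1 + math.exp(-w * x)) - y) * x
                   for x, y in zip(xs, ys)) / n
        # gradient of the penalty term lam * ||w||_2^2
        grad += 2 * lam * w
        w -= lr * grad
    return w

# toy, linearly separable data (hypothetical)
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 0, 1, 1, 1]

w_free = fit_logistic(xs, ys, lam=0.0)  # no regularization: weight keeps growing
w_reg = fit_logistic(xs, ys, lam=0.5)   # L2 penalty: weight stays small
print(w_free, w_reg)
```

On separable data like this, the unregularized weight grows without bound across iterations, while the penalized weight settles at a small finite value, which is exactly the shrinkage effect the penalty is meant to provide.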

Another method is to alter the labels of the model in a specific way, a regularization method I created (shameless plug). Here is the link to the paper: https://arxiv.org/abs/1609.06693

Hope this helps.


Just cross-validate between the training and validation sets to find the iteration at which overfitting starts, then stop there (early stopping) and ignore all further iterations.
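A minimal early-stopping sketch (assuming a higher metric is better): scan the per-iteration validation metrics from the question and stop once the metric has not improved for `patience` iterations. The `patience` value here is an illustrative choice, not something from the question.

```python
def early_stop(metrics, patience=3):
    """Return (best_iteration, best_metric) under a simple patience rule."""
    best_i, best_m = 0, metrics[0]
    for i, m in enumerate(metrics):
        if m > best_m:
            best_i, best_m = i, m
        elif i - best_i >= patience:
            break  # no improvement for `patience` steps: stop training here
    return best_i, best_m

# the validation metrics from the question, one per iteration
validation_metrics = [
    0.25, 0.31, 0.46, 0.57, 0.65, 0.75, 0.77, 0.78, 0.84, 0.84,
    0.85, 0.84, 0.84, 0.84, 0.82, 0.80, 0.80, 0.79, 0.78, 0.77,
    0.77, 0.77, 0.75, 0.74, 0.73, 0.73, 0.73, 0.73, 0.73, 0.73,
]
print(early_stop(validation_metrics))  # → (10, 0.85): the peak, 0-indexed
```

In practice you would evaluate the validation metric after each training iteration, keep the model weights from the best iteration, and discard everything after it.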

JeeyCi