
This is my first attempt at machine learning with Keras. Unlike most users, I need to exploit one of the usual disadvantages of such algorithms: overfitting.

I need a function that accepts an angle and a distance to an object and outputs a new angle and a power (imagine aiming at an object with a bow, for example, and the algorithm tells me how far up to raise my arm and how strongly to draw the bow). There is nothing predictive in this configuration: I will generate a large set of 4D (input, output) data covering every possible case. I want the AI to "evaluate" some inputs and return the corresponding outputs for that set of inputs; in other words, to remember the data and output the same numbers.

I need an AI for this task because I also need smooth values between input values it has never seen (limited interpolation).
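
For context, generating the dataset looks roughly like this (a sketch; simulate_shot is a hypothetical stand-in for my actual simulation):

import numpy as np

def simulate_shot(theta, distance):
    # Hypothetical placeholder for the real simulation that maps an
    # input angle/distance to the aiming angle and bow power.
    return theta, min(int(distance) // 14, 2)

# Sample the whole input space densely so the network only ever has to
# remember/interpolate, never extrapolate.
rows = []
for theta in np.linspace(-90, 90, 55):
    for distance in np.linspace(0, 40, 20):
        theta_out, power = simulate_shot(theta, distance)
        rows.append((theta, distance, theta_out, power))
data = np.array(rows)  # shape (1100, 4)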

I have used two models:

from keras.models import Sequential
from keras.layers import Dense

model = Sequential()
model.add(Dense(12, input_dim=2, activation='relu'))  # inputs: theta, distance
model.add(Dense(24, activation='relu'))
model.add(Dense(24, activation='relu'))
model.add(Dense(24, activation='sigmoid'))
model.add(Dense(2, activation='linear'))              # outputs: theta_output, power

Which I now know is incorrect, because sigmoid is used for binary classification. Still, it works! I end up with an MSE of 4. I did not manage to get the same loss with all-ReLU layers in the same number of epochs.
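
For reference, I compile and fit along these lines (Adam is just the optimizer I show here for illustration; the loss is the MSE quoted above, and the batch size comes from note 3 below):

# Split the generated data into inputs and targets
X = data[:, :2]   # theta, distance
y = data[:, 2:]   # theta_output, power

model.compile(optimizer='adam', loss='mse')
model.fit(X, y, epochs=500, batch_size=2)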

from keras.models import Sequential
from keras.layers import Dense, LeakyReLU

model = Sequential()
model.add(Dense(12, input_dim=2, activation='relu'))

model.add(Dense(24, activation='linear'))
model.add(LeakyReLU(alpha=0.1))

model.add(Dense(24, activation='linear'))
model.add(LeakyReLU(alpha=0.1))

model.add(Dense(2, activation='linear'))

This model has a loss of 5.43 after 500 epochs and seems to stabilize there.

Notes:

  1. I am forced to retrain the model frequently because I generate the training data very quickly. I need to stick with a model that will keep reducing its loss across retrainings.
  2. Normalization made my results worse than anything else I tried.
  3. Because my model should be sensitive to very close but different inputs, I tested with batch sizes from 2 to 5.
  4. My dataset currently has 1100 lines.
  5. This model will be trained very closely to the data I give it. It should not matter how many lines I feed it, because I don't want prediction or generalization. I want the AI to output exactly what it has seen for a corresponding set of inputs. That means overfitting the AI to the maximum until it reaches a very low loss; then I can test it on exactly the values it was trained on (see the sketch after this list).
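
In code terms, the success check is simply the loss on the training data itself (a sketch):

# Evaluate on the same data the model was trained on: a very low
# training loss is the goal here, not generalization.
train_loss = model.evaluate(X, y, verbose=0)
print(train_loss)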

Should I continue with the first model? Does it make sense to use the sigmoid layer? How can the second model be improved?

Sample of my data:

theta [-90, 90], distance [0, 40], theta_output [-90, 90], power {0, 1, 2}
0.0,8.696802,0.25688815116882324,1
-1.990075945854187,5.455038,11.56562614440918,1
-56.81309127807617,3.1364963,-53.07550048828125,1
-38.21211242675781,4.718147,-32.30286407470703,1
-33.828956604003906,5.163292,-35.61191940307617,0
-27.64937973022461,6.182574,-25.107540130615234,1
2.8613548278808594,13.922726,-2.3708770275115967,2
-8.812483787536621,14.951225,-3.919188976287842,2
0.0,21.448895,-3.9320743083953857,2
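
(The file is plain CSV, so loading it is straightforward; data.csv is a placeholder name and I assume no header row:)

import numpy as np

data = np.loadtxt('data.csv', delimiter=',')
X = data[:, :2]   # inputs: theta, distance
y = data[:, 2:]   # targets: theta_output, power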
  • Did you try a simpler regression method? With only 1100 instances it's not certain that DL gives you the best value for your money. – Erwan Jun 24 '20 at 23:39

1 Answer


The suggestion in the comment is appropriate.

Still, if you want to try a neural network, here are some suggestions:

- Neither of your models follows the general guidelines.
- Keep ReLU for all hidden layers and a linear activation for the last layer (regression).
- Standardization/normalization must be done before training.
- Add batch normalization layers (see the sketch below).
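
A sketch of those points put together (layer sizes copied from your first model; treat it as a starting point, not a recipe):

from keras.models import Sequential
from keras.layers import Dense, BatchNormalization

model = Sequential()
model.add(Dense(12, input_dim=2, activation='relu'))
model.add(BatchNormalization())
model.add(Dense(24, activation='relu'))
model.add(BatchNormalization())
model.add(Dense(24, activation='relu'))
model.add(Dense(2, activation='linear'))  # linear output for regression
model.compile(optimizer='adam', loss='mse')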

You can also try the following. Since your power output looks like classes [0, 1, 2], try a classification model for power and a regression model for the output angle. For the classification model: last activation softmax, loss categorical_crossentropy, labels one-hot encoded.
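
One way to wire that up as a single multi-output model (a sketch; the layer sizes are arbitrary):

from keras.models import Model
from keras.layers import Input, Dense
from keras.utils import to_categorical

inputs = Input(shape=(2,))                 # theta, distance
h = Dense(24, activation='relu')(inputs)
h = Dense(24, activation='relu')(h)

angle_out = Dense(1, activation='linear', name='angle')(h)   # regression head
power_out = Dense(3, activation='softmax', name='power')(h)  # classes 0, 1, 2

model = Model(inputs=inputs, outputs=[angle_out, power_out])
model.compile(optimizer='adam',
              loss={'angle': 'mse', 'power': 'categorical_crossentropy'})

# The power labels must be one-hot encoded, e.g.:
# y_power = to_categorical(y[:, 1].astype(int), num_classes=3)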

10xAI
  • You mean a multi-output model? I tried to separate the output angle from the output power and to train the first with regression and the second with classification at the same time. I didn't get too far because there were many things to tweak (now I had two models, new learning rates, double the layers, testing whether leaky is better than plain ReLU, etc.). It also seemed to perform worse than my first model, which very quickly reached a low loss. – sergiu reznicencu Jun 25 '20 at 10:33
  • About normalization: how can I recover the outputs and denormalize them? Can I also express the loss value from training (the one Keras shows for every epoch) as a loss between denormalized values (just for reference, not for training)? I know what a loss of 4 means with the current values, but I have no idea what 0.785 means for normalized values. – sergiu reznicencu Jun 25 '20 at 10:36
  • Thank you for the advice. It still bothers me that sigmoid worked so well. Is it common for it to be used in regression? – sergiu reznicencu Jun 25 '20 at 10:44
  • The outputs need not be normalized, only the features. Sigmoid works because it was used only once; it causes issues in a deeper network. – 10xAI Jun 25 '20 at 11:08