Are there any rules of thumb (or actual rules) for the minimum, maximum, and "reasonable" number of LSTM cells I should use? Specifically, I am referring to BasicLSTMCell from TensorFlow and its num_units parameter.
Please assume that I have a classification problem defined by:
- t - number of time steps
- n - length of the input vector at each time step
- m - length of the output vector (number of classes)
- i - number of training examples
Is it true, for example, that the number of training examples should be larger than

4*((n+1)*m + m*m)*c

where c is the number of cells? I based this on How to calculate the number of parameters of an LSTM network? As I understand it, that formula gives the total number of trainable parameters, which should be less than the number of training examples.
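To make the comparison concrete, here is a minimal sketch (plain Python, hypothetical helper names) of the count my formula gives versus the count I believe BasicLSTMCell actually allocates, namely a kernel of shape [n + c, 4*c] plus a bias of shape [4*c]:

```python
def params_from_question(n, m, c):
    """The bound from my formula: 4*((n+1)*m + m*m)*c."""
    return 4 * ((n + 1) * m + m * m) * c

def params_basic_lstm_cell(n, c):
    """What I believe BasicLSTMCell allocates:
    kernel [n + c, 4*c] plus bias [4*c], i.e. 4*c*(n + c + 1)."""
    return (n + c) * 4 * c + 4 * c

# Example sizes (made up for illustration)
n, m, c, i = 10, 3, 20, 5000
print(params_from_question(n, m, c))      # 3360
print(params_basic_lstm_cell(n, c))       # 2480
print(i > params_from_question(n, m, c))  # True for these sizes
```

So part of my question is whether the bound should use the first count or the second.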