What's the intuition behind the hidden states of RNN/LSTM? Are they similar to the hidden states of HMM (Hidden Markov Model)?
2 Answers
Just to add: the hidden state can be described as the working memory of the recurrent network, carrying information forward from the immediately preceding timesteps/events. This working memory is overwritten at every step (in a plain RNN with no gating to control what is kept; the LSTM adds gates for exactly that purpose), and it is present in both RNNs and LSTMs, as the sketch below illustrates.
Given the latter, I can appreciate the analogy with the Markovian framework, in a broader sense. Feel free to check my answer to a similar question for more information on the hidden-state and cell-state architectures in sequence models.
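To make the "working memory" picture concrete, here is a minimal sketch of the vanilla (tanh) RNN update in NumPy. The weight names `W_xh`, `W_hh` and all the sizes are illustrative choices of mine, not from any particular library:

```python
import numpy as np

rng = np.random.default_rng(0)
input_size, hidden_size = 4, 8

W_xh = rng.normal(scale=0.1, size=(hidden_size, input_size))   # input-to-hidden weights
W_hh = rng.normal(scale=0.1, size=(hidden_size, hidden_size))  # hidden-to-hidden weights
b = np.zeros(hidden_size)

h = np.zeros(hidden_size)               # the "working memory", initially empty
xs = rng.normal(size=(5, input_size))   # a toy sequence of 5 timesteps

for t, x in enumerate(xs):
    # The entire hidden state is recomputed (overwritten) at every step;
    # the only route for past information is through the previous h.
    h = np.tanh(W_xh @ x + W_hh @ h + b)
    print(f"step {t}: h[:3] = {h[:3]}")
```

Nothing in this update rule selectively preserves old contents of `h`; that lack of control is what LSTM gating was designed to address.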
I personally don't think they are comparable to the hidden state of a Markov model. One key difference is that in an HMM the states are discrete, and you can often explain to someone what a given state means, whereas in an RNN/LSTM the hidden state is a learned vector of continuous activations that you generally cannot interpret.
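As a toy illustration of that interpretability gap (the state names and probabilities here are entirely made up):

```python
# HMM: discrete, human-readable states with explicit, inspectable dynamics.
hmm_states = ["rainy", "sunny"]
transition = {
    "rainy": {"rainy": 0.7, "sunny": 0.3},
    "sunny": {"rainy": 0.4, "sunny": 0.6},
}

# RNN/LSTM: the "state" is an opaque vector of learned activations;
# no individual entry has an assigned meaning.
rnn_state = [0.13, -0.52, 0.07, 0.91]
```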
The closest analogy for the hidden state of an RNN/LSTM is the output of an intermediate layer of a fully-connected neural network, only applied step by step to time-series data.
And the larger the hidden state, the more information about the past it can retain.
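As a quick sketch of both points (the library choice and the specific sizes are my own, for illustration): in PyTorch, the per-step outputs of an LSTM play the role of intermediate-layer activations, and `hidden_size` sets how large the state vector is.

```python
import torch
import torch.nn as nn

seq_len, batch, input_size, hidden_size = 10, 1, 4, 32

lstm = nn.LSTM(input_size=input_size, hidden_size=hidden_size)
x = torch.randn(seq_len, batch, input_size)  # a toy time series

out, (h_n, c_n) = lstm(x)

# `out` holds the hidden state at every timestep, analogous to the
# activations of an intermediate fully-connected layer, one per step.
print(out.shape)  # torch.Size([10, 1, 32])

# `h_n` is the final hidden state; increasing `hidden_size` gives the
# network a larger vector in which to retain information about the past.
print(h_n.shape)  # torch.Size([1, 1, 32])
```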