While implementing word2vec using gensim by following a few tutorials online, one thing I couldn't understand is why the word vectors are averaged once the model is trained. A few example links are below.
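To make the question concrete, here is a minimal sketch of the pattern I mean, as I understand it from the tutorials (assuming gensim 4.x; the toy sentences and the `average_vector` helper are my own, just for illustration):

```python
import numpy as np
from gensim.models import Word2Vec

# Hypothetical toy corpus, only to make the snippet runnable.
sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "ate", "my", "homework"],
]

# Train a small word2vec model.
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, seed=42)

def average_vector(tokens, wv):
    """Average the vectors of the tokens that are in the vocabulary."""
    vectors = [wv[t] for t in tokens if t in wv]
    if not vectors:
        return np.zeros(wv.vector_size)
    return np.mean(vectors, axis=0)

# One fixed-size vector for the whole sentence, regardless of its length.
sent_vec = average_vector(["the", "cat", "sat"], model.wv)
print(sent_vec.shape)  # (50,)
```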
My questions are:
- Is it just to produce a single fixed-size vector instead of one vector per word, or is it done to increase accuracy, or is there some other reason behind it?
- Is it mandatory to take the average of the vectors, or are there alternatives?
I have gone through the original word2vec paper, but it doesn't give a clear explanation of this.