While implementing word2vec using gensim by following a few tutorials online, one thing I couldn't understand is why the word vectors are averaged once the model is trained. A few example links are below.
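To make the question concrete, here is a minimal sketch of the pattern I mean, as I understand it from the tutorials (assuming gensim 4.x; the toy sentences and the `average_vector` helper are my own, just for illustration):

```python
import numpy as np
from gensim.models import Word2Vec

# Hypothetical toy corpus, only to make the snippet runnable.
sentences = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "ate", "my", "homework"],
]

# Train a small word2vec model.
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, seed=42)

def average_vector(tokens, wv):
    """Average the vectors of the tokens that are in the vocabulary."""
    vectors = [wv[t] for t in tokens if t in wv]
    if not vectors:
        return np.zeros(wv.vector_size)
    return np.mean(vectors, axis=0)

# One fixed-size vector for the whole sentence, regardless of its length.
sent_vec = average_vector(["the", "cat", "sat"], model.wv)
print(sent_vec.shape)  # (50,)
```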
My questions are:
- Is it just to produce a single fixed-size vector instead of one vector per word, or is it done to increase accuracy, or is there some other reason behind it?
- Is it mandatory to take the average of the vectors, or are there alternatives?
I have gone through the original word2vec paper, but it doesn't give a clear explanation of this.