I trained word embeddings with 300 dimensions. Now, I would like to have word embeddings with 50 dimensions: is it better to retrain the word embeddings with 50 dimensions, or can I use some dimensionality reduction method to scale the word embeddings with 300 dimensions down to 50 dimensions?
- What method of word embedding are you using? – lollercoaster Jul 28 '15 at 20:23
- @lollercoaster word2vec and GloVe. – Franck Dernoncourt Jul 28 '15 at 20:26
2 Answers
There is a paper on this subject: "Simple and Effective Dimensionality Reduction for Word Embeddings" by Vikas Raunak. You can read it here, and you can find the implementation here. In my opinion it works quite well.
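For reference, here is a minimal sketch of a plain-PCA reduction from 300 to 50 dimensions (the paper builds on PCA with an additional post-processing step, which is not shown here); the `embeddings` matrix below is a random placeholder for real word vectors:

```python
import numpy as np
from sklearn.decomposition import PCA

# Placeholder: replace with your real (vocab_size, 300) word-vector matrix,
# e.g. loaded from a word2vec or GloVe model.
embeddings = np.random.rand(10000, 300)

# PCA centers the data and projects it onto the top 50 principal components.
pca = PCA(n_components=50)
reduced = pca.fit_transform(embeddings)  # shape: (10000, 50)
```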
t-distributed stochastic neighbor embedding (t-SNE) is often used for dimensionality reduction of word embeddings, since it tries to preserve the local neighborhood structure of the vectors. Most often t-SNE is used for visualization, reducing the dimensions to 2 or 3, but it can also reduce them to 50.
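As a rough sketch, assuming scikit-learn and a placeholder embedding matrix: t-SNE can be asked for 50 output dimensions, but scikit-learn's Barnes-Hut implementation only supports up to 3 components, so the exact (and much slower) method is needed here:

```python
import numpy as np
from sklearn.manifold import TSNE

# Placeholder: replace with your real (vocab_size, 300) word-vector matrix.
embeddings = np.random.rand(1000, 300)

# Barnes-Hut t-SNE is limited to n_components <= 3, so use the exact method
# for a 50-dimensional output (O(n^2), slow for large vocabularies).
tsne = TSNE(n_components=50, method="exact", init="random", perplexity=30)
reduced = tsne.fit_transform(embeddings)  # shape: (1000, 50)
```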