Are there any guidelines for choosing the embedding dimension when training a custom Word2Vec embedding? I know that the default is 100, and that seems as good as any other value. But I'm wondering whether there is any data out there that offers heuristics for when you should deviate from it.

I can't imagine that there is much benefit in a smaller size, but surely there is some value in a larger one? Or maybe it's related to the size of my vocabulary? I have a relatively small vocabulary for my latest project (7,000 words), so maybe there is some ratio or proportion that I can apply?
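
To make the vocabulary question concrete, here is a minimal sketch of how the number of trainable weights grows with the dimension. It assumes the usual layout of two weight matrices (input and output embeddings), each of shape vocabulary size by dimension, so the count is only approximate. The helper name `word2vec_param_count` is made up for illustration and is not part of any library.

```python
# Rough weight count for a Word2Vec-style model.
# Assumption: two weight matrices (input and output embeddings),
# each of shape (vocab_size, dim). Exact counts vary by training setup.

def word2vec_param_count(vocab_size: int, dim: int) -> int:
    """Return the approximate number of trainable weights."""
    return 2 * vocab_size * dim


VOCAB_SIZE = 7_000  # vocabulary size from my project

# Compare a few candidate dimensions around the default of 100.
for dim in (50, 100, 200, 300):
    n = word2vec_param_count(VOCAB_SIZE, dim)
    print(f"dim={dim:>3}: {n:,} weights")
```

Under that assumption the count scales linearly with the dimension, so a larger size means more weights to fit from the same corpus. Is that the trade-off I should be reasoning about?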

desertnaut
I_Play_With_Data
  • You might find this useful: https://datascience.stackexchange.com/questions/51404/word2vec-how-to-choose-the-embedding-size-parameter/51557#51557 – Simon Larsson Jun 25 '19 at 14:24
  • Does this answer your question? [Word2Vec how to choose the embedding size parameter](https://datascience.stackexchange.com/questions/51404/word2vec-how-to-choose-the-embedding-size-parameter) – Brian Spiering Mar 19 '22 at 22:37

0 Answers