I want to train an unsupervised fastText model on my text dataset. However, the train_unsupervised method has many hyperparameters:
lr # learning rate [0.05]
dim # size of word vectors [100]
ws # size of the context window [5]
epoch # number of epochs [5]
minCount # minimal number of word occurrences [5]
minn # min length of char ngram [3]
maxn # max length of char ngram [6]
neg # number of negatives sampled [5]
wordNgrams # max length of word ngram [1]
thread # number of threads [number of cpus]
lrUpdateRate # change the rate of updates for the learning rate [100]
Some of them influence the quality of the embeddings dramatically (dim, lr, minn and maxn especially). However, I haven't found any method for tuning these hyperparameters. How could I do that? Also, how might properties of my dataset (mean sentence length, for example) influence the choice of some of them?
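For context, the only approach I can think of is a brute-force grid search like the sketch below. It assumes a corpus file path and a score_fn of my own (e.g. a word-similarity benchmark or a downstream task); both are placeholders, not part of fastText itself:

```python
import itertools

# Hypothetical search space over the parameters that seem to matter most
grid = {
    "dim": [50, 100, 300],
    "lr": [0.01, 0.05, 0.1],
    "minn": [2, 3],
    "maxn": [5, 6],
}

def candidates(grid):
    """Yield every hyperparameter combination as a dict."""
    keys = sorted(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))

def search(corpus_path, score_fn):
    """Train one model per combination and keep the best-scoring one.

    score_fn is a placeholder for my own extrinsic evaluation.
    """
    import fasttext  # pip install fasttext
    best_params, best_score = None, float("-inf")
    for params in candidates(grid):
        model = fasttext.train_unsupervised(corpus_path, **params)
        score = score_fn(model)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# The grid above already yields 3 * 3 * 2 * 2 = 36 training runs
print(sum(1 for _ in candidates(grid)))  # 36
```

This obviously gets expensive fast (36 full training runs here), which is why I'm asking whether there is a smarter method.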