Highest Voted 'fasttext' Questions - Data Science Stack Exchange

3

votes

0 answers

Explain FastText model using SHAP values

I have trained a fastText model and a fully connected network built on its embeddings. I figured out how to use LIME on it: a complete example can be found in Natural Language Processing Is Fun Part 3: Explaining Model Predictions. The idea is clear -…

asked Jun 18 '20 at 14:09

Mikhail_Sam

131
4

3

votes

1 answer

FastText Model Explained

I was reading the FastText paper and I have a few questions about the model used for classification. Since I am not from an NLP background, I am unfamiliar with some of the jargon. In the figure, what exactly are the $x_i$? I am not sure what $N$…

nlp ngrams fasttext

asked May 28 '20 at 11:28

Black Jack 21

173
6

1

vote

0 answers

When are subword ngrams trained in fasttext? (Enriching Word Vectors with Subword Information)

When is the training for subword n-grams done? Is it done simultaneously with the training of the word representations, or at the end, after the word representations are created? fastText implements this paper, where word representations are…

machine-learning nlp word-embeddings fasttext

asked Mar 24 '21 at 07:36

Sid

667
1
5
14
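For the question "When are subword ngrams trained in fasttext?", a minimal sketch of how a word can be split into character n-grams, using the boundary markers `<` and `>` described in Enriching Word Vectors with Subword Information. The function name `char_ngrams` and its parameters are hypothetical and chosen for illustration; this is not fastText's own code. Because the paper represents a word as the sum of its n-gram vectors, a gradient step on a word would also touch the vectors of every n-gram listed here.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Return the character n-grams of `word`, with '<' and '>' as boundary markers.

    Hypothetical helper for illustration only; not part of the fastText API.
    """
    wrapped = f"<{word}>"
    grams = []
    for n in range(min_n, max_n + 1):
        # Slide a window of width n over the wrapped word.
        for start in range(len(wrapped) - n + 1):
            grams.append(wrapped[start:start + n])
    return grams


# Trigrams only, matching the "where" example given in the paper.
trigrams = char_ngrams("where", min_n=3, max_n=3)
print(trigrams)
```

The paper additionally keeps the whole wrapped word (here `<where>`) as its own sequence, which this sketch leaves out.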

1

vote

1 answer

Pre-trained models for finding similar word n-grams

Are there any pre-trained models for finding similar word n-grams, where n>1? FastText, for instance, seems to work only on unigrams: from pyfasttext import FastText model = FastText('cc.en.300.bin') model.nearest_neighbors('dog', k=2000) [('dogs',…

nlp fasttext

asked Jan 05 '21 at 09:34

dzieciou

697
1
6
15

1

vote

1 answer

Initializing weights that are a pointwise product of multiple variables

In two-layer perceptrons that slide across words of text, such as word2vec and fastText, hidden layer weights may be a product of two random variables such as positional embeddings and word embeddings (Mikolov et al. 2017, Section 2.2): $$v_c =…

nlp word-embeddings word2vec weight-initialization fasttext

asked Sep 23 '20 at 13:18

Witiko

111
2

1

vote

0 answers

Extracting vectors of FastText own model to use it on a NN

I have trained my own fastText model using the pretrained English model available on their website, with the following code: from gensim.models.fasttext import load_facebook_model mod =…

lstm word-embeddings word2vec gensim fasttext

asked Jun 10 '20 at 17:21

IMB

111
3

1

vote

0 answers

Removing duplicate records before training

I am currently working on a project classifying text into classes. The specific problem is classifying job titles into various industry codes. For example "McDonalds Employee" might get classified to 11203 (there are a few hundred classes in the…

nlp overfitting fasttext

asked May 14 '20 at 18:47

astel

347
1
5

1

vote

0 answers

How does FastText create n-gram word features?

In the paper Bag of Tricks for Efficient Text Classification, the authors talk about creating word n-gram features, and in their experiments they show results for both unigrams and bigrams. As far as I understand, FastText is simply a word embedding based on…

ngrams fasttext

asked Mar 09 '23 at 14:44

CutePoison

450
2
8
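For the question "How does FastText create n-gram word features?", a minimal sketch of one common way to build word n-gram features: list every contiguous n-gram, then map each to an index in a fixed-size table by hashing. The names `word_ngrams` and `bucket_index` are hypothetical, and `zlib.crc32` is only a stand-in; the hash function that fastText itself uses is not reproduced here.

```python
import zlib


def word_ngrams(tokens, max_n):
    """Return all contiguous word n-grams of length 1..max_n as strings."""
    grams = []
    for size in range(1, max_n + 1):
        for start in range(len(tokens) - size + 1):
            grams.append(" ".join(tokens[start:start + size]))
    return grams


def bucket_index(ngram, num_buckets):
    """Map an n-gram to a row of a fixed-size embedding table.

    crc32 is an assumption made for this sketch, not fastText's hash.
    """
    return zlib.crc32(ngram.encode("utf-8")) % num_buckets


tokens = "the cat sat".split()
features = word_ngrams(tokens, max_n=2)
indices = [bucket_index(g, num_buckets=1000) for g in features]
print(features)
```

Hashing keeps the table size fixed no matter how many distinct bigrams the corpus contains, at the cost of occasional collisions between unrelated n-grams.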

1

vote

1 answer

How to fine-tune hyperparameters of unsupervised training in fasttext?

I want to train a fasttext unsupervised model on my text dataset. However, there are many hyperparameters in the train_unsupervised method: lr # learning rate [0.05] dim # size of word vectors [100] ws #…

machine-learning nlp word-embeddings hyperparameter-tuning fasttext

asked Sep 06 '22 at 13:52

Ir8_mind

183
4

1

vote

0 answers

Should I use Pad Sequence when using Word Vectors?

I have an unbalanced text data set and want to use word vectors to embed the words. When should I use pad sequence: before or after applying the word vectors? I tried using pad sequence after the word vectors, but my model accuracy was low. When I use the pad sequence…

python neural-network class-imbalance word-embeddings fasttext

asked Nov 21 '21 at 16:03

grace

13
4

0

votes

0 answers

Finetuning fasttext with unlabeled text corpus

I am training a classifier which is supposed to take the name of a product as input. For this purpose, I want to finetune a pre-existing fasttext model on my article names. My code looks like this: import fasttext # Load the pre-trained…

nlp word-embeddings finetuning fasttext

asked Feb 26 '23 at 18:17

christallclear

11
1

0

votes

0 answers

Encountered a problem while installing [ FastText ] library on MacOS

I have been trying to install the "FastText" library on macOS but I keep encountering a runtime error. System: macOS 13.0.1 (22A400), Python version: 3.10, IDE: PyCharm. I tried installing it from PyCharm but it did not work, then I tried using…

nlp error-handling fasttext

asked Dec 28 '22 at 13:46

Mira

1

0

votes

0 answers

Models to rank sentences

I am working with tasks performed in certain occupations and am trying to find out the importance of these tasks within each occupation. My approach was to use tf-idf followed by TextRank, and word2vec followed by TextRank. I would like to know if you guys have…

word2vec tfidf ranking fasttext

asked Oct 04 '22 at 19:43

Benimaru cpt

1
1

0

votes

1 answer

Is it normal for a model to perform worse with the use of word embeddings?

I have a multiclass text classification problem and I've tried different solutions and models, but I was not satisfied with the results. So I've decided to use GloVe (Global Vectors for Word Representation), but somehow all the models performed…

machine-learning nlp word-embeddings fasttext

asked Aug 17 '22 at 16:45

HasanArcas

13
3

0

votes

1 answer

Training fasttext on your own corpus

I want to train fasttext on my own corpus. However, I have a small question before continuing. Do I need each sentence as a separate item in the corpus, or can I have many sentences in one item? For example, I have this DataFrame: text …

python tensorflow word-embeddings gensim fasttext

asked Oct 15 '21 at 10:51

BlueMango

113
3

Questions tagged [fasttext]