Questions tagged [fasttext]
16 questions
3
votes
0 answers
Explain FastText model using SHAP values
I have trained a fastText model and a fully connected network built on its embeddings. I figured out how to use LIME on it; a complete example can be found in Natural Language Processing Is Fun Part 3: Explaining Model Predictions.
The idea is clear -…
Mikhail_Sam
- 131
- 4
3
votes
1 answer
FastText Model Explained
I was reading the FastText paper and I have a few questions about the model used for classification. Since I am not from an NLP background, I am unfamiliar with some of the jargon.
In the figure, what exactly are the $x_i$? I am not sure what $N$…
Black Jack 21
- 173
- 6
1
vote
0 answers
When are subword ngrams trained in fasttext? (Enriching Word Vectors with Subword Information)
When is the training for subword n-grams done? Is it done simultaneously with the training of the word representations, or is it done at the end, after the word representations are created?
fasttext implements this paper where word representations are…
Sid
- 667
- 1
- 5
- 14
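For readers unfamiliar with the subword n-grams the question above refers to, here is a minimal sketch of how a word can be split into character n-grams with `<` and `>` boundary markers, as described in Enriching Word Vectors with Subword Information. The function name `char_ngrams` and its parameters are hypothetical and chosen for illustration; this is not the fastText library's own code, which additionally hashes n-grams into a fixed number of buckets.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Return character n-grams of `word` wrapped in '<' and '>' markers.

    Illustrative helper only (hypothetical name); not the fastText API.
    """
    wrapped = f"<{word}>"
    grams = []
    for n in range(min_n, max_n + 1):
        # Slide a window of width n over the wrapped word.
        for i in range(len(wrapped) - n + 1):
            grams.append(wrapped[i:i + n])
    return grams


# The paper's example: the word "where" with n = 3.
print(char_ngrams("where", min_n=3, max_n=3))
# → ['<wh', 'whe', 'her', 'ere', 're>']
```

In the paper, a word's vector is the sum of the vectors of its n-grams, so the n-gram vectors receive updates whenever the word is seen during training.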
1
vote
1 answer
Pre-trained models for finding similar word n-grams
Are there any pre-trained models for finding similar word n-grams, where n>1?
FastText, for instance, seems to work only on unigrams:
from pyfasttext import FastText
model = FastText('cc.en.300.bin')
model.nearest_neighbors('dog', k=2000)
[('dogs',…
dzieciou
- 697
- 1
- 6
- 15
1
vote
1 answer
Initializing weights that are a pointwise product of multiple variables
In two-layer perceptrons that slide across words of text, such as word2vec and fastText, hidden layer weights may be a product of two random variables such as positional embeddings and word embeddings (Mikolov et al. 2017, Section 2.2): $$v_c =…
Witiko
- 111
- 2
1
vote
0 answers
Extracting vectors from my own FastText model to use in a NN
I have trained my own fasttext model using the pretrained English model available on their website, with the following code:
from gensim.models.fasttext import load_facebook_model
mod =…
IMB
- 111
- 3
1
vote
0 answers
Removing duplicate records before training
I am currently working on a project classifying text into classes. The specific problem is classifying job titles into various industry codes. For example "McDonalds Employee" might get classified to 11203 (there are a few hundred classes in the…
astel
- 347
- 1
- 5
1
vote
0 answers
How does FastText create n-gram word features?
In the paper Bag of Tricks for Efficient Text Classification they talk about creating n-gram (word) features, and in their experiments they show results for both 1-gram and bi-gram.
As far as I understand, FastText is simply word embedding based on…
CutePoison
- 450
- 2
- 8
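To make the word n-gram features asked about above concrete, here is a minimal sketch that augments a token list with its word bigrams (or longer n-grams). The function name `word_ngrams` is hypothetical and chosen for illustration; it is not the fastText library's implementation, which exposes a `wordNgrams` training parameter and, per the Bag of Tricks paper, maps n-grams to indices with a hashing trick rather than storing strings.

```python
def word_ngrams(tokens, n=2):
    """Return the unigrams followed by all word n-grams up to length n.

    Illustrative helper only (hypothetical name); not the fastText API.
    """
    features = list(tokens)  # unigrams come first
    for k in range(2, n + 1):
        # Join every run of k consecutive tokens into one feature.
        for i in range(len(tokens) - k + 1):
            features.append(" ".join(tokens[i:i + k]))
    return features


print(word_ngrams(["bag", "of", "tricks"], n=2))
# → ['bag', 'of', 'tricks', 'bag of', 'of tricks']
```

Each feature, unigram or n-gram, would then get its own embedding, and the embeddings would be averaged into the text representation fed to the classifier.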
1
vote
1 answer
How to fine-tune hyperparameters of unsupervised training in fasttext?
I want to train a fasttext unsupervised model on my text dataset. However, there are many hyperparameters in the train_unsupervised method:
lr # learning rate [0.05]
dim # size of word vectors [100]
ws #…
Ir8_mind
- 183
- 4
1
vote
0 answers
Should I use Pad Sequence when using Word Vectors?
I have an unbalanced text data set. I want to use word vectors to embed words. When should I use pad sequence: before or after the word vectors? I tried using pad sequence after the word vectors, but my model accuracy was low. When I use the pad sequence…
grace
- 13
- 4
0
votes
0 answers
Finetuning fasttext with unlabeled text corpus
I am training a classifier which is supposed to take the name of a product as input. For this purpose I want to finetune a pre-existing fasttext model on my article names.
My code looks like this
import fasttext
# Load the pre-trained…
christallclear
- 11
- 1
0
votes
0 answers
Encountered a problem while installing the FastText library on macOS
I have been trying to install the "FastText" library on macOS but I keep encountering a Runtime error.
System - macOS: 13.0.1 (22A400)
Python Version: 3.10
IDE: PyCharm
I tried installing it from PyCharm but it did not work, then I tried using…
Mira
- 1
0
votes
0 answers
Models to rank sentences
I am working with tasks performed in some occupations and am trying to find out the importance of these tasks within the occupation. My approach was to use tf-idf followed by TextRank, and word2vec followed by TextRank. I would like to know if you guys have…
Benimaru cpt
- 1
- 1
0
votes
1 answer
Is it normal for a model to perform worse with the use of word embeddings?
I have a multiclass text classification problem and I've tried different solutions and models, but I was not satisfied with the results.
So I've decided to use GloVe (Global Vectors for Word Representation), but somehow all the models performed…
HasanArcas
- 13
- 3
0
votes
1 answer
Training fasttext on your own corpus
I want to train fasttext on my own corpus. However, I have a small question before continuing. Do I need each sentence as a separate item in the corpus, or can I have many sentences as one item?
For example, I have this DataFrame:
text …
BlueMango
- 113
- 3