Questions tagged [embeddings]

184 questions
16 votes, 2 answers

One-Hot Encoding vs Word Embedding - When to choose one or the other?

A colleague of mine has an interesting situation: he has quite a large set of possible values for a categorical feature (roughly 300 distinct values). The usual data science approach would be to perform one-hot encoding. However, wouldn't…
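For a feature with ~300 levels, the two options under discussion look roughly like this (a minimal sketch; the data and dimensions are made up):

```python
import numpy as np
import tensorflow as tf
from sklearn.preprocessing import OneHotEncoder

# Option 1: one-hot encoding -- one sparse indicator column per level.
levels = np.array([["red"], ["green"], ["blue"]])    # stand-in for ~300 levels
onehot = OneHotEncoder(handle_unknown="ignore").fit_transform(levels)
print(onehot.shape)                                  # (n_rows, n_levels)

# Option 2: a learned embedding -- each level maps to a small dense vector
# trained jointly with the rest of the model; inputs are integer level ids.
n_levels, embed_dim = 300, 16
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=n_levels, output_dim=embed_dim),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
```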
11 votes, 1 answer

Confusion about Entity Embeddings of Categorical Variables - Working Example!

Problem Statement: I am having trouble making entity embeddings of a categorical variable work for a simple dataset. I have followed the original GitHub repo, the paper, other blog posts [1, 2, 3], and this Kaggle kernel; it is still not working. Data Part:…
TwinPenguins • 4,157 • 3 • 17 • 53
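For reference, a common Keras pattern for entity embeddings looks like the sketch below (toy data, not the asker's dataset): one Embedding per categorical column, concatenated with the numeric inputs.

```python
import numpy as np
import tensorflow as tf

# Toy data: one categorical column with 10 levels plus 3 numeric columns.
n_rows, n_levels, embed_dim = 256, 10, 4
cat = np.random.randint(0, n_levels, size=(n_rows, 1))
num = np.random.randn(n_rows, 3).astype("float32")
y = np.random.randint(0, 2, size=(n_rows, 1))

cat_in = tf.keras.Input(shape=(1,), dtype="int32")
num_in = tf.keras.Input(shape=(3,))
emb = tf.keras.layers.Embedding(n_levels, embed_dim)(cat_in)  # (batch, 1, 4)
emb = tf.keras.layers.Flatten()(emb)                          # (batch, 4)
x = tf.keras.layers.Concatenate()([emb, num_in])
out = tf.keras.layers.Dense(1, activation="sigmoid")(x)

model = tf.keras.Model([cat_in, num_in], out)
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit([cat, num], y, epochs=2, verbose=0)
```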
10 votes, 2 answers

What is the difference between an Embedding Layer and an Autoencoder?

I'm reading about Embedding layers, especially as applied to NLP and word2vec, and they seem to be nothing more than an application of autoencoders for dimensionality reduction. Are they different? If so, what are the differences between them?
Leevo • 6,005 • 3 • 14 • 51
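A minimal sketch of the structural difference (assuming Keras): an Embedding layer is a trainable lookup table indexed by token id, while an autoencoder is trained to reconstruct its own input through a bottleneck.

```python
import tensorflow as tf

# Embedding layer: a lookup table, trained on whatever downstream loss
# the surrounding model attaches to it.
embed = tf.keras.layers.Embedding(input_dim=1000, output_dim=32)
vectors = embed(tf.constant([[1, 5, 9]]))   # (1, 3, 32): one vector per token id

# Autoencoder: compress a dense input and reconstruct it; the 32-dim
# bottleneck is trained on reconstruction loss, not on a downstream task.
inp = tf.keras.Input(shape=(1000,))
code = tf.keras.layers.Dense(32, activation="relu")(inp)         # encoder
recon = tf.keras.layers.Dense(1000, activation="sigmoid")(code)  # decoder
autoencoder = tf.keras.Model(inp, recon)
autoencoder.compile(optimizer="adam", loss="mse")
```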
9 votes, 2 answers

Are there any graph embedding algorithms like this already?

I wrote an algorithm for generating node embeddings based on the graph's topology. Most of the explanation is in the readme file and the examples. The question is: am I reinventing the wheel? Does this approach have any practical advantages…
monomonedula • 201 • 1 • 2
8 votes, 1 answer

Difference between Gensim word2vec and Keras Embedding layer

I have used the gensim word2vec package and the Keras Embedding layer for various projects. Then I realized they seem to do the same thing: they both try to convert a word into a feature vector. Am I understanding this properly? What exactly is the…
Edamame • 2,705 • 5 • 23 • 32
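A sketch of how the two relate in practice: gensim's Word2Vec learns vectors with its own skip-gram/CBOW objective, while a Keras Embedding layer is just a trainable lookup table, and one can seed the latter with the former (toy corpus below):

```python
import numpy as np
import tensorflow as tf
from gensim.models import Word2Vec

sentences = [["the", "cat", "sat"], ["the", "dog", "sat"]]   # toy corpus
w2v = Word2Vec(sentences, vector_size=50, min_count=1)       # skip-gram/CBOW training

# Copy the trained vectors into a Keras Embedding layer as its initial weights.
vocab = w2v.wv.index_to_key                     # id -> word, in gensim's order
weights = np.vstack([w2v.wv[w] for w in vocab])

embedding = tf.keras.layers.Embedding(len(vocab), 50)
embedding.build((None,))                        # create the variable, then overwrite it
embedding.set_weights([weights])
embedding.trainable = False                     # set True to fine-tune downstream
```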
7 votes, 2 answers

Concatenating embedding and hand-designed features for logistic regression

An interviewer told me that we cannot concatenate an embedding from a neural network (such as a pre-trained image representation) and hand-designed features (such as image metadata) for use in a linear model such as logistic regression. He says…
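Mechanically, nothing prevents the concatenation; a sketch with scikit-learn, using random arrays as stand-ins for the pre-trained embedding and the metadata features:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

n = 500
image_embedding = np.random.randn(n, 128)   # stand-in for a pre-trained CNN embedding
metadata = np.random.randn(n, 10)           # stand-in for hand-designed features
y = np.random.randint(0, 2, size=n)

# Scale both blocks so neither dominates the regularized linear model,
# then concatenate and fit.
X = np.hstack([StandardScaler().fit_transform(image_embedding),
               StandardScaler().fit_transform(metadata)])
clf = LogisticRegression(max_iter=1000).fit(X, y)
```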
7 votes, 1 answer

How much text is enough to train a good embedding model?

I need to train a word2vec embedding model on Wikipedia articles using Gensim. Eventually I will use all of Wikipedia for that, but for the moment I'm doing some experimentation/optimization to improve model quality, and I was wondering how…
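One way to set up that experiment (a sketch using gensim's small text8 sample and its bundled analogy questions as stand-ins for Wikipedia and a real evaluation set):

```python
import gensim.downloader as api
from gensim.models import Word2Vec
from gensim.test.utils import datapath

sentences = list(api.load("text8"))          # ~17M tokens; downloads on first use

for frac in (0.1, 0.25, 0.5, 1.0):
    subset = sentences[: int(len(sentences) * frac)]
    model = Word2Vec(subset, vector_size=100, min_count=5, epochs=5, workers=4)
    # Score on the analogy questions bundled with gensim.
    score, _ = model.wv.evaluate_word_analogies(datapath("questions-words.txt"))
    print(f"{frac:.0%} of corpus -> analogy accuracy {score:.3f}")
```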
6 votes, 1 answer

What is the neural network architecture behind Facebook's Starspace model?

Recently, Facebook released a paper concerning a general-purpose neural embedding model called StarSpace. In their paper, they explain the loss function and the training procedure of the model, but they don't say much about the architecture of…
ChiPlusPlus • 545 • 2 • 5 • 14
6 votes, 1 answer

Unordered Input

I was just wondering what the best approach is for training a neural network (or any other machine learning algorithm) where the order of the inputs does not matter. For example: $f(x_1,x_2,x_3,x_4) = f(x_2,x_1,x_3,x_4) = f(x_2,x_4,x_1,x_3)$. My current approach…
simeon • 173 • 4
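A common answer to this is the Deep Sets pattern, sketched below in Keras: encode each input with shared weights, then pool with a symmetric operation, which makes the model permutation-invariant by construction.

```python
import tensorflow as tf

# Deep Sets pattern: f(x1..x4) = rho( phi(x1) + ... + phi(x4) ).
# phi is shared across positions and the pooling op is symmetric,
# so any permutation of the inputs produces exactly the same output.
inp = tf.keras.Input(shape=(4, 1))                           # 4 unordered scalar inputs
phi = tf.keras.layers.Dense(32, activation="relu")           # applied per element
encoded = phi(inp)                                           # (batch, 4, 32)
pooled = tf.keras.layers.GlobalAveragePooling1D()(encoded)   # symmetric pooling
out = tf.keras.layers.Dense(1)(pooled)                       # rho
model = tf.keras.Model(inp, out)
```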
5 votes, 1 answer

Tensorflow: how to look up and average a different number of embedding vectors per training instance, with multiple training instances per minibatch?

In a recommender system setting: let's say I want to learn to predict future item purchases based on a user's past purchases, using an approach inspired by YouTube's recommender system. Concretely, let's say I have a trainable content-based network that…
Pablo Messina • 377 • 2 • 10
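One way to express this in TF2 is with a RaggedTensor, where each row holds a different number of item ids (a sketch under that assumption, not necessarily the YouTube paper's exact mechanism):

```python
import tensorflow as tf

n_items, dim = 1000, 64
item_embeddings = tf.Variable(tf.random.normal([n_items, dim]))  # trainable table

# A minibatch of 3 users with 2, 4, and 1 past purchases respectively.
ids = tf.ragged.constant([[3, 17], [5, 5, 42, 900], [7]])

per_item = tf.gather(item_embeddings, ids)   # ragged: (3, None, 64)
user_vec = tf.reduce_mean(per_item, axis=1)  # (3, 64): one averaged vector per user
```

tf.nn.embedding_lookup_sparse with combiner="mean" performs the same lookup-and-average in a single op when the ids arrive as a SparseTensor.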
4 votes, 2 answers

Can we use embeddings or latent vectors for a recommender system?

I'm having a hard time understanding why people use any vector they find as a candidate for a recommender system. In my mind, a recommender system requires a space where distance represents similarity. Of course, before you can construct such a…
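The usual premise is that embeddings trained with a suitable objective do give a space where distance tracks similarity; recommendation then reduces to nearest neighbours under cosine similarity (sketch with random stand-in vectors):

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

item_vecs = np.random.randn(1000, 64)   # stand-in for learned item embeddings
query = item_vecs[0:1]                  # the item a user just interacted with

sims = cosine_similarity(query, item_vecs).ravel()
top10 = np.argsort(-sims)[1:11]         # index 0 is the item itself; skip it
print(top10, sims[top10])
```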
4 votes, 1 answer

Why are character level models considered less effective than word level models?

I have read that character-level models need more computational power than word embeddings, and that this is one of the major reasons they are less effective, but I got curious because word embeddings need a huge vocabulary while character-level…
yashdk • 41 • 1
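The trade-off the question gestures at is easy to put numbers on (back-of-the-envelope, with typical orders of magnitude assumed):

```python
embed_dim = 300

# Embedding table sizes: a character vocabulary is ~3 orders of
# magnitude smaller than a typical word vocabulary.
word_vocab, char_vocab = 100_000, 100
print(word_vocab * embed_dim)   # 30,000,000 parameters at word level
print(char_vocab * embed_dim)   # 30,000 parameters at character level

# But the same text is far longer as a character sequence: an average
# English word is ~5 letters plus a space, so roughly 6x more timesteps
# (or attention positions) must be processed per example.
chars_per_word = 6
print(f"sequence length ratio: ~{chars_per_word}x")
```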
4 votes, 1 answer

word2vec word embeddings create very distant vectors; the closest cosine similarity is still only 0.7

I started using gensim's FastText to create word embeddings on a large corpus in a specialized domain (after finding that existing open-source embeddings do not perform well in this domain). However, I'm not using its character-level n-grams, so…
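For reference, disabling the n-grams and inspecting similarities looks like this in gensim (a sketch with a toy corpus; with max_n below min_n, FastText uses no character n-grams and behaves like plain word2vec):

```python
from gensim.models import FastText

sentences = [["protein", "binds", "receptor"],
             ["ligand", "binds", "receptor"]]   # toy stand-in corpus
model = FastText(sentences, vector_size=100, min_count=1,
                 min_n=1, max_n=0)              # max_n < min_n: n-grams off

# Inspect nearest neighbours and pairwise cosine similarities.
print(model.wv.most_similar("receptor", topn=3))
print(model.wv.similarity("protein", "ligand"))
```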
4 votes, 2 answers

Auto-Encoder to condense (pre-process) large one-hot input vectors?

In my 3D game there are 300 categories to which a creature can belong. I would like to teach my RL agent to make decisions based on its 10 closest monsters. So far, my neural network input vector is a concatenation of ten 300-dimensional one-hot…
Kari • 2,686 • 1 • 17 • 47
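Worth noting that when the categories enter as integer ids, a shared Embedding layer gives the condensed representation directly and is trained end-to-end with the RL objective; a sketch of that route (an autoencoder pre-processor is the alternative the question proposes):

```python
import tensorflow as tf

n_categories, embed_dim, n_monsters = 300, 16, 10

# Instead of concatenating ten 300-dim one-hot vectors (3000 inputs),
# feed the ten integer category ids through one shared embedding table.
ids = tf.keras.Input(shape=(n_monsters,), dtype="int32")
emb = tf.keras.layers.Embedding(n_categories, embed_dim)(ids)  # (batch, 10, 16)
flat = tf.keras.layers.Flatten()(emb)                          # (batch, 160)
q_values = tf.keras.layers.Dense(4)(flat)                      # e.g. one output per action
model = tf.keras.Model(ids, q_values)
```

With 16-dimensional embeddings the input to the dense layers shrinks from 3000 to 160 values, and the table is shared across all ten monster slots.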
3 votes, 0 answers

Combining heterogeneous numerical and text features

We want to solve a regression problem of the form "given two objects $x$ and $y$, predict their score (think of it as a similarity) $w(x,y)$". We have two types of features: for each object, we have about 1000 numerical features, mainly of the…
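A plain baseline for this shape of problem (a sketch; all arrays are random placeholders and the text features are assumed to be pre-embedded): concatenate both objects' numeric and text blocks and fit any regressor on $w(x,y)$.

```python
import numpy as np
from sklearn.linear_model import Ridge

n_pairs = 1000
num_x = np.random.randn(n_pairs, 50)    # stand-in for ~1000 numeric features of x
num_y = np.random.randn(n_pairs, 50)    # stand-in for ~1000 numeric features of y
txt_x = np.random.randn(n_pairs, 100)   # stand-in for a text embedding of x
txt_y = np.random.randn(n_pairs, 100)   # stand-in for a text embedding of y
w = np.random.randn(n_pairs)            # the pairwise score to predict

# Concatenate all four blocks and fit a regularized linear regressor.
X = np.hstack([num_x, num_y, txt_x, txt_y])
reg = Ridge(alpha=1.0).fit(X, w)
```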