Questions tagged [representation]

42 questions
9 votes, 2 answers

Are there any graph embedding algorithms like this already?

I wrote an algorithm for generating node embeddings based on the graph's topology. Most of the explanation is done in the readme file and the examples. The question is: Am I reinventing the wheel? Does this approach have any practical advantages…
monomonedula
8 votes, 2 answers

Why don't tree ensembles require one-hot-encoding?

I know that models such as random forest and boosted trees don't require one-hot encoding for predictor levels, but I don't really get why. If the tree is making a split in the feature space, then isn't there an inherent ordering involved? There…
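A toy sketch of why the ordering is not a real constraint: with integer-encoded levels, a deep enough greedy tree can still carve out arbitrary subsets of categories through repeated threshold splits. The encoding, data, and helper names below are illustrative, not from any particular library.

```python
import numpy as np

# Hypothetical setup: a categorical feature with levels encoded 0..3,
# where the target is 1 for levels {0, 2} and 0 for levels {1, 3}.
# No single threshold separates these subsets, but successive splits do.
rng = np.random.default_rng(0)
x = rng.integers(0, 4, size=400)          # integer-encoded category
y = np.isin(x, [0, 2]).astype(float)      # target depends on a level subset

def best_split(x, y):
    """Scan candidate thresholds; return the one minimizing squared error."""
    best = None
    for t in np.unique(x)[:-1]:
        left, right = y[x <= t], y[x > t]
        err = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or err < best[1]:
            best = (t, err)
    return best[0]

def fit_tree(x, y, depth):
    """Tiny greedy regression tree: tuples for splits, floats for leaves."""
    if depth == 0 or len(np.unique(x)) == 1:
        return y.mean()
    t = best_split(x, y)
    return (t, fit_tree(x[x <= t], y[x <= t], depth - 1),
               fit_tree(x[x > t], y[x > t], depth - 1))

def predict(tree, xi):
    while isinstance(tree, tuple):
        t, left, right = tree
        tree = left if xi <= t else right
    return tree

tree = fit_tree(x, y, depth=3)
acc = np.mean([predict(tree, xi) == yi for xi, yi in zip(x, y)])
print(acc)  # the depth-3 tree separates {0, 2} from {1, 3} exactly
```

Three stacked threshold splits recover the subset {0, 2} without any one-hot columns, which is why tree ensembles tolerate integer encodings that would mislead a linear model.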
5 votes, 1 answer

What does 'Linear regularities among words' mean?

Context: In the paper "Efficient Estimation of Word Representations in Vector Space" by T. Mikolov et al., the authors make use of the phrase: 'Linear regularities among words'. What does that mean in the context of the paper, or in a general…
Dawny33
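In that paper, "linear regularities" refers to semantic and syntactic relations showing up as roughly constant offset vectors, so analogies can be solved with vector arithmetic. A minimal sketch with made-up 2-D vectors (not real word2vec weights):

```python
import numpy as np

# Toy embedding where the "gender" relation is a constant offset [0, 1]:
# vec("king") - vec("man") + vec("woman") lands exactly on vec("queen").
emb = {
    "man":   np.array([1.0, 0.0]),
    "woman": np.array([1.0, 1.0]),
    "king":  np.array([3.0, 0.0]),
    "queen": np.array([3.0, 1.0]),
}
target = emb["king"] - emb["man"] + emb["woman"]
nearest = min(emb, key=lambda w: np.linalg.norm(emb[w] - target))
print(nearest)  # → queen
```

In real embeddings the offsets are only approximate, so the analogy is answered by a nearest-neighbour search around the target vector, as above.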
4 votes, 1 answer

What are latent representations?

I am reading some research papers about graph convolutional neural networks and I have seen the term "latent representations" used a lot. For instance, "the model was able to learn latent representations of the nodes of the graph". What does the…
4 votes, 1 answer

What exactly does it mean to say that PCA and LDA are linear methods of learning data representations?

I have been reading about representation learning and have come across the idea that PCA and LDA are linear methods of data representation, whereas auto-encoders provide a non-linear one. Does this mean that the embedding learned by PCA can be…
Ankita Talwar
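One way to see the claim: the representation PCA learns is a single affine map, z = (x - mean) W, with W the top eigenvectors of the covariance matrix; there is no nonlinearity anywhere, unlike an autoencoder with nonlinear activations. A sketch with synthetic data:

```python
import numpy as np

# Fit PCA by hand on correlated synthetic data: the entire "encoder"
# is one matrix multiply after centering.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))  # correlated features

mu = X.mean(axis=0)
cov = np.cov(X - mu, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)   # ascending eigenvalues
W = eigvecs[:, ::-1][:, :2]              # top-2 principal directions

Z = (X - mu) @ W                         # the learned representation

# Linearity check: encoding a sum of centered points = sum of encodings.
a, b = X[0] - mu, X[1] - mu
assert np.allclose((a + b) @ W, a @ W + b @ W)
```

An autoencoder with nonlinear activations cannot be written as one such matrix product, which is the precise sense in which it is a non-linear representation method.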
4 votes, 1 answer

Why can ELMo's word embeddings represent words better than GloVe's?

I have read the ELMo code. Based on my understanding, ELMo first initializes a word embedding matrix A for all the words, then adds an LSTM B, and finally uses LSTM B's outputs to predict each word's next word. I am wondering why we can input each word in…
DunkOnly
2 votes, 2 answers

KNN efficient implementation

The KNN algorithm is very handy and particularly suited to some of my problems, but I can't find any resources on how to implement it in production. As a comparative example, when I use a neural network, I already have at my disposal high-level…
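For context, a production k-NN usually starts from the vectorized brute-force query below and swaps in a space-partitioning index (e.g. scipy.spatial.cKDTree) or an approximate-nearest-neighbour library once the training set grows. The data and function names here are illustrative:

```python
import numpy as np

def knn_query(train_X, train_y, query, k=3):
    """Return the majority label among the k nearest training points."""
    d2 = ((train_X - query) ** 2).sum(axis=1)   # squared distances to all points
    idx = np.argpartition(d2, k)[:k]            # O(n) selection, no full sort
    labels, counts = np.unique(train_y[idx], return_counts=True)
    return labels[counts.argmax()]

# Two well-separated Gaussian clusters as a toy training set.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(3, 0.5, (50, 2))])
y = np.array([0] * 50 + [1] * 50)
print(knn_query(X, y, np.array([3.0, 3.0])))   # query near the second cluster
```

The brute-force version is O(n) per query; a KD-tree or ball tree makes queries sub-linear in low dimensions, and approximate indexes trade a little recall for speed in high dimensions.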
2 votes, 1 answer

Difference between NCE-Loss and InfoNCE-Loss

I started looking into word2vec and was wondering what the connection/difference between the NCE-Loss and the InfoNCE-Loss is. I get the basic idea of both, but I have a hard time deriving one from the other. Do you have any idea? Thank you in advance!
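Roughly: NCE turns density estimation into a binary logistic classification of data versus noise samples, while InfoNCE is a categorical cross-entropy over one positive and N negatives (a softmax over similarity scores). A hedged sketch of the InfoNCE side only, with made-up vectors:

```python
import numpy as np

def info_nce(query, positive, negatives, tau=0.1):
    """InfoNCE: softmax cross-entropy of identifying the positive key
    among one positive and N negative candidates."""
    cands = np.vstack([positive] + list(negatives))
    logits = cands @ query / tau        # similarity scores, temperature-scaled
    logits -= logits.max()              # numerical stability
    return -logits[0] + np.log(np.exp(logits).sum())

q = np.array([1.0, 0.0])
good_pos = np.array([0.9, 0.1])         # aligned with the query → low loss
bad_pos = np.array([-1.0, 0.0])         # anti-aligned → high loss
negs = [np.array([0.0, 1.0]), np.array([-0.5, 0.5])]
print(info_nce(q, good_pos, negs), info_nce(q, bad_pos, negs))
```

The multi-class softmax over candidates is the structural difference from NCE's per-sample binary data-vs-noise decision.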
2 votes, 1 answer

What are the Most Dissimilar MNIST Digits?

Using whatever definition of dissimilarity over sets that you'd like, what are the most dissimilar two digits in MNIST? I was thinking that a reasonable approach to answering the question would be to pass the two sets through some state-of-the-art…
JoeTheShmoe
2 votes, 2 answers

Is it possible to compress a sequence of numbers through an autoencoder?

Specifically: I would like to compress a set of coordinates, which map to the locations of 1's in a binary image, and then decode back to the original set. For instance, for a 16x16 image, the input might be something like the following: [5, 4], [12,…
HighVoltage
2 votes, 1 answer

Learning Football Player Stats like FIFA's by only the game result

This is a general question about learning a representation of one entity when the dataset is mixed with many other entities whose stats also have to be learnt. The question is best explained by an example. Let's say, the…
1 vote, 2 answers

Using categorical and continuous variables in Deep Learning

I would like to apply an MLP to some business seller data. I found that the data is a mix of both categorical and continuous features. From what I read, it is not advisable to feed a neural network with both types of data (reference…
Lila
1 vote, 1 answer

Good chromosome representation in a VRPTW genetic algorithm

I have a genetic algorithm for a vehicle routing problem with time windows and I need to implement certain modifications. I am not sure what would be the best chromosome representations. I have tasks which can be divided into 3 sub-tasks with…
1 vote, 1 answer

Clustering with categorical as well as numerical features

I have a dataset consisting of house prices, for example. The dataset contains features such as: house size, monthly rent, house colour, location, year the house was built. I wanted to group all these attributes into clusters. The problem is how to…
user102751
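A common answer is a Gower-style distance: range-normalized absolute differences for numeric columns, 0/1 mismatch for categorical ones, averaged into a single dissimilarity usable with k-medoids or hierarchical clustering. The column roles and toy rows below are assumptions for illustration:

```python
# Gower-style dissimilarity for mixed numeric/categorical rows.
def gower(a, b, num_idx, cat_idx, ranges):
    num = [abs(a[i] - b[i]) / ranges[i] for i in num_idx]   # in [0, 1]
    cat = [0.0 if a[i] == b[i] else 1.0 for i in cat_idx]   # simple mismatch
    return sum(num + cat) / (len(num) + len(cat))

# rows: (size_sqm, monthly_rent, colour) — hypothetical houses
rows = [(120, 1500, "red"), (125, 1450, "red"), (40, 600, "blue")]
num_idx, cat_idx = [0, 1], [2]
ranges = {i: max(r[i] for r in rows) - min(r[i] for r in rows) for i in num_idx}

d01 = gower(rows[0], rows[1], num_idx, cat_idx, ranges)  # similar houses
d02 = gower(rows[0], rows[2], num_idx, cat_idx, ranges)  # dissimilar houses
print(d01 < d02)  # → True
```

Because the result is a plain dissimilarity matrix rather than points in a Euclidean space, pair it with algorithms that accept precomputed distances (k-medoids, agglomerative clustering) rather than vanilla k-means.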
1 vote, 0 answers

Representative sample size n

I need help identifying a representative sample size 'n'. Let's say I have a very large population with an effectively infinite number of participants. I am picking a random sample from this infinite population. I want to get the most accurate results about my…