
I have news articles in my dataset that contain named entities. I want to use the Wikipedia2vec model to encode the articles' named entities, but around 40% of the entities in my dataset are not present in Wikipedia. How can I efficiently embed my articles' named entities with the Wikipedia2vec model?

sajankar9

1 Answer


Do you have a specific downstream use for the entity embeddings? It is not possible to get Wikipedia2vec embeddings for entities that are not present in the Wikipedia dump the model was trained on.
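
For context, here is a minimal sketch of looking up entities with the `wikipedia2vec` package (the model file path and entity titles are placeholders); entities missing from the dump simply raise a `KeyError`, so there is nothing to fall back on within Wikipedia2vec itself:

```python
from wikipedia2vec import Wikipedia2Vec

# Placeholder path: point this at whichever pretrained model file you downloaded.
wiki2vec = Wikipedia2Vec.load("enwiki_20180420_300d.pkl")

def entity_vector_or_none(title):
    """Return the Wikipedia2vec entity vector, or None if the entity
    is not in the Wikipedia dump the model was trained on."""
    try:
        return wiki2vec.get_entity_vector(title)
    except KeyError:
        return None

print(entity_vector_or_none("Barack Obama"))        # known entity -> vector
print(entity_vector_or_none("Some Local Startup"))  # unseen entity -> None
```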

Instead, I would encourage you to look at the BLINK project from Facebook AI Research. It is a two-stage entity linking framework whose first stage is a bi-encoder with separate encoders for embedding entities and mentions.

You can pass your new entities through the entity encoder to obtain embeddings. All you need for an entity embedding is the entity's title and a brief description.
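
I am not reproducing BLINK's exact API here; the sketch below only illustrates the bi-encoder idea with a generic BERT encoder from Hugging Face `transformers` (the model name, separator, and CLS pooling are my assumptions). Conceptually, BLINK's entity encoder works the same way: title plus description in, a fixed-size vector out.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Generic BERT stand-in for an entity encoder; BLINK ships its own
# pretrained bi-encoder weights, which you would load instead.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
encoder.eval()

def embed_entity(title, description):
    """Encode an entity from its title and short description into one vector."""
    text = f"{title} : {description}"  # separator choice is an assumption
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)
    with torch.no_grad():
        outputs = encoder(**inputs)
    # Use the [CLS] token representation as the entity embedding.
    return outputs.last_hidden_state[:, 0, :].squeeze(0)

vec = embed_entity("Some Local Startup", "A technology company founded in 2021.")
print(vec.shape)  # torch.Size([768])
```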

You can fine-tune these entity embeddings by training on your task-specific objective.
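
As a rough illustration of that fine-tuning step (building on the sketch above; `NUM_LABELS` and `train_batches` are hypothetical placeholders for your own task setup), you could keep the encoder trainable and optimize it end-to-end together with a small task head:

```python
import torch
from torch import nn

NUM_LABELS = 5  # placeholder: number of task-specific classes

# Hypothetical task head on top of the entity embeddings.
classifier = nn.Linear(encoder.config.hidden_size, NUM_LABELS)
optimizer = torch.optim.AdamW(
    list(encoder.parameters()) + list(classifier.parameters()), lr=2e-5
)
loss_fn = nn.CrossEntropyLoss()

encoder.train()
for texts, labels in train_batches:  # your own iterator of (list[str], label tensor)
    inputs = tokenizer(texts, return_tensors="pt", padding=True,
                       truncation=True, max_length=128)
    # Re-encode inside the loop (with gradients) so the encoder is updated too.
    cls_vectors = encoder(**inputs).last_hidden_state[:, 0, :]
    loss = loss_fn(classifier(cls_vectors), labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```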