Questions tagged [semantic-similarity]

57 questions
5
votes
3 answers

where to store embeddings for similarity search?

I've already asked on Stack Overflow (here), but I suspect that the approach of storing embeddings in an ordinary PostgreSQL database might be flawed from the very beginning. I will briefly sketch out the application again: text corpora (few hundred…
Angus
4
votes
1 answer

How to build recommendation model based on resume and job description?

How can I build a model that gives better recommendations of resumes for a given job description? I am familiar with the BoW or TF-IDF (n-gram) approach followed by cosine similarity, but I'm looking for a deep learning approach. I don't…
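The TF-IDF plus cosine similarity baseline mentioned in this question could be sketched roughly as follows. This is a minimal standard-library illustration, not the asker's code: the helper names (`tokenize`, `tfidf_vectors`, `cosine`, `rank_resumes`), the whitespace tokenizer, and the smoothed IDF formula are all assumptions made for the example.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization is good enough for a sketch.
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents in which each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF (one common variant; other formulas are equally valid).
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: count * idf[t] for t, count in tf.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_resumes(job_description, resumes):
    """Return (score, resume_index) pairs, highest similarity first."""
    vectors = tfidf_vectors([job_description] + resumes)
    query, candidates = vectors[0], vectors[1:]
    scores = [(cosine(query, v), i) for i, v in enumerate(candidates)]
    return sorted(scores, reverse=True)


if __name__ == "__main__":
    job = "python machine learning engineer"
    resumes = [
        "senior python engineer with machine learning experience",
        "pastry chef specialising in french desserts",
    ]
    for score, idx in rank_resumes(job, resumes):
        print(f"resume {idx}: {score:.3f}")
```

A deep learning variant would keep the same ranking step and only replace `tfidf_vectors` with a learned sentence encoder.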
3
votes
1 answer

Evaluation metric for Information retrieval system

I am currently reading the Semantic Product Search paper published by Amazon. They use two evaluation subtasks: matching and ranking. In matching, they tune the model hyperparameters to maximize Recall@100 and Mean Average Precision…
3
votes
1 answer

Why do semantically different words produce similar embeddings?

I am comparing words in the Hugging Face web UI using e5-small-v2, one of the best vector embedding models. Assuming that the scores are in the range from 0 to 1, how come all the scores are so high? In fact, I was not able to produce any example with a…
3
votes
2 answers

Semantic search - combine text and image embedding

I have a question regarding combining text and image embeddings for semantic search. The use case is product search on a (B2B) marketplace, where we have image(s) and a title and description for each product. I want to allow the user to search both the image and…
Steven
3
votes
1 answer

Is there a reference dataset for contextual similarity?

I'm doing some experiments with word embeddings to try to capture context-aware similarity, so that, for example, the word pair apple - hardware is very dissimilar in the context of a fruit store but very similar in an IT context. My question is if…
Jorgemar
2
votes
1 answer

Cluster words into groups of similar meaning (synonyms)

How can words be clustered into groups of similar meaning (synonyms)? I started with pre-trained word embeddings (e.g., Google News), which is great, but not perfect - a limitation arises because the word embeddings are based on surrounding words.…
Ben
2
votes
1 answer

Semantic network using word2vec

I have thousands of headlines and I would like to build a semantic network using word2vec, specifically the Google News files. My sentences look like this (column "Titles"): "Dogs are humans’ best friends"; "A dog died because of an accident"; "You can clean dogs’ paws using…
Math
2
votes
2 answers

Semantic Search

We are trying to do semantic search on our data set, i.e. we have domain-specific data (for example, sentences about automobiles). Our data is just a bunch of sentences, and what we want is to give a…
2
votes
1 answer

How do we evaluate the outputs of text generation models?

Evaluation of a wide variety of natural language generation (NLG) tasks is difficult. For instance, for a question answering model, it is hard for a human to quantify how well the model has answered a particular question. Doing this at scale is even…
2
votes
1 answer

How to choose similarity measurement between sentences and paragraphs

Problems: 1. How to find an appropriate measurement method. There are several ways to measure sentence similarity, but I have no idea how to choose an appropriate method among them for my data (sentences). Related question on Stack Overflow: is there a way…
Mahler
1
vote
1 answer

BERT Optimization for Production

I'm using BERT to transform text into a 768-dimensional vector. The model is multilingual: from sentence_transformers import SentenceTransformer; model = SentenceTransformer('paraphrase-multilingual-mpnet-base-v2'). Now I want to put the model into production but…
1
vote
1 answer

What's the best way to generate similar words?

Hi all, I'm fairly up to date with the NLP tasks out there (nlpprogress.com, paperswithcode.com) and great tools (NLTK, Flair, Hugging Face, etc.). I want to take a single word and predict a similar word, a little like the old "Google Sets"…
Julian H
1
vote
0 answers

Is it possible to perform Semantic Textual Similarity without using NLTK and Gensim?

Our college requires us to build projects in object-oriented programming languages without using any libraries other than the standard ones. We also cannot use APIs.
user110377
1
vote
1 answer

How to determine whether a semantic concept is present in a string

I need to find a way to detect if a large string contains a specific substring. Imagine that I have a full contract page converted to string in my Python program. What I want to do is to say if a specific term (a smaller string than the whole page…