Questions tagged [semantic-similarity]
57 questions
5
votes
3 answers
Where to store embeddings for similarity search?
I've already asked on Stack Overflow (here), but I figure that the approach of storing embeddings in an ordinary PostgreSQL database might be flawed from the very beginning. I will briefly sketch out the application again:
text corpora (few hundred…
Angus
- 51
- 1
- 2
4
votes
1 answer
How to build a recommendation model based on resumes and job descriptions?
How can I build a model that gives better resume recommendations for a given job description?
I am familiar with the BoW or TF-IDF (n-grams) approach followed by cosine similarity, but I'm looking for a deep learning approach. I don't…
user_12
- 347
- 2
- 10
3
votes
1 answer
Evaluation metric for an information retrieval system
I am currently reading the Semantic Product Search paper published by Amazon. They use two evaluation subtasks: matching and ranking. In matching, they tune the model hyperparameters to maximize Recall@100 and Mean Average Precision…
Sayali Sonawane
- 2,001
- 3
- 12
- 13
3
votes
1 answer
Why do semantically different words produce similar embeddings?
I am comparing words in the Hugging Face web UI using e5-small-v2, one of the best vector embedding models:
Assuming that the scores are in the range from 0 to 1, how come all the scores are so high? In fact, I was not able to produce any example with a…
AlwaysLearning
- 131
- 2
3
votes
2 answers
Semantic search - combine text and image embedding
I have a question regarding combining text and image embeddings for semantic search. The use case is product search on a (B2B) marketplace, where we have image(s) plus a title and description for each product. I want to allow the user to search both the image and…
Steven
- 31
- 3
3
votes
1 answer
Is there a reference dataset for contextual similarity?
I'm doing some experiments with word embeddings to try to capture context-aware similarity, so that, for example, the word pair apple - hardware is very dissimilar in the context of a fruit store but very similar in an IT context.
My question is if…
Jorgemar
- 241
- 1
- 5
2
votes
1 answer
Cluster words into groups of similar meaning (synonyms)
How can words be clustered into groups of similar meaning (synonyms)?
I started with pre-trained word embeddings (e.g., Google News), which are great but not perfect: a limitation arises because the word embeddings are based on surrounding words.…
Ben
- 141
- 2
2
votes
1 answer
Semantic network using word2vec
I have thousands of headlines and would like to build a semantic network using word2vec, specifically the Google News files.
My sentences look like this:
Titles
Dogs are humans’ best friends
A dog died because of an accident
You can clean dogs’ paws using…
Math
- 151
- 13
2
votes
2 answers
Semantic Search
We are trying to do semantic search on our data set, i.e., we have domain-specific data (for example, sentences about automobiles).
Our data is just a bunch of sentences, and what we want is to give a…
Farhaan Bukhsh
- 31
- 3
2
votes
1 answer
How do we evaluate the outputs of text generation models?
Evaluation of a wide variety of natural language generation (NLG) tasks is difficult. For instance, for a question answering model, it is hard for a human to quantify how well the model has answered a particular question. Doing this at scale is even…
Greggs
- 121
- 2
2
votes
1 answer
How to choose a similarity measure between sentences and paragraphs
Problems
1. How to find an appropriate measurement method
There are several ways to measure sentence similarity, but I have no idea how to find an appropriate method among them for my data (sentences).
Related question on Stack Overflow: is there a way…
Mahler
- 59
- 6
1
vote
1 answer
BERT Optimization for Production
I'm using BERT to transform text into a 768-dimensional vector. It's multilingual:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer('paraphrase-multilingual-mpnet-base-v2')
Now I want to put the model into production but…
Mohy Mohamed
- 45
- 3
1
vote
1 answer
What's the best way to generate similar words?
Hi all, I'm fairly up to date with the NLP tasks out there (nlpprogress.com, paperswithcode.com) and great tools like NLTK, flair, Hugging Face, etc. I want to take a single word and predict a similar word, a little like the old "Google Sets"…
Julian H
- 113
- 3
1
vote
0 answers
Is it possible to perform Semantic Textual Similarity without using NLTK and Gensim?
Our college restricts us to projects in object-oriented programming languages, using no libraries other than the standard ones. We cannot use APIs either.
user110377
- 11
- 1
1
vote
1 answer
How to determine whether a semantic concept is present in a string
I need to find a way to detect whether a large string contains a specific substring.
Imagine that I have a full contract page converted to a string in my Python program. What I want to do is determine whether a specific term (a smaller string than the whole page…