Questions tagged [learning-to-rank]
31 questions
6
votes
1 answer
How to use the NDCG metric for binary relevance
I am working on a ranking problem to predict the right single document based on the user query, and I use the NDCG metric to evaluate the model.
Given the details:
Queries (Q), Result Documents (D), Relevance score.
But the relevance score is a…
kannandreams
- 61
- 4
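With binary (0/1) relevance, DCG reduces to summing the discounts at the positions of the relevant documents, and NDCG divides by the ideal ordering. A minimal sketch, assuming a hypothetical helper ndcg_binary and a single query whose labels are listed in the order the model ranked them:

import numpy as np

def ndcg_binary(ranked_relevance, k=None):
    """NDCG for one query with binary (0/1) relevance labels, given in the
    order the model ranked the documents."""
    rels = np.asarray(ranked_relevance, dtype=float)[:k]
    discounts = 1.0 / np.log2(np.arange(2, rels.size + 2))
    dcg = np.sum(rels * discounts)
    # Ideal DCG: the same labels with all relevant documents moved to the top.
    ideal = np.sort(np.asarray(ranked_relevance, dtype=float))[::-1][:k]
    idcg = np.sum(ideal * discounts)
    return dcg / idcg if idcg > 0 else 0.0

# Example: the single relevant document is ranked at position 3 of 5.
print(ndcg_binary([0, 0, 1, 0, 0]))  # 1 / log2(4) = 0.5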
4
votes
2 answers
Why is it not necessary to set the test group when using 'rank:pairwise' in xgboost?
I'm new to learning-to-rank. I'm trying to work through the learning-to-rank example provided by xgboost, and I found that the core code in rank.py is as follows.
train_dmatrix = DMatrix(x_train, y_train)
valid_dmatrix = DMatrix(x_valid, y_valid)
test_dmatrix…
giser_yugang
- 143
- 1
- 5
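For context, a minimal sketch (random toy data, not the demo's rank.py itself) of how query groups are attached to the training and validation DMatrix objects for 'rank:pairwise'; the pairwise loss and the NDCG evaluation metric are computed within each group:

import numpy as np
import xgboost as xgb
from xgboost import DMatrix

rng = np.random.default_rng(0)
x_train, y_train = rng.normal(size=(100, 5)), rng.integers(0, 3, size=100)
x_valid, y_valid = rng.normal(size=(40, 5)), rng.integers(0, 3, size=40)

train_dmatrix = DMatrix(x_train, y_train)
valid_dmatrix = DMatrix(x_valid, y_valid)

# Group sizes list how many consecutive rows belong to each query; the
# pairwise objective and NDCG are only evaluated within a group.
train_dmatrix.set_group([50, 50])
valid_dmatrix.set_group([20, 20])

params = {'objective': 'rank:pairwise', 'eta': 0.1, 'eval_metric': 'ndcg'}
model = xgb.train(params, train_dmatrix, num_boost_round=10,
                  evals=[(valid_dmatrix, 'validation')])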
3
votes
2 answers
Learning to Rank Application
Suppose there's a website/app that sells products and my job is to determine the order/ranking in which the products should be displayed.
For example: I click on restaurants and a list of restaurants pops up, and I have to determine in what order the…
Nishant Arora
- 31
- 1
2
votes
0 answers
Learning to rank: how is the label calculated?
I am studying learning to rank and I'm not sure I understand how the training samples and the final label (relevance score) are constructed.
Let's assume we sell furniture online. We have logged the customer's query and the products the customer clicked on and bought.
Example:
User A…
Alina
- 143
- 1
- 7
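One common convention (an assumption here, not necessarily what the question's system uses) is to map the logged actions to graded relevance labels per (query, product) pair, keeping the strongest signal observed:

# Hypothetical mapping from logged user actions to relevance grades.
ACTION_TO_RELEVANCE = {
    'purchase': 2,        # strongest implicit signal
    'click': 1,
    'impression_only': 0,
}

def label(actions_for_query_product):
    """Collapse all logged actions for one (query, product) pair into a
    single relevance label by keeping the strongest signal."""
    return max(ACTION_TO_RELEVANCE[a] for a in actions_for_query_product)

print(label(['impression_only', 'click']))               # 1
print(label(['impression_only', 'click', 'purchase']))   # 2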
2
votes
1 answer
What is the difference between nDCG and rank correlation methods?
When do we use one or the other?
My use case:
I want to evaluate a linear space to see how good retrieval results are. I have a set of data X (m x n) and some weights W (m x 1). I want to measure the nearest neighbour retrieval performance on W'X…
TyanTowers
- 21
- 1
2
votes
1 answer
Two definitions of DCG measure
I wanted to check the definition of the Discounted Cumulative Gain (DCG) measure in the original paper (Jarvelin), and it seems to differ from the one given in the later literature (Wang). Originally, for $n$ documents ranked from $r = 1, \ldots, p$, the…
WoofDoggy
- 343
- 1
- 2
- 11
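For reference, the two forms usually being compared are, to the best of my understanding, the original Jarvelin-style definition and the exponential-gain variant common in the later learning-to-rank literature:

$$\mathrm{DCG}_p = rel_1 + \sum_{i=2}^{p} \frac{rel_i}{\log_2 i}
\qquad\text{vs.}\qquad
\mathrm{DCG}_p = \sum_{i=1}^{p} \frac{2^{rel_i} - 1}{\log_2(i + 1)}$$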
2
votes
0 answers
Smallest possible difference between the AUC of two rankers
If there are 10 positive examples and 90 negative examples in the test set, what is the smallest possible difference in AUC between two rankers that give different AUCs?
wrek
- 123
- 3
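A quick way to reason about this (assuming no tied scores): AUC equals the fraction of correctly ordered positive/negative pairs, so with 10 positives and 90 negatives it can only move in steps of one pair out of 900.

# Pair-counting view of AUC (assumes no tied scores): AUC = correctly ordered
# positive/negative pairs divided by P * N, so two rankers with different AUCs
# differ by at least one pair, i.e. 1 / (P * N).
P, N = 10, 90
print(1 / (P * N))  # 0.001111...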
2
votes
1 answer
How to determine the "total number of relevant documents" in the calculation of Recall (in Precision and Recall) if it's not known? Can it be estimated?
On Wikipedia there is a practical example of calculating Precision and Recall:
When a search engine returns 30 pages, only 20 of which are relevant,
while failing to return 40 additional relevant pages, its precision is
20/30 = 2/3, which tells us…
Banik
- 131
- 3
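Working through the quoted numbers, the "total number of relevant documents" is the relevant pages retrieved plus the relevant pages the engine failed to return; a minimal sketch:

# Numbers taken from the quoted Wikipedia example.
relevant_retrieved = 20
retrieved = 30
relevant_missed = 40

precision = relevant_retrieved / retrieved                             # 20/30 = 2/3
recall = relevant_retrieved / (relevant_retrieved + relevant_missed)   # 20/60 = 1/3
print(precision, recall)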
2
votes
1 answer
Learning to Rank with Unlabelled Dataset
I have a folder of about 60k PDF documents that I would like to learn to rank based on queries, to surface the most relevant results. The goal is to surface and rank relevant documents, very much like a search engine. I understand that Learning to Rank…
amber
- 51
- 1
1
vote
0 answers
Which algorithm to use for a very simple ranking problem?
Currently, I have a dataset with 10 features that results in a ranking of 4 items, i.e. [1,2,3,4], [4,3,1,2], [3,2,4,1], or any of the $4!$ permutations that can arise from the ranking.
What algorithms are there to train a model for this? Making it a…
ppwc
- 11
- 1
1
vote
0 answers
How can I use a validation set to tune the hyperparameters of an XGBClassifier?
I'm currently building a ranking model using an XGBClassifier.
I have training, testing, and validation sets. I want to use the validation set to tune the hyperparameters of the XGBClassifier before using it to train a model. Is this the right…
Justin Marinelli
- 11
- 1
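A minimal sketch (random toy data and a hypothetical parameter grid) of the usual pattern: fit candidate XGBClassifier configurations on the training set, score each on the validation set, keep the best, and leave the test set untouched until the end:

import numpy as np
from xgboost import XGBClassifier
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
X_train, y_train = rng.normal(size=(200, 10)), rng.integers(0, 2, size=200)
X_valid, y_valid = rng.normal(size=(80, 10)), rng.integers(0, 2, size=80)

best_auc, best_params = -1.0, None
for max_depth in (3, 5, 7):             # hypothetical grid
    for learning_rate in (0.05, 0.1):
        model = XGBClassifier(max_depth=max_depth, learning_rate=learning_rate,
                              n_estimators=100)
        model.fit(X_train, y_train)
        auc = roc_auc_score(y_valid, model.predict_proba(X_valid)[:, 1])
        if auc > best_auc:
            best_auc = auc
            best_params = {'max_depth': max_depth, 'learning_rate': learning_rate}
print(best_params, best_auc)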
1
vote
1 answer
Ranking problem and imbalanced dataset
I know about the problems that an imbalanced dataset will cause when we are working on classification problems, and I know the solutions for that, including undersampling and oversampling.
I have to work on a ranking problem (ranking hotels and evaluate…
Maria
- 321
- 3
- 13
1
vote
1 answer
What does "freedom" mean in the RankNet paper?
In the RankNet (Learning to Rank using Gradient Descent) paper, paragraph 3.1, it says:
given the consistency requirements, how much freedom is there to choose the pairwise probabilities?
What does this mean?
Louis Law
- 11
- 1
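For context, my reading of that section (paraphrased, not a quote): RankNet maps score differences to pairwise probabilities with a logistic function, and the consistency requirement is that probabilities for overlapping pairs must combine coherently, e.g.

$$o_{ij} = f(x_i) - f(x_j), \qquad P_{ij} = \frac{e^{o_{ij}}}{1 + e^{o_{ij}}}, \qquad o_{ik} = o_{ij} + o_{jk},$$

so the question being asked in the paper is how much freedom remains in choosing the $P_{ij}$ once such constraints are imposed.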
1
vote
0 answers
Solving Feature Distribution variance between Training and Prediction for Ranking models
I am building a linear regression model to improve the ranking of documents, and I'm trying to identify the problems that cause model performance estimates not to match the actual impact.
One major problem is that the feature distribution for training and…
DanMatlin
- 111
- 1
1
vote
0 answers
Listwise learning to rank with negative sample relevance
A typical listwise learning-to-rank (L2R) algorithm tries to learn the ranking of docs $\{x_i\}_{i=1}^m$ corresponding to a query $q$. If we use a correlation coefficient to label the relevance between docs and the query, then the label $y_i\in[0, 1]$. The…
JunjieChen
- 515
- 1
- 5
- 8