In the R-CNN paper, the authors define the target values for bounding-box regression as follows.
Given a (proposal box, ground-truth box) pair $(P, G)$, each of the form $(x, y, w, h)$, where $(x, y)$ is the center coordinate of the box and $w, h$ are its width and height:
$t_x = (G_x - P_x) / P_w \hspace{2.0cm} t_y = (G_y - P_y) / P_h$
$t_w = \log(G_w / P_w) \hspace{2.0cm} t_h = \log(G_h / P_h)$
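To make the transform concrete, here is a small sketch (my own helper names, not from the paper) that computes these targets and applies the inverse transform, so a proposal shifted by its own targets recovers the ground-truth box exactly:

```python
import numpy as np

def bbox_targets(P, G):
    """Regression targets (t_x, t_y, t_w, t_h) for a proposal P and
    ground-truth G, both given as (x_center, y_center, w, h)."""
    Px, Py, Pw, Ph = P
    Gx, Gy, Gw, Gh = G
    return np.array([(Gx - Px) / Pw,      # t_x: center shift, scale-normalized
                     (Gy - Py) / Ph,      # t_y
                     np.log(Gw / Pw),     # t_w: log scale ratio
                     np.log(Gh / Ph)])    # t_h

def apply_targets(P, t):
    """Inverse transform: move proposal P by (predicted) offsets t."""
    Px, Py, Pw, Ph = P
    tx, ty, tw, th = t
    return np.array([Px + Pw * tx,
                     Py + Ph * ty,
                     Pw * np.exp(tw),
                     Ph * np.exp(th)])
```

Note the design choice this makes visible: the center offsets are divided by the proposal's width/height (so the target is invariant to box scale), and the size targets are log-ratios (so a prediction of 0 means "keep the size" and the regressed width/height stay positive after `exp`).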
The goal is to find $\textbf{w}_*$, where $*$ is one of $x, y, w, h$, such that
$\textbf{w}_* = \arg \min_{\hat{\textbf{w}}_*} \sum_i (t^i_* - \hat{\textbf{w}}_*^T \phi(P^i))^2 + \lambda \|\hat{\textbf{w}}_*\|^2$, where $\phi(P^i)$ is the feature vector produced by the last pooling layer of the feature extractor when proposal $P^i$ is passed through it.
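For reference, this objective is plain ridge regression, so each $\textbf{w}_*$ has a closed-form solution. A minimal sketch with synthetic stand-ins for $\phi(P^i)$ and $t^i_*$ (the sizes and data here are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 200, 16                       # toy values: n proposals, d-dim features
Phi = rng.normal(size=(n, d))        # stand-in for the pooled features phi(P^i)
t = Phi @ rng.normal(size=d) + 0.01 * rng.normal(size=n)  # synthetic targets t_*

lam = 1.0                            # the ridge penalty lambda
# Closed-form ridge solution: w = (Phi^T Phi + lam I)^{-1} Phi^T t
w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(d), Phi.T @ t)
```

In R-CNN there are four such independent regressions (one per target), each a linear function of the same frozen features.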
I don't understand why they came up with this approach to bounding-box regression. Can anyone explain the intuition behind it?
P.S.: Since this regression approach is used not only in R-CNN but also in later models, I would really like a clear understanding of it.