Bipartite matching between predicted objects and ground truth objects → optimize object-specific losses
$$ \hat{\sigma}=\argmin_{\sigma\in\mathfrak{S}_N}\sum^N_i\mathcal{L}_{\text{match}}(y_i,\hat{y}_{\sigma(i)}) \tag{1} $$
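The search over permutations in Eq. (1) can be illustrated with a small brute-force sketch. The function name `best_assignment` and the cost values are hypothetical; the notes do not define $\mathcal{L}_\text{match}$, so the matrix entries here simply stand in for precomputed pairwise matching costs. Enumerating all $N!$ permutations is only feasible for tiny $N$; in practice the same optimum is usually found with the Hungarian algorithm (for example `scipy.optimize.linear_sum_assignment`).

```python
from itertools import permutations


def best_assignment(cost):
    """Return (sigma_hat, total) minimizing sum_i cost[i][sigma[i]].

    cost[i][j] stands in for L_match(y_i, y_hat_j); values are assumed
    to be precomputed (hypothetical numbers below).
    """
    n = len(cost)
    best_sigma, best_total = None, float("inf")
    # Enumerate every permutation sigma of {0, ..., n-1} (brute force).
    for sigma in permutations(range(n)):
        total = sum(cost[i][sigma[i]] for i in range(n))
        if total < best_total:
            best_sigma, best_total = sigma, total
    return best_sigma, best_total


# Hypothetical 3 x 3 cost matrix: rows = ground truth, columns = predictions.
cost = [
    [0.9, 0.1, 0.8],
    [0.2, 0.7, 0.6],
    [0.5, 0.4, 0.3],
]
sigma_hat, total = best_assignment(cost)
print(sigma_hat, total)
```

Here `sigma_hat[i]` is the index of the prediction assigned to ground truth object `i`.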
$$ \mathcal{L}_\text{Hungarian}(y,\hat{y})=\sum^N_{i=1}\left[-\log\hat{p}_{\hat{\sigma}(i)}(c_i)+\mathbb{1}_{\{c_i\neq\varnothing\}}\mathcal{L}_\text{box}(b_i,\hat{b}_{\hat{\sigma}(i)})\right] \tag{2} $$
The box loss $\mathcal{L}_\text{box}$ is a linear combination of the $\ell_1$ loss and the generalized IoU loss $\mathcal{L}_\text{iou}(\cdot,\cdot)$, which is scale-invariant
$$ \mathcal{L}_\text{box}(b_i,\hat{b}_{\sigma(i)})=\lambda_\text{iou}\mathcal{L}_\text{iou}(b_i,\hat{b}_{\sigma(i)})+\lambda_{\text{L}1}||b_i-\hat{b}_{\sigma(i)}||_1 \tag{3} $$
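A minimal sketch of Eq. (3) for a single matched pair, under several assumptions that are not stated in these notes: boxes are given as `(x1, y1, x2, y2)` corners with positive area, $\mathcal{L}_\text{iou}$ is taken to be `1 - GIoU`, and the weights `lambda_iou` and `lambda_l1` are placeholder values. The helper names `giou` and `box_loss` are hypothetical.

```python
def giou(a, b):
    """Generalized IoU of two boxes in (x1, y1, x2, y2) format (assumed)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    # Intersection rectangle (zero if the boxes do not overlap).
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = area_a + area_b - inter
    # Smallest axis-aligned box enclosing both inputs.
    hull = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    return inter / union - (hull - union) / hull


def box_loss(b, b_hat, lambda_iou=2.0, lambda_l1=5.0):
    """lambda_iou * (1 - GIoU) + lambda_l1 * ||b - b_hat||_1 (weights assumed)."""
    l_iou = 1.0 - giou(b, b_hat)
    l1 = sum(abs(x - y) for x, y in zip(b, b_hat))
    return lambda_iou * l_iou + lambda_l1 * l1


b = (0.0, 0.0, 2.0, 2.0)
b_hat = (1.0, 1.0, 3.0, 3.0)
print(giou(b, b_hat), box_loss(b, b_hat))
```

Scaling both boxes by the same factor leaves `giou` unchanged while the $\ell_1$ term grows with the scale, which is why the two terms are combined.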