1. Introduction

2. Related work

3. The DETR model

3.1 Object detection set prediction loss

Matching loss

Bounding box loss

3.2 DETR architecture

Backbone

Transformer encoder