paper: https://arxiv.org/abs/2004.13637

Blender: a fusion of much of the work published by Facebook AI Research (FAIR) over the past two years

Introduction

1. Model

1-1. Retriever

https://s3-us-west-2.amazonaws.com/secure.notion-static.com/b3ab12bf-8e6e-471f-aaaf-3a9940a02312/Untitled.png

We employ the poly-encoder architecture of Humeau et al. (2019). Poly-encoders encode global features of the context using multiple representations (n codes, where n is a hyperparameter), which are attended to by each possible candidate response.
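To make the "n codes attended to by each candidate" idea concrete, here is a minimal sketch in plain Python. It assumes dot-product attention throughout, and it uses random vectors as stand-ins for the transformer outputs, the learned codes, and the candidate embeddings. The names `attend`, `poly_encoder_score`, and `rand_vec` are made up for illustration and do not come from the paper's code.

```python
import math
import random

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def attend(query, keys):
    # Dot-product attention: a weighted sum of `keys`,
    # with weights given by softmax(query . key).
    weights = softmax([dot(query, k) for k in keys])
    dim = len(keys[0])
    return [sum(w * k[d] for w, k in zip(weights, keys)) for d in range(dim)]

def poly_encoder_score(ctxt_outputs, codes, y_cand):
    # Step 1: each of the n codes attends over the context token outputs,
    # giving n global context features.
    global_feats = [attend(code, ctxt_outputs) for code in codes]
    # Step 2: the candidate attends over those n features,
    # giving one candidate-specific context vector.
    y_ctxt = attend(y_cand, global_feats)
    # Step 3: the final score is a dot product.
    return dot(y_ctxt, y_cand)

rng = random.Random(0)

def rand_vec(dim=4):
    # Hypothetical stand-in for a real embedding.
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]

ctxt_outputs = [rand_vec() for _ in range(5)]  # toy transformer outputs, 5 tokens
codes = [rand_vec() for _ in range(3)]         # n = 3 codes (learned in practice)
candidates = [rand_vec() for _ in range(2)]    # toy candidate embeddings
scores = [poly_encoder_score(ctxt_outputs, codes, c) for c in candidates]
```

Only step 2 depends on the candidate, so the candidate embeddings can be computed ahead of time and the context is encoded once per query.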

Idea

Bi-Encoder

Self-attention is applied separately to the input and to each candidate, and similarity is learned as the dot product of the two top-layer outputs.

$y_{ctxt} = red(T_1(ctxt))$

$y_{cand} = red(T_2(cand))$
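Here $T_1$ and $T_2$ are the two separate transformers, and $red(\cdot)$ reduces a sequence of output vectors to a single vector. The sketch below illustrates the scoring step under loud assumptions: `toy_encoder` is a hypothetical, non-contextual stand-in for a transformer, and `red` is assumed to be mean pooling (taking the first token's output is another common choice).

```python
import random

def toy_encoder(tokens, dim=4, seed=0):
    # Hypothetical stand-in for a transformer T: one vector per token.
    # A real encoder would be contextual; this one is not.
    vectors = []
    for tok in tokens:
        rng = random.Random(f"{seed}:{tok}")
        vectors.append([rng.uniform(-1.0, 1.0) for _ in range(dim)])
    return vectors

def red(vectors):
    # Assumed reduction: average the sequence of vectors into one vector.
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def bi_encoder_scores(context, candidates):
    # y_ctxt = red(T_1(ctxt)); computed once per context.
    y_ctxt = red(toy_encoder(context, seed=1))
    # y_cand = red(T_2(cand)); independent of the context, so cacheable.
    y_cands = [red(toy_encoder(c, seed=2)) for c in candidates]
    return [dot(y_ctxt, y_cand) for y_cand in y_cands]

context = ["how", "are", "you"]
candidates = [["fine", "thanks"], ["the", "sky", "is", "blue"]]
scores = bi_encoder_scores(context, candidates)
# Index of the highest-scoring candidate.
best = max(range(len(candidates)), key=scores.__getitem__)
```

Because the context and the candidates never interact before the dot product, candidate vectors can be precomputed, which keeps retrieval over a large candidate set fast.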