DCWE - Dynamic Contextualized Word Embeddings

<aside> 💡 This post summarizes **DCWE**, which I studied for a project on analyzing social phenomena. It reorganizes the content of the paper 'Dynamic Contextualized Word Embeddings' (DCWE), together with reference material.

</aside>

1. Introduction

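Traditional word embeddings represent each word with a single vector (static word embeddings).
This approach to modeling lexical semantics cannot capture how meaning varies with linguistic context or with extralinguistic context (temporal and social change).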

C1 : Ignores the variability of word meaning in different linguistic contexts

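Addressed by representing words with vectors that vary with linguistic context (contextualized word embeddings).
This makes it possible to capture the multiple meanings of a word, as with homonyms.
Contextualized word embeddings are widely used as the backbone for composing meaning in pretrained language models (PLMs) such as ELMo, BERT, GPT, and XLNet.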
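The difference between static and contextualized embeddings can be illustrated with a toy sketch. The vectors and the `contextual_embed` function below are hypothetical: averaging a word with its neighbors only stands in for a real contextualizer such as BERT and is not how those models compute representations.

```python
# Toy static embedding table (hypothetical 2-dimensional values).
STATIC = {
    "river": [1.0, 0.0],
    "money": [0.0, 1.0],
    "bank": [0.5, 0.5],
}


def static_embed(word):
    """Static embedding: one fixed vector per word type, whatever the context."""
    return STATIC[word]


def contextual_embed(sentence, k):
    """Toy contextualizer: mix the k-th word's vector with the mean of its neighbors."""
    target = STATIC[sentence[k]]
    neighbors = [STATIC[w] for i, w in enumerate(sentence) if i != k]
    if not neighbors:
        return list(target)
    dim = len(target)
    mean = [sum(v[j] for v in neighbors) / len(neighbors) for j in range(dim)]
    return [0.5 * target[j] + 0.5 * mean[j] for j in range(dim)]


s1 = ["river", "bank"]
s2 = ["money", "bank"]

# The static vector for "bank" is identical in both sentences.
print(static_embed("bank"))
# The contextualized vectors differ because the neighbors differ.
print(contextual_embed(s1, 1))
print(contextual_embed(s2, 1))
```

In this toy setup "bank" gets a different vector next to "river" than next to "money", which is the property C1 asks for.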

C2 : Ignores the variability of word meaning in extralinguistic contexts

Addressed by representing words with vectors that vary with extralinguistic context, i.e. time and social space (dynamic word embeddings).

2. Model

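The paper proposes Dynamic Contextualized Word Embeddings, a method that combines the strengths of contextualized word embeddings with the flexibility of dynamic word embeddings.
Unlike previous dynamic word embeddings, it represents time and social space jointly.
Words are first mapped by $d$ to dynamic type-level representations and then by $c$ to contextualized token-level representations.
This ordering follows the view in cognitive science and linguistics that extralinguistic information is processed before linguistic information.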

$$ h^{(k)}=c\,(e^{(k)},\;E^{(<k)},\,E^{(>k)}) $$

$h^{(k)}$ : token-level representation
sequence of words ($X$) : $[x^{(1)},\,\cdots,\,x^{(K)}]$
non-contextualized embeddings ($E$) : $[e^{(1)},\,\cdots,\,e^{(K)}]$
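The two-stage mapping above ($d$ first, then $c$) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: here $d$ is assumed to add an offset that depends on social and temporal context to a base embedding, $c$ is a toy neighbor-averaging function standing in for a real contextualizer, and the names `BASE`, `OFFSETS`, `dcwe`, and `community_a` are hypothetical.

```python
# Hypothetical base (static) embeddings, 2 dimensions.
BASE = {
    "the": [0.0, 0.0],
    "virus": [1.0, 0.0],
    "spreads": [0.0, 1.0],
}

# Hypothetical offsets keyed by (word, social_context, time_step).
OFFSETS = {("virus", "community_a", 2020): [0.0, 0.5]}


def d(word, social, time):
    """Dynamic type-level embedding e^(k): base vector plus a context-dependent offset (assumed form)."""
    base = BASE[word]
    offset = OFFSETS.get((word, social, time), [0.0] * len(base))
    return [b + o for b, o in zip(base, offset)]


def c(e_k, E_left, E_right):
    """Toy contextualizer: h^(k) = c(e^(k), E^(<k), E^(>k)) via neighbor averaging."""
    context = E_left + E_right
    if not context:
        return list(e_k)
    dim = len(e_k)
    mean = [sum(v[i] for v in context) / len(context) for i in range(dim)]
    return [0.5 * e_k[i] + 0.5 * mean[i] for i in range(dim)]


def dcwe(X, social, time):
    """Apply d to every word first, then c to every position."""
    E = [d(x, social, time) for x in X]
    return [c(E[k], E[:k], E[k + 1:]) for k in range(len(E))]


X = ["the", "virus", "spreads"]
H_2019 = dcwe(X, "community_a", 2019)
H_2020 = dcwe(X, "community_a", 2020)

# The token-level vector for "virus" changes with the time step.
print(H_2019[1])
print(H_2020[1])
```

Because $d$ runs before $c$, the change in the type-level vector for "virus" also shifts the token-level vectors of its neighbors, which reflects the ordering of extralinguistic before linguistic processing.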