Semi-supervised Classification with Graph Convolutional Networks :: GCN

<aside> 📌 This post summarizes what I studied about GCN for a project. It is based on the paper 'Semi-supervised Classification with Graph Convolutional Networks', which first introduced GCN, and reorganizes its content together with reference material.

</aside>

0. Preliminaries

Convolution
Convolution on Graph
Convolution theorem (the theorem relating the Fourier transform and convolution)
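As a small illustration of the convolution theorem listed above, the sketch below checks numerically that a circular convolution equals an element-wise product in the Fourier domain. The helper names `circular_convolve` and `fft_convolve` are made up for this example, and the signals are random placeholders.

```python
import numpy as np

def circular_convolve(x, h):
    """Direct circular convolution: y[k] = sum_m x[m] * h[(k - m) mod N]."""
    n = len(x)
    return np.array([sum(x[m] * h[(k - m) % n] for m in range(n)) for k in range(n)])

def fft_convolve(x, h):
    """Same operation via the convolution theorem: multiply the spectra, then invert."""
    return np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(h)))

rng = np.random.default_rng(0)
x = rng.normal(size=8)   # placeholder signal
h = rng.normal(size=8)   # placeholder filter

direct = circular_convolve(x, h)
spectral = fft_convolve(x, h)
print(np.allclose(direct, spectral))
```

Spectral graph convolution follows the same idea, with the eigenvectors of the graph Laplacian playing the role of the Fourier basis.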

1. Introduction

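A GCN (Graph Convolutional Network) solves the node classification (semi-supervised classification) task: given a graph and labels for only a few of its nodes, it predicts the labels of the remaining nodes.

(The input is the feature matrix of all nodes together with the adjacency matrix. The training loss is computed only on the labeled nodes, and the test loss on nodes whose labels were withheld during training.)
Why the convolution structure is adopted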
Semi-supervised classification
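The input setup described above can be sketched as follows. Everything here is a hypothetical toy example (the edge list, the one-hot features, and the choice of labeled nodes are assumptions made for illustration, not data from the paper): the model sees the full feature matrix and adjacency matrix, while a boolean mask decides which nodes contribute to the training loss.

```python
import numpy as np

# Hypothetical toy graph: 6 nodes, two triangles joined by the edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
num_nodes = 6

# Symmetric adjacency matrix A of the undirected graph.
A = np.zeros((num_nodes, num_nodes))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

# Node feature matrix X: one-hot placeholder features, one row per node.
X = np.eye(num_nodes)

# Every node has a true label, but only nodes 0 and 5 are labeled for training.
labels = np.array([0, 0, 0, 1, 1, 1])
train_mask = np.zeros(num_nodes, dtype=bool)
train_mask[[0, 5]] = True
test_mask = ~train_mask   # the remaining nodes are used for evaluation only

print(A.shape, X.shape, int(train_mask.sum()), int(test_mask.sum()))
```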

2. Fast Approximate Convolutions on Graphs

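The paper proposes a GCN, $f(X,A)$, that takes the node feature matrix $X$ and the adjacency matrix $A$ of a graph as input.
The following graph convolution operation is applied several times, once per layer:
- $H^{(l+1)} = \sigma\left(\tilde D^{-\frac{1}{2}}\tilde A \tilde D^{-\frac{1}{2}}H^{(l)}W^{(l)}\right)$
- The paper derives that this rule is a first-order approximation of localized spectral filters on graphs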
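The derivation behind this rule can be outlined as follows. This is a condensed sketch of the argument in Section 2 of the paper; the symbols $L$, $U$, $\Lambda$, $\theta$, $T_k$ and $\lambda_{max}$ do not appear elsewhere in this post and follow the paper's notation.

Spectral convolution of a signal $x$ with a filter $g_\theta$, using the eigendecomposition of the normalized graph Laplacian:

$$ g_\theta \star x = U g_\theta(\Lambda) U^\top x, \qquad L = I_N - D^{-\frac{1}{2}} A D^{-\frac{1}{2}} = U \Lambda U^\top $$

Approximating the filter with Chebyshev polynomials $T_k$ up to order $K$ avoids the eigendecomposition and makes the filter $K$-localized:

$$ g_{\theta'} \star x \approx \sum_{k=0}^{K} \theta'_k T_k(\tilde L)\, x, \qquad \tilde L = \frac{2}{\lambda_{max}} L - I_N $$

Setting $K = 1$ and approximating $\lambda_{max} \approx 2$ gives a filter that is linear in $L$:

$$ g_{\theta'} \star x \approx \theta'_0 x + \theta'_1 (L - I_N) x = \theta'_0 x - \theta'_1 D^{-\frac{1}{2}} A D^{-\frac{1}{2}} x $$

Tying the two parameters with $\theta = \theta'_0 = -\theta'_1$:

$$ g_\theta \star x \approx \theta \left( I_N + D^{-\frac{1}{2}} A D^{-\frac{1}{2}} \right) x $$

Because $I_N + D^{-\frac{1}{2}} A D^{-\frac{1}{2}}$ has eigenvalues in $[0, 2]$, stacking this operator can become numerically unstable. The renormalization trick replaces it with $\tilde D^{-\frac{1}{2}} \tilde A \tilde D^{-\frac{1}{2}}$, where $\tilde A = A + I_N$ and $\tilde D_{ii} = \sum_j \tilde A_{ij}$. Generalizing to a multi-channel input $X$ and a filter matrix $\Theta$:

$$ Z = \tilde D^{-\frac{1}{2}} \tilde A \tilde D^{-\frac{1}{2}} X \Theta $$

which matches the layer-wise rule with $H^{(l)}$ in place of $X$ and $W^{(l)}$ in place of $\Theta$.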

3. Semi-Supervised Node Classification

Taking the semi-supervised node classification task with a two-layer GCN as an example:

$$ Z = f(X,A) = \mathrm{softmax}\left(\hat{A}\,\mathrm{ReLU}\left(\hat{A}XW^{(0)}\right)W^{(1)}\right) $$

$$ \mathcal{L} = -\sum_{l\in\mathcal{V}_L}\sum_{f=1}^{F}Y_{lf}\ln Z_{lf} $$

$\hat{A} = \tilde{D}^{-\frac{1}{2}}\tilde{A}\tilde{D}^{-\frac{1}{2}}$
$W^{(0)}\in \mathbb{R}^{C \times H}$ : input-to-hidden weight matrix for a hidden layer with H feature maps
$W^{(1)}\in \mathbb{R}^{H \times F}$ : hidden-to-output weight matrix
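The forward pass and loss defined above can be written down almost literally in NumPy. This is a minimal sketch under assumptions: the function names, the 4-node path graph, and the random features and weights are placeholders chosen for illustration, with $C = 3$, $H = 5$, $F = 2$.

```python
import numpy as np

def normalize_adjacency(A):
    """A_hat = D~^{-1/2} (A + I) D~^{-1/2}, the renormalized adjacency."""
    A_tilde = A + np.eye(A.shape[0])                 # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))  # degrees are >= 1 thanks to self-loops
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def softmax(S):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    E = np.exp(S - S.max(axis=1, keepdims=True))
    return E / E.sum(axis=1, keepdims=True)

def gcn_forward(A_hat, X, W0, W1):
    """Z = softmax(A_hat ReLU(A_hat X W0) W1)."""
    H = np.maximum(A_hat @ X @ W0, 0.0)
    return softmax(A_hat @ H @ W1)

def masked_cross_entropy(Z, Y, mask):
    """Cross-entropy summed over labeled nodes only (mask marks the labeled rows)."""
    return -np.sum(Y[mask] * np.log(Z[mask]))

# Placeholder data: a path graph 0-1-2-3 with random features and weights.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))            # N x C
W0 = rng.normal(size=(3, 5))           # C x H
W1 = rng.normal(size=(5, 2))           # H x F
Y = np.eye(2)[[0, 1, 1, 0]]            # one-hot labels, N x F
mask = np.array([True, False, False, True])  # only nodes 0 and 3 are labeled

A_hat = normalize_adjacency(A)
Z = gcn_forward(A_hat, X, W0, W1)
loss = masked_cross_entropy(Z, Y, mask)
print(Z.shape, float(loss))
```

Since $\hat{A}$ depends only on the graph, it can be computed once and reused in every layer and every training step.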

→ using the full dataset for every training iteration (full-batch)

(Using mini-batches is left as future work.)

( ** The paper 'Graph Convolutional Networks for Hyperspectral Image Classification' proposes miniGCN, a mini-batch GCN )
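To make the full-batch setting concrete, here is a rough training sketch in which every step runs the whole graph through the two-layer model. It rests on several assumptions: the toy graph, the hidden size of 8, the learning rate, and the step count are arbitrary, and plain gradient descent with hand-written gradients is swapped in for simplicity, so it should not be read as the paper's training setup.

```python
import numpy as np

def normalize_adjacency(A):
    """A_hat = D~^{-1/2} (A + I) D~^{-1/2}."""
    A_tilde = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def softmax(S):
    E = np.exp(S - S.max(axis=1, keepdims=True))
    return E / E.sum(axis=1, keepdims=True)

# Hypothetical toy graph: two triangles joined by the edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0
A_hat = normalize_adjacency(A)

X = np.eye(6)                                   # placeholder one-hot features
Y = np.eye(2)[[0, 0, 0, 1, 1, 1]]               # one-hot labels
mask = np.array([True, False, False, False, False, True])  # labeled nodes

rng = np.random.default_rng(0)
W0 = rng.normal(scale=0.5, size=(6, 8))         # input-to-hidden
W1 = rng.normal(scale=0.5, size=(8, 2))         # hidden-to-output
lr = 0.1                                        # arbitrary step size
AX = A_hat @ X                                  # constant, so computed once
losses = []

for step in range(200):
    # Forward pass over the full graph (full-batch).
    pre = AX @ W0
    H = np.maximum(pre, 0.0)
    AH = A_hat @ H
    Z = softmax(AH @ W1)
    losses.append(-np.sum(Y[mask] * np.log(Z[mask])))

    # Backward pass: only labeled rows contribute to the loss gradient.
    dS = np.where(mask[:, None], Z - Y, 0.0)    # gradient w.r.t. output logits
    dW1 = AH.T @ dS
    dH = A_hat.T @ dS @ W1.T
    dW0 = AX.T @ (dH * (pre > 0))               # ReLU passes gradient where pre > 0

    W0 -= lr * dW0
    W1 -= lr * dW1

print(losses[0], losses[-1])
print(Z.argmax(axis=1))   # predicted class for every node, labeled or not
```

Because each step multiplies by the full $\hat{A}$, memory grows with the size of the whole graph, which is the motivation for mini-batch variants.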

4. Experiments