Unsupervised Dimensionality Reduction via Principal Component Analysis

The Main Steps of Principal Component Analysis

https://drive.google.com/uc?id=1dzul1QtBNylA7Nt7FBZLBpTGWfDo4l9f

$x = [x_{1}, x_{2}, \dots, x_{d}], \quad x \in \mathbb{R}^{d}$

$\downarrow xW, \quad W \in \mathbb{R}^{d \times k}$

$z = [z_{1}, z_{2}, \dots, z_{k}], \quad z \in \mathbb{R}^{k}$
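
The mapping from $x$ to $z$ can be sketched in NumPy as below. This is only an illustration of the shapes involved: the dimensions `d = 13` and `k = 2` are assumptions, and `W` is a random matrix with orthonormal columns standing in for the real projection matrix, which PCA would build from the top $k$ eigenvectors of the covariance matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 13, 2  # assumed dimensions, chosen only for illustration

x = rng.normal(size=d)  # one sample as a row vector in R^d

# Placeholder projection matrix (d x k) with orthonormal columns.
# In PCA these columns would be the top-k eigenvectors of the covariance matrix.
W, _ = np.linalg.qr(rng.normal(size=(d, k)))

z = x @ W  # projected sample in R^k
print(x.shape, W.shape, z.shape)
```

Writing the sample as a row vector keeps the product in the same order as the formula, $xW$, so a whole data matrix of shape $n \times d$ can be projected with the same expression.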

Extracting the Principal Components

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

df_wine = pd.read_csv('https://archive.ics.uci.edu/ml/machine-learning-databases/wine/wine.data', header=None)

X, y = df_wine.iloc[:, 1:].values, df_wine.iloc[:,0].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)

sc = StandardScaler()
X_train_std = sc.fit_transform(X_train)
X_test_std = sc.transform(X_test)
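
What the standardization step does can be checked on synthetic data. The sketch below does not use the Wine dataset; the `*_demo` arrays are hypothetical stand-ins, so it runs without a download. It shows that `fit_transform` learns the mean and standard deviation from the training split, while `transform` reuses those same statistics on the test split.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Synthetic stand-ins for X_train / X_test (not the Wine data)
X_train_demo = rng.normal(loc=5.0, scale=3.0, size=(100, 4))
X_test_demo = rng.normal(loc=5.0, scale=3.0, size=(40, 4))

sc_demo = StandardScaler()
X_train_demo_std = sc_demo.fit_transform(X_train_demo)  # learns mean and scale from the training data
X_test_demo_std = sc_demo.transform(X_test_demo)        # reuses the training statistics

# Manual equivalent: (x - mean) / std, using the population std (ddof=0)
manual = (X_test_demo - X_train_demo.mean(axis=0)) / X_train_demo.std(axis=0)
print(np.allclose(manual, X_test_demo_std))
```

Fitting the scaler on the training split only keeps information from the test split out of the preprocessing, which is why the test data is passed to `transform` rather than `fit_transform`.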

$\sigma_{jk} = \frac{1}{n} \sum_{i = 1}^{n} (x_{j}^{(i)} - \mu_{j})(x_{k}^{(i)} - \mu_{k})$
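
The covariance formula can be evaluated for all feature pairs at once with a matrix product. The following is a minimal sketch on synthetic data (`X_demo` is a hypothetical array, not the Wine features). Note that `np.cov` normalizes by $n - 1$ by default, so `bias=True` is passed here to match the $1/n$ factor in the formula.

```python
import numpy as np

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 3))  # synthetic data: n = 50 samples, 3 features
n = X_demo.shape[0]

mu = X_demo.mean(axis=0)  # per-feature means mu_j
centered = X_demo - mu
cov_manual = centered.T @ centered / n  # entry (j, k) is sigma_jk with 1/n normalization

# rowvar=False treats columns as features; bias=True switches np.cov from 1/(n-1) to 1/n
cov_numpy = np.cov(X_demo, rowvar=False, bias=True)
print(np.allclose(cov_manual, cov_numpy))
```

The resulting matrix is symmetric, with the per-feature variances on its diagonal.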