Grouping similar objects using the k-means algorithm

K-means clustering using scikit-learn

from sklearn.datasets import make_blobs
import matplotlib.pyplot as plt

# Generate 150 two-dimensional samples drawn around 3 cluster centers
X, y = make_blobs(n_samples=150, n_features=2, centers=3, cluster_std=0.5, shuffle=True, random_state=0)

# Plot the samples without their labels
plt.scatter(X[:, 0], X[:, 1], c='white', marker='o', edgecolor='black', s=50)
plt.grid()
plt.tight_layout()
plt.show()

https://drive.google.com/uc?id=1pBjbxE_DqmZvdSF3h563BvFcWLQUT9-b

$d(x, y)^{2} = \sum_{j=1}^{m} (x_{j} - y_{j})^{2} = \| x- y \|_{2}^{2}$
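The squared Euclidean distance above can be checked numerically. The following is a minimal sketch using NumPy; the two points `x` and `y` are hypothetical values chosen only for illustration and do not come from the dataset.

```python
import numpy as np

# Two hypothetical 3-dimensional points (illustrative values only)
x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 6.0, 3.0])

# Left-hand form: sum of squared coordinate differences
d2_sum = np.sum((x - y) ** 2)

# Right-hand form: squared L2 norm of the difference vector
d2_norm = np.linalg.norm(x - y) ** 2

print(d2_sum)                        # 25.0
print(np.isclose(d2_sum, d2_norm))   # True
```

Both forms give the same value, which is why the two sides of the equation are interchangeable.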

$SSE = \sum_{i=1}^{n} \sum_{j=1}^{k} w^{(i, j)} \| x^{(i)} - \mu^{(j)} \|_{2}^{2}$
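To make the SSE formula concrete, here is a small sketch that evaluates it directly. It assumes the usual reading of the notation: $w^{(i, j)} = 1$ if sample $x^{(i)}$ is assigned to cluster $j$ and $0$ otherwise, and $\mu^{(j)}$ is the centroid of cluster $j$. The toy data and the names `X_toy`, `labels`, and `centroids` are hypothetical.

```python
import numpy as np

# Hypothetical toy data: 4 samples, 2 clusters (illustrative values only)
X_toy = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 4.0]])
labels = np.array([0, 0, 1, 1])
k = 2

# Centroid mu^(j): mean of the samples assigned to cluster j
centroids = np.array([X_toy[labels == j].mean(axis=0) for j in range(k)])

# Indicator matrix: w[i, j] = 1 if sample i belongs to cluster j, else 0
w = np.eye(k)[labels]

# Squared distance between every sample and every centroid, shape (n, k)
sq_dists = ((X_toy[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)

# Double sum over samples and clusters, weighted by the indicator
sse = np.sum(w * sq_dists)
print(sse)  # 10.0
```

Because `w` is zero everywhere except at each sample's own cluster, only the distance to the assigned centroid contributes to the sum.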

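from sklearn.cluster import KMeans

# Run k-means 10 times from different random initial centroids and keep the run with the lowest SSE
km = KMeans(n_clusters=3, init='random', n_init=10, max_iter=300, tol=1e-04, random_state=0)

# Fit the model and return the cluster label assigned to each sample
y_km = km.fit_predict(X)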
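As a sanity check, the fitted model can be compared against the SSE definition. The sketch below repeats the data generation and fitting so that it runs on its own, then recomputes the within-cluster SSE by hand and compares it with the `inertia_` attribute that scikit-learn stores on the fitted estimator. The variable name `sse` is introduced here for illustration.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans

# Same data and model settings as in the cells above
X, y = make_blobs(n_samples=150, n_features=2, centers=3,
                  cluster_std=0.5, shuffle=True, random_state=0)
km = KMeans(n_clusters=3, init='random', n_init=10,
            max_iter=300, tol=1e-04, random_state=0)
y_km = km.fit_predict(X)

# Recompute the within-cluster SSE from the labels and centroids
sse = sum(
    np.sum((X[y_km == j] - km.cluster_centers_[j]) ** 2)
    for j in range(km.n_clusters)
)

print(km.cluster_centers_.shape)     # (3, 2): one 2-D centroid per cluster
print(np.isclose(sse, km.inertia_))  # True
```

The manual sum and `km.inertia_` agree up to floating-point error, since both measure the total squared distance of the samples to their assigned centroids.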