labelled data = training data
In classification, accuracy is a commonly used metric
Accuracy = correct predictions / total observations
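Quick sketch of the accuracy formula in plain Python. The `accuracy` helper and the label lists are made up for illustration, they are not from scikit-learn or from the notes' dataset.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("need at least one observation")
    # Count positions where the prediction equals the true label
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Made-up labels, for illustration only
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 0]
print(accuracy(y_true, y_pred))  # 4 correct out of 5 -> 0.8
```

For a classifier, scikit-learn's `score` method reports this same quantity (mean accuracy) on the data passed to it.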
Splitting data -
Train/test Split -
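from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Hold out 30% for testing; stratify=y keeps class proportions the same in both splits
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                                                    random_state=21, stratify=y)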
knn = KNeighborsClassifier(n_neighbors = 6)
knn.fit(X_train, y_train)
print(knn.score(X_test,y_test))
train_accuracies = {}
test_accuracies = {}
import numpy as np
neighbors = np.arange(1, 26)
for neighbor in neighbors:
    # Refit with each value of k and record accuracy on both splits
    knn = KNeighborsClassifier(n_neighbors=neighbor)
    knn.fit(X_train, y_train)
    train_accuracies[neighbor] = knn.score(X_train, y_train)
    test_accuracies[neighbor] = knn.score(X_test, y_test)
# Plot results
import matplotlib.pyplot as plt
plt.figure(figsize=(8, 6))
plt.title("KNN: Varying Number of Neighbors")
plt.plot(neighbors, list(train_accuracies.values()), label="Training Accuracy")
plt.plot(neighbors, list(test_accuracies.values()), label="Testing Accuracy")
plt.legend()
plt.xlabel("Number of Neighbors")
plt.ylabel("Accuracy")
plt.show()
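Reading the plot: one possible way to pick k is to take the value with the highest test accuracy. Minimal sketch below. The dictionary values are hypothetical placeholders (not real results), and breaking ties by the smallest k is just an arbitrary choice made here.

```python
# Hypothetical accuracies keyed by n_neighbors (illustrative values, not real results)
test_accuracies = {1: 0.79, 5: 0.86, 9: 0.87, 13: 0.87, 17: 0.85}

# Highest test accuracy seen across all k
best_score = max(test_accuracies.values())

# Among the k values that reach it, take the smallest (arbitrary tie-break)
best_k = min(k for k, score in test_accuracies.items() if score == best_score)

print(best_k, best_score)  # 9 0.87
```

In the loop above, the same two lines would work directly on the real `test_accuracies` dictionary.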