Started out with an amazing TensorFlow course, course-Github. Before the course I wanted to get into Deep Learning and went through an existential crisis choosing a framework to practice with. Following the usual procedure of googling "TF vs PyTorch" and reading a bunch of Medium articles and answers on Quora and Stack Overflow, I was still not convinced enough to let go of either. Then I read this on one of the Kaggle competitions:
People often ask a question - Keras/Tensorflow vs Pytorch?
Answer is simple - both. Many recent publications use both Keras and Pytorch. If you want to be flexible and understand how solutions work you should know both. This is why I encourage you to start today ... and implement your first NN in Pytorch.
And I decided: why not both? So here I go, coding all the concepts from Linear Regression to Time Series in both TF and PyTorch. Link to the Pytorch implementation.
Links to all resources at the end.
Key points:
Creating the dataset:
Used scikit-learn's "make_circles" function to create a dataset of two concentric circles belonging to two separate classes. The data is not linearly separable, so the neural network must learn a non-linear decision boundary to classify it correctly.
from sklearn.datasets import make_circles

# create 1,000 points forming two concentric circles (classes 0 and 1)
n_samples = 1000
X, y = make_circles(n_samples, noise=0.03, random_state=42)
import matplotlib.pyplot as plt

# visualize the two circles
plt.scatter(X[:, 0], X[:, 1], c=y);
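As a quick sanity check (a small addition of my own, not part of the course code), you can confirm the shapes and labels before building any model:
print(X.shape, y.shape)  # expect (1000, 2) and (1000,)
print(y[:10])            # binary labels: 0s and 1s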
TensorFlow Implementation:
Started out with a basic three-layer model, using binary cross-entropy as the loss function (the standard choice for binary classification problems). As we can see below, the model didn't perform very well.
import tensorflow as tf

# set seed for reproducibility
tf.random.set_seed(42)

# create model (no activations on the hidden layers,
# so it can only learn a linear decision boundary)
model = tf.keras.Sequential([
    tf.keras.layers.Dense(100),
    tf.keras.layers.Dense(10),
    tf.keras.layers.Dense(1, activation="sigmoid")  # one unit for binary classification
])

# compile
model.compile(loss=tf.keras.losses.BinaryCrossentropy(),
              optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              metrics=["accuracy"])

# fit the model
history = model.fit(X, y, epochs=100, verbose=0)
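To put a number on "didn't perform very well", evaluate on the data (a quick check of my own; since the model is effectively linear, accuracy should land near 50%, i.e. chance level on this dataset):
# evaluate the linear model; expect accuracy around chance level (~50%)
loss, accuracy = model.evaluate(X, y)
print(f"Loss: {loss:.3f}, Accuracy: {accuracy:.2%}")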
So let's check the decision boundary the model predicted for the two classes. We'll see that the learned boundary is linear, while the data is not.
import numpy as np

def plot_decision_boundary(model, X, y):
    # define the grid bounds with a small margin around the data
    x_min, x_max = X[:, 0].min() - 0.1, X[:, 0].max() + 0.1
    y_min, y_max = X[:, 1].min() - 0.1, X[:, 1].max() + 0.1
    xx, yy = np.meshgrid(np.linspace(x_min, x_max, 100),
                         np.linspace(y_min, y_max, 100))
    # create X values to make predictions on (one point per grid cell)
    x_in = np.c_[xx.ravel(), yy.ravel()]
    # make predictions
    y_pred = model.predict(x_in)
    if len(y_pred[0]) > 1:
        print("doing multiclass classification")
        y_pred = np.argmax(y_pred, axis=1).reshape(xx.shape)
    else:
        print("doing binary classification")
        y_pred = np.round(y_pred).reshape(xx.shape)
    # plot the decision boundary and overlay the data points
    plt.contourf(xx, yy, y_pred, alpha=0.6)
    plt.scatter(X[:, 0], X[:, 1], c=y, s=40, cmap=plt.cm.RdYlBu)
    plt.xlim(xx.min(), xx.max())
    plt.ylim(yy.min(), yy.max())
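With the helper defined, plotting the first model's predictions makes the problem visible (a small usage sketch of my own):
# the linear model draws a straight line through circular data
plot_decision_boundary(model, X, y)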
Just one change to the previous model and it starts recognizing the non-linearity in the data: adding non-linear (ReLU) activations to the hidden layers. That's it, and we can see the results.
# Multilayer NN with non-linear (ReLU) hidden activations and a sigmoid output
# set seed
tf.random.set_seed(42)

# model
model_4 = tf.keras.Sequential([
    tf.keras.layers.Dense(4, activation="relu"),
    tf.keras.layers.Dense(4, activation="relu"),
    tf.keras.layers.Dense(1, activation=tf.keras.activations.sigmoid)
])

# compile
model_4.compile(loss="binary_crossentropy",
                optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
                metrics=["accuracy"])

# fit model
model_4.fit(X, y, epochs=250, verbose=0)
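Re-running the evaluation and the plotting helper (my own addition, reusing the helper defined above) should now show accuracy well above chance and a roughly circular decision boundary:
# the non-linear model should now separate the two circles cleanly
model_4.evaluate(X, y)
plot_decision_boundary(model_4, X, y)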