In this blog we will compare a Neural Network (NN) and a Convolutional Neural Network (CNN) on the Fashion MNIST dataset. We will compare the two networks on model convergence, model accuracy after a fixed number of epochs, and inference on translated images, and we will also check how well the CNN handles translated inputs (its translation invariance property).
You can read more about classification network training, transfer learning and inference in my previous blogs Image Classification in Keras and VGG-16 Inference with different image dimension.
Let's get started with this blog.
Dataset
The Fashion MNIST dataset has 60,000 training images and 10,000 test images. Each image is a 28x28 grayscale image associated with a label from one of 10 classes.
You can follow along with this Google Colab notebook.
Import
First, let's import the required libraries.
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import cv2
Load Dataset
Load the Fashion MNIST dataset directly from TensorFlow's built-in datasets.
mnist = tf.keras.datasets.fashion_mnist
(training_images, training_labels), (test_images, test_labels) = mnist.load_data()
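As a quick sanity check (a minimal sketch using the arrays just loaded), the array shapes should match the counts described above:
# Expect 60,000 training and 10,000 test images of size 28x28
print(training_images.shape, training_labels.shape)  # (60000, 28, 28) (60000,)
print(test_images.shape, test_labels.shape)          # (10000, 28, 28) (10000,)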
These are the 10 classes of Fashion MNIST.
classes =['T-shirt/top', 'Trouser', 'Pullover', 'Dress',
'Coat', 'Sandal', 'Shirt', 'Sneaker', 'Bag', 'Ankle boot' ]
After loading the training and test datasets, let's visualize a few training images.
fig = plt.figure(figsize=(8, 8))
row, col = 3, 4
for i in range(row * col):
    fig.add_subplot(row, col, i + 1).set_title(
        str(classes[training_labels[i]]))
    plt.imshow(training_images[i])
plt.show()
Fig. Sample training images (28x28x1)
For model training we generally want the data normalized to the -1 to 1 or 0 to 1 range, since normalized data helps the model converge. So let's normalize the images by dividing them by 255.
training_images, test_images = training_images/255.0, test_images/255.0
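As a quick check (a minimal sketch using the normalized arrays above), the pixel values should now lie in the 0 to 1 range:
# Verify that normalization brought pixel values into [0, 1]
print(training_images.min(), training_images.max())  # expected: 0.0 1.0
print(test_images.min(), test_images.max())          # expected: 0.0 1.0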
Network Definition
def get_NN_model():
    model = tf.keras.models.Sequential()
    model.add(tf.keras.layers.Flatten(input_shape=(28, 28)))
    model.add(tf.keras.layers.Dense(256, activation='relu'))
    model.add(tf.keras.layers.Dense(256, activation='relu'))
    model.add(tf.keras.layers.Dense(10, activation='softmax'))
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    return model
def get_CNN_model():
    model = tf.keras.models.Sequential()
    model.add(tf.keras.layers.Conv2D(64, (3, 3), input_shape=(28, 28, 1), padding='same', activation='relu'))
    model.add(tf.keras.layers.MaxPooling2D((2, 2)))
    model.add(tf.keras.layers.Conv2D(128, (3, 3), padding='same', activation='relu'))
    model.add(tf.keras.layers.MaxPooling2D((2, 2)))
    model.add(tf.keras.layers.Conv2D(128, (3, 3), padding='same', activation='relu'))
    model.add(tf.keras.layers.GlobalAveragePooling2D())
    # model.add(tf.keras.layers.Flatten())
    # model.add(tf.keras.layers.Dense(256, activation='relu'))
    model.add(tf.keras.layers.Dense(10, activation='softmax'))
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    return model
We will train both networks for 30 epochs and then compare their accuracies. If either network reaches 99% training accuracy before 30 epochs, we stop training early using the callback below.
class epoch_callback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs={}):
        if logs.get('accuracy') >= 0.99:
            self.model.stop_training = True
Let's create both networks and print their summaries.
nn_model = get_NN_model()
print(nn_model.summary())
cnn_model = get_CNN_model()
print(cnn_model.summary())
Model Summary
Note that the number of trainable parameters is in the same range for both networks, and that using a GlobalAveragePooling2D layer in the CNN (instead of Flatten followed by a Dense layer) keeps the parameter count down significantly; see the quick parameter-count check after the summaries below.
#NN model summary
Layer (type) Output Shape Param #
=================================================================
flatten (Flatten) (None, 784) 0
_________________________________________________________________
dense (Dense) (None, 256) 200960
_________________________________________________________________
dense_1 (Dense) (None, 256) 65792
_________________________________________________________________
dense_2 (Dense) (None, 10) 2570
=================================================================
Total params: 269,322
Trainable params: 269,322
Non-trainable params: 0
#CNN model summary
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 28, 28, 64) 640
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 14, 14, 64) 0
_________________________________________________________________
conv2d_1 (Conv2D) (None, 14, 14, 128) 73856
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 7, 7, 128) 0
_________________________________________________________________
conv2d_2 (Conv2D) (None, 7, 7, 128) 147584
_________________________________________________________________
global_average_pooling2d (Gl (None, 128) 0
_________________________________________________________________
dense_3 (Dense) (None, 10) 1290
=================================================================
Total params: 223,370
Trainable params: 223,370
Non-trainable params: 0
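To see how much the GlobalAveragePooling2D head saves, here is a rough back-of-the-envelope parameter count (a minimal sketch; the commented-out Flatten + Dense(256) head in get_CNN_model is the assumed alternative):
# Head used above: GlobalAveragePooling2D (128 features) -> Dense(10)
gap_head_params = 128 * 10 + 10                                        # 1,290
# Assumed alternative head: Flatten (7*7*128 = 6272) -> Dense(256) -> Dense(10)
flat_features = 7 * 7 * 128
flatten_head_params = (flat_features * 256 + 256) + (256 * 10 + 10)    # 1,608,458
print(gap_head_params, flatten_head_params)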
Training NN & CNN
Let's train both networks for 30 epochs using TensorFlow's fit function.
filepath = '/content/nn_model.h5'
callback = epoch_callback()
checkpoint = tf.keras.callbacks.ModelCheckpoint(filepath,
    monitor='val_loss', save_best_only=True, mode='auto')
history_nn = nn_model.fit(
    training_images,
    training_labels,
    validation_data=(test_images, test_labels),
    epochs=30,
    callbacks=[checkpoint, callback])
The above code trains the NN to around 95% training accuracy and 88% validation accuracy. The CNN expects a channel dimension, so before training it we reshape the images to (N, 28, 28, 1):
training_images_cnn = np.expand_dims(training_images, axis=3)
test_images_cnn = np.expand_dims(test_images, axis=3)
print(training_images_cnn.shape, test_images_cnn.shape)
Next, train the CNN for 30 epochs.
filepath = '/content/cnn_model.h5'
callback = epoch_callback()
checkpoint = tf.keras.callbacks.ModelCheckpoint(filepath,
    monitor='val_loss', save_best_only=True, mode='auto')
history_cnn = cnn_model.fit(
    training_images_cnn,
    training_labels,
    validation_data=(test_images_cnn, test_labels),
    epochs=30,
    callbacks=[checkpoint, callback])
Plot the training history for both networks.
def plot_model_history(history, title):
    plt.plot(history.history['accuracy'])
    plt.plot(history.history['val_accuracy'])
    plt.title(title)
    plt.ylabel('accuracy')
    plt.xlabel('epoch')
    plt.legend(['train', 'val'], loc='upper left')
    plt.show()
plot_model_history(history_nn, 'NN')
plot_model_history(history_cnn, 'CNN')
From the plots below we can see that training accuracy keeps increasing for both networks, but the NN's validation accuracy saturates around 88%, whereas the CNN reaches about 92% in the same number of epochs.

Fig. NN epoch vs accuracy

Fig. CNN epoch vs accuracy
Model Evaluation
Evaluate the best saved model of each network on the test data to verify accuracy and loss.
nn_model.load_weights('/content/nn_model.h5')
cnn_model.load_weights('/content/cnn_model.h5')
evaluation_nn = nn_model.evaluate(test_images, test_labels)
evaluation_cnn = cnn_model.evaluate(test_images_cnn, test_labels)
print('NN loss : ', evaluation_nn[0], ' accuracy: ', evaluation_nn[1], 'on test data.')
print('CNN loss : ', evaluation_cnn[0], ' accuracy: ', evaluation_cnn[1], 'on test data.')
313/313 [==============================] - 1s 2ms/step - loss: 0.3344 - accuracy: 0.8767
313/313 [==============================] - 1s 2ms/step - loss: 0.2258 - accuracy: 0.9212
NN loss : 0.3343818485736847 accuracy: 0.8766999840736389 on test data.
CNN loss : 0.22580283880233765 accuracy: 0.9211999773979187 on test data.
So far we have seen that the CNN converges faster than the NN and achieves better training and validation accuracy for the same number of epochs.
Inference
Now let's see how the CNN and NN perform on test images and on slightly translated versions of them. We will create a new 38x38 image, copy the original image into either its top-left or bottom-right corner, and then resize it back to 28x28 for inference.
def predict(model, img_nn):
    pred = model.predict(img_nn)
    cls_id = np.argmax(pred)
    conf = pred[0][cls_id]
    cls_name = classes[cls_id]
    return cls_name, conf
Below is the preprocessing function for an original test image. It returns images with different dimensions for the NN and the CNN, since at inference time the NN expects a 1x28x28 batch and the CNN expects a 1x28x28x1 batch.
def preprocess_img(img):
    img_nn = np.expand_dims(img, axis=0)
    img_cnn = np.expand_dims(img_nn, axis=3)
    return img_nn, img_cnn
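As a quick check (a minimal sketch, assuming the test set loaded earlier), the returned shapes should match the dimensions described above:
# Shapes expected by the NN and CNN at inference time
sample_nn, sample_cnn = preprocess_img(test_images[0])
print(sample_nn.shape, sample_cnn.shape)  # expected: (1, 28, 28) (1, 28, 28, 1)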
Define a function that creates the translated version of a test image for both the NN and the CNN.
def translate_object(img):
    new_img = np.zeros((38, 38))
    if np.random.randint(2):
        # Copy the original image into the bottom-right corner
        new_img[10:38, 10:38] = img
    else:
        # Copy the original image into the top-left corner
        new_img[0:28, 0:28] = img
    new_img = cv2.resize(new_img, (28, 28))
    plt.imshow(new_img)
    plt.show()
    new_img = np.expand_dims(new_img, axis=0)
    img_nn = np.copy(new_img)
    img_cnn = np.expand_dims(new_img, axis=3)
    return img_nn, img_cnn
Finally, let's process a few test images and their translated versions, and look at the predictions from the NN and the CNN.
for i in range(len(test_images)):
    img = test_images[i]
    img_nn, img_cnn = preprocess_img(img)
    plt.imshow(img)
    plt.show()
    print('GT-class: ', classes[test_labels[i]])
    print('NN on org Image ', predict(nn_model, img_nn))
    print('CNN on org Image ', predict(cnn_model, img_cnn))
    img_nn, img_cnn = translate_object(img)
    print('NN on transform Image ', predict(nn_model, img_nn))
    print('CNN on transform Image ', predict(cnn_model, img_cnn))
    print('\n**************************************************************\n')
    if i == 10:
        break
Result
Conclusion
- The Convolutional Neural Network converges faster than the plain Neural Network and achieves better accuracy on the test dataset.
- The CNN performs better than the NN on translated versions of the original images.