Model training

Step 1: Creating a CNN architecture

We will create a basic CNN architecture from scratch to classify the images. We will be using 3 convolution layers along with 3 max-pooling layers. At last, we will add a softmax layer of 10 nodes as we have 10 labels to be identified.

Python3




def model_arch():
    models = Sequential()
 
    # We are learning 64
    # filters with a kernal size of 5x5
    models.add(Conv2D(64, (5, 5),
                      padding="same",
                      activation="relu",
                      input_shape=(28, 28, 1)))
 
    # Max pooling will reduce the
    # size with a kernal size of 2x2
    models.add(MaxPooling2D(pool_size=(2, 2)))
    models.add(Conv2D(128, (5, 5), padding="same",
                      activation="relu"))
 
    models.add(MaxPooling2D(pool_size=(2, 2)))
    models.add(Conv2D(256, (5, 5), padding="same",
                      activation="relu"))
 
    models.add(MaxPooling2D(pool_size=(2, 2)))
 
    # Once the convolutional and pooling
    # operations are done the layer
    # is flattened and fully connected layers
    # are added
    models.add(Flatten())
    models.add(Dense(256, activation="relu"))
 
    # Finally as there are total 10
    # classes to be added a FCC layer of
    # 10 is created with a softmax activation
    # function
    models.add(Dense(10, activation="softmax"))
    return models


Now we will see the model summary. To do that, we will first compile our model and set out loss to sparse categorical crossentropy and metrics as sparse categorical accuracy.

Python3




model = model_arch()
 
model.compile(optimizer=Adam(learning_rate=1e-3),
              loss='sparse_categorical_crossentropy',
              metrics=['sparse_categorical_accuracy'])
 
model.summary()


Model summary

Step 2: Train the data on the model

As we have compiled the model, we will now train our model. To do this, we will use mode.fit() function and set the epochs to 10. We will also perform a validation split of 33% to get better test accuracy and have a minimum loss.

Python3




history = model.fit(
    trainX.astype(np.float32), trainy.astype(np.float32),
    epochs=10,
    steps_per_epoch=100,
    validation_split=0.33
)


Step 3: Save the model

We will now save the model in the .h5 format so it can be bundled with any web framework or any other development domain.

Python3




model.save_weights('./model.h5', overwrite=True)


Step 4: Plotting the training and loss functions

Training and loss functions are important functions in any ML project. they tell us how well the model performs under how many epochs and how much time the model takes actually to converge.

Python3




# Accuracy vs Epoch plot
plt.plot(history.history['sparse_categorical_accuracy'])
plt.plot(history.history['val_sparse_categorical_accuracy'])
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'val'], loc='upper left')
plt.show()


Output:

Python3




# Loss vs Epoch plot
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Model Accuracy')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'val'], loc='upper left')
plt.show()


Output:

Build the Model for Fashion MNIST dataset Using TensorFlow in Python

The primary objective will be to build a classification model which will be able to identify the different categories of the fashion industry from the Fashion MNIST dataset using Tensorflow and Keras

To complete our objective, we will create a CNN model to identify the image categories and train it on the dataset. We are using deep learning as a method of choice since the dataset consists of images, and CNN’s have been the choice of algorithm for image classification tasks. We will use Keras to create CNN and Tensorflow for data manipulation tasks.

The task will be divided into three steps data analysis, model training and prediction. Let us start with data analysis.

Similar Reads

Data Analysis

Step 1: Importing the required libraries...

Model training

...

Prediction

...