Importing Dataset

The dataset used here is taken from https://www.kaggle.com/competitions/dog-breed-identification/data. It includes about 10,000 labeled training images covering 120 different breeds of dogs. The dataset contains a training images folder, a test images folder, and a CSV file that maps each image to the breed it belongs to.

Python3




from zipfile import ZipFile

data_path = 'dog-breed-identification.zip'

# Extract the archive into the current working directory.
with ZipFile(data_path, 'r') as zip_file:
    zip_file.extractall()
    print('The data set has been extracted.')


Output:

The data set has been extracted.
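Before extracting, it can help to list what the archive contains and to extract into a dedicated folder instead of the working directory. The following is a minimal sketch, not part of the original walkthrough: the helper name `extract_archive` and the folder names are hypothetical, and the demo builds a tiny stand-in archive so that it runs without the Kaggle download.

```python
import os
import tempfile
from zipfile import ZipFile


def extract_archive(zip_path, target_dir):
    """Extract zip_path into target_dir and return the archive's member names."""
    with ZipFile(zip_path, 'r') as zip_file:
        names = zip_file.namelist()
        zip_file.extractall(target_dir)
    return names


# Build a tiny stand-in archive (NOT the real Kaggle file) for the demo.
work_dir = tempfile.mkdtemp()
demo_zip = os.path.join(work_dir, 'demo.zip')
with ZipFile(demo_zip, 'w') as zip_file:
    zip_file.writestr('labels.csv', 'id,breed\nabc,beagle\n')
    zip_file.writestr('train/abc.jpg', b'not a real image')

# Extract into a separate folder and show what the archive contained.
members = extract_archive(demo_zip, os.path.join(work_dir, 'data'))
print(members)
```

With the real archive, the same helper could be called with `data_path` and a folder of your choice. The later code in this section assumes the files sit in the working directory, so paths such as `'labels.csv'` would need that folder prepended.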

Python3




df = pd.read_csv('labels.csv')
df.head()


Output:

First Five rows of the dataset

Python3




df.shape


Output:

(10222, 2)

Let’s check the number of unique breeds of dog images we have in the training data.

Python3




df['breed'].nunique()


Output:

120

So the training data covers 120 unique breeds.

Python3




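# Bar chart of the number of images per breed.
plt.figure(figsize=(10, 5))
df['breed'].value_counts().plot.bar()
# Hide the axes, since 120 breed names would not be legible as tick labels.
plt.axis('off')
plt.show()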


Output:

The number of images present in each class

Here we can observe that there is a data imbalance between the classes of different breeds of dogs.
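Since the bar chart has its axes hidden, the imbalance can also be put into numbers. The sketch below uses a small made-up dataframe named `demo_df` as a stand-in for `labels.csv`; the breed names and counts are illustrative only and are not taken from the real data.

```python
import pandas as pd

# Toy stand-in for labels.csv (the real file has 10,222 rows).
demo_df = pd.DataFrame({
    'id': ['a', 'b', 'c', 'd', 'e', 'f'],
    'breed': ['beagle', 'beagle', 'beagle', 'pug', 'pug', 'boxer'],
})

# Number of images per breed, largest class first.
counts = demo_df['breed'].value_counts()

# Ratio between the most and least frequent class.
imbalance_ratio = counts.max() / counts.min()

print(counts)
print('largest / smallest class:', imbalance_ratio)
```

Running the same two lines on `df['breed']` would give the actual per-breed counts and ratio for the training data.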

Python3




df['filepath'] = 'train/' + df['id'] + '.jpg'
df.head()


Output:

First Five rows of the dataset
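Because the file paths are built by string concatenation, a quick check that every path points to an existing file can catch extraction problems early. This is a minimal sketch under stated assumptions: the temporary folder, the dataframe `demo_df`, and the `exists` column are hypothetical and only stand in for the real `train/` folder and `df`.

```python
import os
import tempfile

import pandas as pd

# Create a throwaway folder with one fake image file for the demo.
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, 'train'))
open(os.path.join(root, 'train', 'abc.jpg'), 'wb').close()

# Toy labels: one id with a file on disk, one without.
demo_df = pd.DataFrame({'id': ['abc', 'missing'], 'breed': ['beagle', 'pug']})
demo_df['filepath'] = root + '/train/' + demo_df['id'] + '.jpg'

# Flag rows whose file is not present on disk.
demo_df['exists'] = demo_df['filepath'].apply(os.path.exists).astype(bool)
missing = demo_df.loc[~demo_df['exists'], 'id'].tolist()
print('ids without an image file:', missing)
```

On the real dataframe, an empty `missing` list would suggest that every row in `labels.csv` has a matching image in the training folder.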

Visualizing one image from every class is not feasible, but let's view a random sample of them.

Python3




plt.figure(figsize=(10, 10))
for i in range(12):
    plt.subplot(4, 3, i+1)

    # Selecting a random image
    # index from the dataframe.
    k = np.random.randint(0, len(df))
    # OpenCV loads images in BGR order; convert to RGB for matplotlib.
    img = cv2.cvtColor(cv2.imread(df.loc[k, 'filepath']), cv2.COLOR_BGR2RGB)
    plt.imshow(img)
    plt.title(df.loc[k, 'breed'])
    plt.axis('off')
plt.show()


Output:

Sample images from the training data

The images are not all the same size, which is natural because real-world images come in different sizes and shapes. We will take care of this while loading and processing the images.

Python3




le = LabelEncoder()
df['breed'] = le.fit_transform(df['breed'])
df.head()


Output:

First Five rows of the dataset
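After encoding, the `breed` column holds integers, so the fitted encoder is what maps them back to breed names. The sketch below shows this round trip on a made-up list of breeds; the names `breeds`, `demo_le`, and `codes` are illustrative and not part of the original code.

```python
from sklearn.preprocessing import LabelEncoder

# Made-up breed labels standing in for df['breed'].
breeds = ['pug', 'beagle', 'pug', 'boxer']

demo_le = LabelEncoder()
codes = demo_le.fit_transform(breeds)

# Classes are stored in sorted order; a code is an index into this array.
print(list(demo_le.classes_))   # ['beagle', 'boxer', 'pug']
print(codes.tolist())           # [2, 0, 2, 1]

# Map integer codes back to breed names.
decoded = demo_le.inverse_transform([0, 2]).tolist()
print(decoded)                  # ['beagle', 'pug']
```

Keeping the fitted `le` object around is what makes it possible to turn integer predictions back into readable breed names later on.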
