Text classification using CNN Implementation

Before implementing the model, make sure that Python and the necessary packages are installed on your system. Install TensorFlow, Keras, and scikit-learn (used below for the evaluation metrics) with pip: use pip on Windows and pip3 on Mac/Linux.

pip/pip3 install tensorflow
pip/pip3 install keras
pip/pip3 install scikit-learn

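Let’s start by importing the necessary libraries, including NumPy for numerical operations, Keras for building and training the neural network, and scikit-learn for the evaluation metrics. The same script then loads the IMDB movie review dataset, builds and trains the CNN, and evaluates it on the test set.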

Python
# importing the necessary libraries
import numpy as np
from keras.models import Sequential
from keras.layers import Embedding, Conv1D, GlobalMaxPooling1D, Dense
from keras.preprocessing.sequence import pad_sequences
from keras.datasets import imdb
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Setting up the parameters
maximum_features = 5000  # Maximum number of words to consider as features
maximum_length = 100  # Maximum length of input sequences
word_embedding_dims = 50  # Dimension of word embeddings
no_of_filters = 250  # Number of filters in the convolutional layer
kernel_size = 3  # Size of the convolutional filters
hidden_dims = 250  # Number of neurons in the hidden layer
batch_size = 32  # Batch size for training
epochs = 2  # Number of training epochs
threshold = 0.5  # Threshold for binary classification

# Loading the IMDB dataset
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=maximum_features)

# Padding the sequences to ensure uniform length
x_train = pad_sequences(x_train, maxlen=maximum_length)
x_test = pad_sequences(x_test, maxlen=maximum_length)

# Building the model
model = Sequential()

# Adding the embedding layer to convert input sequences to dense vectors
model.add(Embedding(maximum_features, word_embedding_dims,
                    input_length=maximum_length))

# Adding the 1D convolutional layer with ReLU activation
model.add(Conv1D(no_of_filters, kernel_size, padding='valid',
                 activation='relu', strides=1))

# Adding the global max pooling layer to reduce dimensionality
model.add(GlobalMaxPooling1D())

# Adding the dense hidden layer with ReLU activation
model.add(Dense(hidden_dims, activation='relu'))

# Adding the output layer with sigmoid activation for binary classification
model.add(Dense(1, activation='sigmoid'))

# Compiling the model with binary cross-entropy loss and Adam optimizer
model.compile(loss='binary_crossentropy',
              optimizer='adam', metrics=['accuracy'])

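# Training the model
# Note: the test set is also passed as validation data here, so the
# val_* numbers are not an independent hold-out estimate
model.fit(x_train, y_train, batch_size=batch_size,
          epochs=epochs, validation_data=(x_test, y_test))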

# Predicting the probabilities for test data
y_pred_prob = model.predict(x_test)

# Converting the probabilities to binary classes based on threshold
# and flattening from shape (n, 1) to (n,) to match y_test
y_pred = (y_pred_prob > threshold).astype(int).ravel()

# Calculating the evaluation metrics
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)

# Printing the evaluation metrics
print('Accuracy:', accuracy)
print('Precision:', precision)
print('Recall:', recall)
print('F1-score:', f1)

Output:

Epoch 1/2
782/782 [==============================] - 7s 8ms/step - loss: 0.4245 - accuracy: 0.7927 - val_loss: 0.3713 - val_accuracy: 0.8320
Epoch 2/2
782/782 [==============================] - 7s 9ms/step - loss: 0.2521 - accuracy: 0.8971 - val_loss: 0.3251 - val_accuracy: 0.8583
782/782 [==============================] - 2s 2ms/step
Accuracy: 0.85832
Precision: 0.8426931905126244
Recall: 0.88112
F1-score: 0.8614782948768088
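
The printed metrics can be cross-checked by hand. The sketch below assumes that the IMDB test split contains 12,500 positive and 12,500 negative reviews; under that assumption, the output above implies the confusion-matrix counts used here. The counts are inferred from the printed values and are not taken from a separate run, and the variable names are chosen only for this illustration.

```python
# Confusion-matrix counts inferred from the printed metrics, assuming a
# balanced test split of 12,500 positive and 12,500 negative reviews.
tp, fn = 11014, 1486   # positive reviews: predicted correctly / missed
fp, tn = 2056, 10444   # negative reviews: wrongly flagged / predicted correctly

# Accuracy: share of all predictions that are correct
accuracy = (tp + tn) / (tp + tn + fp + fn)

# Precision: share of predicted positives that are truly positive
precision = tp / (tp + fp)

# Recall: share of true positives that the model found
recall = tp / (tp + fn)

# F1-score: harmonic mean of precision and recall
f1 = 2 * precision * recall / (precision + recall)

print('Accuracy:', accuracy)
print('Precision:', precision)
print('Recall:', recall)
print('F1-score:', f1)
```

With these counts, the four values agree with the ones in the output above up to floating-point rounding. Recall is higher than precision here because the model misses fewer positive reviews (1,486) than it wrongly flags negative ones (2,056).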

Text classification using CNN

Text classification is a widely used NLP task across many business problems, and Convolutional Neural Networks (CNNs) have become a popular choice for it. In this article, you will learn the basics of convolutional neural networks and how to implement text classification using CNNs, with code examples. You will also learn about the CNN architecture for text classification, the implementation steps, and use cases and applications.

Table of Contents

  • Text Classification using CNN
    • CNN for Text Classification
  • Text classification using CNN Implementation
  • Use Cases and Applications
  • Challenges and Considerations
  • Future Directions
  • Conclusion

Text Classification using CNN

Text classification is the process of categorizing unstructured text into predefined classes or categories using Natural Language Processing (NLP). It is also called text categorization or text tagging. Examples include sentiment analysis, spam detection, news article classification, topic detection, and language detection. Originally, CNNs were designed and developed for image classification tasks....
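
For text, the convolution slides along a single dimension, the token sequence. The following is a minimal pure-Python sketch of what one filter in the Conv1D layer followed by GlobalMaxPooling1D computes. It is a simplification under stated assumptions: each token is a single number rather than a 50-dimensional embedding, the bias term is left out, and the input values, filter weights, and function names are hypothetical.

```python
def conv1d_valid(sequence, kernel):
    """Slide `kernel` over `sequence` with stride 1 and no padding ('valid')."""
    k = len(kernel)
    return [
        sum(sequence[i + j] * kernel[j] for j in range(k))
        for i in range(len(sequence) - k + 1)
    ]


def relu(values):
    """Replace negative activations with zero."""
    return [max(0.0, v) for v in values]


def global_max_pool(values):
    """Keep only the strongest response of the filter over the whole sequence."""
    return max(values)


# Hypothetical toy input: one scalar per token
sequence = [0.1, -0.4, 0.9, 0.3, -0.2, 0.7]
# Hypothetical weights of a single filter with kernel_size = 3
kernel = [1.0, 0.5, -1.0]

feature_map = relu(conv1d_valid(sequence, kernel))
pooled = global_max_pool(feature_map)

# 'valid' padding yields length - kernel_size + 1 output positions
print('Feature map length:', len(feature_map))
print('Pooled value:', pooled)

# The same rule applied to the tutorial's settings (100 tokens, kernel size 3)
print('Positions for 100 tokens:', len(conv1d_valid([0.0] * 100, kernel)))
```

In the model from the implementation section, the same arithmetic turns an input of shape (100, 50) into a (98, 250) feature map, because 250 filters each span 3 tokens across all 50 embedding dimensions. Global max pooling then keeps one value per filter, which gives a vector of length 250 regardless of where in the review the matching phrase occurred.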

Use Cases and Applications

Text classification using CNNs has many applications. Some of them include...

Challenges and Considerations

We need to take care not to overfit the data, and various regularization methods help with that. Apart from overfitting, there are other challenges and considerations such as data quality, class imbalance, and model interpretability....
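
One common way to handle class imbalance is to weight each class inversely to its frequency during training. The sketch below is illustrative only: the helper name `balanced_class_weights` and the label list are hypothetical, and the formula used is the widely used "balanced" heuristic, n_samples / (n_classes * class_count). The IMDB dataset is balanced, so the model in the implementation section would not need this step.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return a {class: weight} dict with weight = n_samples / (n_classes * count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical imbalanced labels: 90 negative examples, 10 positive examples
labels = [0] * 90 + [1] * 10

weights = balanced_class_weights(labels)
print(weights)
```

The rarer class receives the larger weight, so its errors count more in the loss. In Keras, a dictionary like this can be passed to `model.fit` through its `class_weight` argument.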

Future Directions

We can improve our CNN model by adding more layers; it is often preferable to add more layers (a deeper network) rather than to widen the existing ones. Future research directions in text classification using CNNs include attention mechanisms, multi-task learning, and transfer learning....

Conclusion

Using Convolutional Neural Networks (CNNs) is a powerful approach to text classification. In this article, we have explored Text Classification, CNN Architecture for Text Classification, Implementation Steps, Use Cases and Applications, Performance Evaluation, Challenges and Considerations, and Future Directions. Since text classification using CNNs is not an easy topic, feel free to read the article again for a better understanding....