Complement Naive Bayes (CNB) Algorithm

Complement Naive Bayes (CNB) is an adaptation of the standard Multinomial Naive Bayes algorithm that is particularly suited to imbalanced datasets, where the usual Naive Bayes variants tend to favor the majority class.
How CNB works:
Instead of estimating the probability that an instance belongs to each class, CNB estimates the probability that it belongs to the complement of each class (i.e., to all the other classes taken together):
  • For each class, calculate the probability that the given instance does NOT belong to it.
  • After calculating this value for every class, compare all the calculated values and select the smallest one.
  • The smallest value (lowest probability) is selected because it is the lowest probability that the instance is NOT that particular class, which means the instance has the highest probability of actually belonging to that class. So this class is selected.
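The steps above can be sketched directly. The following is a minimal illustration on made-up count data (the class labels, counts, and the `cnb_predict` helper are all hypothetical, chosen only to show the complement-and-argmin logic; Laplace smoothing is applied as in the standard CNB formulation):

```python
import numpy as np

def cnb_predict(X_train, y_train, x, alpha=1.0):
    """Classify x by picking the class whose complement probability is smallest."""
    classes = np.unique(y_train)
    scores = {}
    for c in classes:
        # Step 1: pool the feature counts of every sample NOT in class c
        comp_counts = X_train[y_train != c].sum(axis=0) + alpha  # smoothing
        comp_probs = comp_counts / comp_counts.sum()
        # Log-probability of x under the complement of class c
        scores[c] = (x * np.log(comp_probs)).sum()
    # Steps 2-3: select the class with the SMALLEST complement probability
    return min(scores, key=scores.get)

# Hypothetical count data: two features, classes "a" and "b"
X = np.array([[0, 3], [4, 0], [0, 2]])
y = np.array(["a", "b", "a"])
print(cnb_predict(X, y, np.array([0, 1])))  # -> a
```

The new instance has a high count for the second feature, which the complement of class "a" rarely contains, so "a" wins.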
Example: suppose we have the following word counts for sentences describing Apples and Bananas:

Sentence Number | Round | Red | Long | Yellow | Soft | Class
1               | 2     | 1   | 1    | 0      | 0    | Apples
2               | 1     | 1   | 0    | 9      | 5    | Bananas
3               | 2     | 1   | 0    | 0      | 1    | Apples
Bayes' Theorem:

    p(C_i | x_1, ..., x_n) = p(x_1, ..., x_n | C_i) * p(C_i) / p(x_1, ..., x_n)

where C_i denotes the i-th class and x_1, ..., x_n the feature values of the instance.
Now suppose we want to classify a new instance:

Round | Red | Long | Yellow | Soft | Class
1     | 1   | 0    | 0      | 1    | ?

Computing, for each class, the probability that the instance belongs to its complement and selecting the class with the smallest such value classifies the instance as Apples.
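As a sanity check, the same toy table can be fed to scikit-learn's ComplementNB (the numeric arrays below are just the table above encoded as word counts):

```python
import numpy as np
from sklearn.naive_bayes import ComplementNB

# Word counts from the table: Round, Red, Long, Yellow, Soft
X = np.array([
    [2, 1, 1, 0, 0],  # sentence 1 -> Apples
    [1, 1, 0, 9, 5],  # sentence 2 -> Bananas
    [2, 1, 0, 0, 1],  # sentence 3 -> Apples
])
y = np.array(["Apples", "Bananas", "Apples"])

clf = ComplementNB().fit(X, y)

# New instance: Round=1, Red=1, Long=0, Yellow=0, Soft=1
print(clf.predict([[1, 1, 0, 0, 1]]))  # -> ['Apples']
```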
When to use CNB?
  • When the dataset is imbalanced: Multinomial and Gaussian Naive Bayes can give low accuracy when the classes are imbalanced, because each class's parameters are estimated only from that class's (possibly scarce) samples. Complement Naive Bayes estimates its parameters from each class's complement, which uses more data per estimate and typically gives higher accuracy in this setting.
  • For text classification tasks: Complement Naive Bayes often outperforms both Gaussian Naive Bayes and Multinomial Naive Bayes on text classification tasks.
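One way to see the difference in practice is to compare MultinomialNB and ComplementNB on a synthetic imbalanced count dataset. The data below is entirely made up for illustration (two multinomial "word count" distributions with a 9:1 class ratio), so the exact accuracies will vary with the draw:

```python
import numpy as np
from sklearn.naive_bayes import ComplementNB, MultinomialNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Imbalanced synthetic count data: 900 samples of class 0, 100 of class 1,
# drawn from two different multinomial "word" distributions.
n0, n1, n_words = 900, 100, 20
p0 = rng.dirichlet(np.ones(n_words))
p1 = rng.dirichlet(np.ones(n_words))
X = np.vstack([rng.multinomial(30, p0, size=n0),
               rng.multinomial(30, p1, size=n1)])
y = np.array([0] * n0 + [1] * n1)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=42)

for model in (MultinomialNB(), ComplementNB()):
    acc = accuracy_score(y_te, model.fit(X_tr, y_tr).predict(X_te))
    print(f"{type(model).__name__}: {acc:.3f}")
```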
Implementation of CNB in Python:
For this implementation we will use the wine dataset that ships with scikit-learn, which is slightly imbalanced (its three classes have 59, 71, and 48 samples respectively).
Code:
# Import required modules
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, classification_report
from sklearn.naive_bayes import ComplementNB
  
# Loading the dataset 
dataset = load_wine()
X = dataset.data
y = dataset.target
  
# Splitting the data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.15, random_state=42)
  
# Creating and training the Complement Naive Bayes Classifier
classifier = ComplementNB()
classifier.fit(X_train, y_train)
  
# Evaluating the classifier
prediction = classifier.predict(X_test)
prediction_train = classifier.predict(X_train)
  
print(f"Training Set Accuracy : {accuracy_score(y_train, prediction_train) * 100} %\n")
print(f"Test Set Accuracy : {accuracy_score(y_test, prediction) * 100} % \n\n")
print(f"Classifier Report : \n\n {classification_report(y_test, prediction)}")

OUTPUT
Training Set Accuracy : 65.56291390728477 %

Test Set Accuracy : 66.66666666666666 % 


Classifier Report : 

               precision    recall  f1-score   support

           0       0.64      1.00      0.78         9
           1       0.67      0.73      0.70        11
           2       1.00      0.14      0.25         7

    accuracy                           0.67        27
   macro avg       0.77      0.62      0.58        27
weighted avg       0.75      0.67      0.61        27
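The modest accuracy partly reflects that the wine features are continuous measurements rather than the word counts CNB was designed for. The class distribution, though, does show the slight imbalance that motivates trying CNB here:

```python
import numpy as np
from sklearn.datasets import load_wine

# Samples per class (classes 0, 1, 2) in the wine dataset
print(np.bincount(load_wine().target))  # -> [59 71 48]
```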

Conclusion:
Complement Naive Bayes is a simple but effective variant of Naive Bayes for imbalanced datasets, especially in text classification, where it often achieves higher accuracy than Multinomial or Gaussian Naive Bayes.
References:
  • scikit-learn documentation: sklearn.naive_bayes.ComplementNB.