How Does Batch Normalization Work in CNN?
Batch normalization works in convolutional neural networks (CNNs) by normalizing the activations of each layer across the mini-batch during training. The process works as follows:
1. Normalization within Mini-Batch
In a CNN, each layer receives inputs from multiple channels (feature maps) and processes them through convolutional filters. Batch Normalization operates on each feature map separately, normalizing the activations across the mini-batch.
During training, batch normalization (BN) standardizes the activations of each layer by subtracting the mean and dividing by the standard deviation of each mini-batch.
- Mean Calculation: $\mu_B = \frac{1}{m} \sum_{i=1}^{m} x_i$
- Variance Calculation: $\sigma_B^2 = \frac{1}{m} \sum_{i=1}^{m} (x_i - \mu_B)^2$
- Normalization: $\hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}$, where $\epsilon$ is a small constant added for numerical stability.
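The per-channel statistics and normalization above can be sketched in NumPy (the batch shape and values are illustrative, not from any particular model):

```python
import numpy as np

# Hypothetical mini-batch of activations: 4 images, 3 channels, 5x5 feature maps
rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=3.0, size=(4, 3, 5, 5))

# Per-channel statistics, computed over the batch and spatial axes (N, H, W)
mu = x.mean(axis=(0, 2, 3), keepdims=True)   # mean of each feature map
var = x.var(axis=(0, 2, 3), keepdims=True)   # variance of each feature map

eps = 1e-5                                   # avoids division by zero
x_hat = (x - mu) / np.sqrt(var + eps)        # normalized activations

print(x_hat.mean(axis=(0, 2, 3)))  # ≈ 0 for every channel
print(x_hat.std(axis=(0, 2, 3)))   # ≈ 1 for every channel
```

Note that each channel is normalized with its own mean and variance; the statistics are never mixed across channels.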
2. Scaling and Shifting
After normalization, BN adjusts the normalized activations using learned scaling and shifting parameters, $\gamma$ and $\beta$. These parameters enable the network to adaptively scale and shift the activations, thereby maintaining the network's ability to represent complex patterns in the data.
- Scaling: the normalized activation is multiplied by the learnable parameter $\gamma$.
- Shifting: the learnable parameter $\beta$ is added, giving $y_i = \gamma \hat{x}_i + \beta$.
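A minimal NumPy sketch of the scale-and-shift step (the gamma and beta values here are illustrative; in practice they are initialized to 1 and 0 and then learned):

```python
import numpy as np

# Start from exactly normalized activations, as produced by the previous step
rng = np.random.default_rng(1)
x = rng.normal(size=(4, 3, 5, 5))
mu = x.mean(axis=(0, 2, 3), keepdims=True)
sd = x.std(axis=(0, 2, 3), keepdims=True)
x_hat = (x - mu) / sd

# One learnable scale (gamma) and shift (beta) per channel
gamma = np.array([0.5, 2.0, 1.0]).reshape(1, 3, 1, 1)
beta = np.array([1.0, -1.0, 0.0]).reshape(1, 3, 1, 1)

y = gamma * x_hat + beta

# Each output channel now has mean beta and standard deviation gamma
print(y.mean(axis=(0, 2, 3)))  # ≈ [ 1.0, -1.0, 0.0]
print(y.std(axis=(0, 2, 3)))   # ≈ [ 0.5,  2.0, 1.0]
```

This is why BN does not force activations to stay strictly zero-mean and unit-variance: the network can undo or reshape the normalization wherever that helps.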
3. Learnable Parameters
The parameters $\gamma$ and $\beta$ are learned during training through backpropagation. This allows the network to adaptively adjust the normalization and keep the activations in a range suitable for learning.
4. Applying Batch Normalization
Batch Normalization is typically applied after a convolutional layer in a CNN, before passing the outputs to the next layer. Depending on the network architecture, it may be placed either before or after the activation function.
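As a rough sketch of the common Conv → BN → activation ordering, using NumPy stand-ins (the 1x1 convolution here is a hypothetical simplification chosen for brevity, not a full convolution):

```python
import numpy as np

def conv1x1(x, w):
    # 1x1 convolution: mixes channels at each spatial position
    # x: (N, C_in, H, W), w: (C_out, C_in)
    return np.einsum('nchw,oc->nohw', x, w)

def batch_norm(x, gamma, beta, eps=1e-5):
    # Normalize per channel over (N, H, W), then scale and shift
    mu = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    return gamma * (x - mu) / np.sqrt(var + eps) + beta

def relu(x):
    return np.maximum(x, 0.0)

rng = np.random.default_rng(2)
x = rng.standard_normal((4, 3, 8, 8))      # input batch
w = rng.standard_normal((6, 3))            # 1x1 conv weights: 3 -> 6 channels
gamma = np.ones((1, 6, 1, 1))              # identity scale at initialization
beta = np.zeros((1, 6, 1, 1))              # zero shift at initialization

# Conv -> BN -> activation, the ordering described above
out = relu(batch_norm(conv1x1(x, w), gamma, beta))
```

In PyTorch the equivalent ordering would be `nn.Conv2d` followed by `nn.BatchNorm2d` followed by `nn.ReLU`; the frameworks handle the statistics and learnable parameters internally.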
5. Training and Inference
During training, Batch Normalization computes the mean and variance of each mini-batch. During inference (testing), it instead uses running estimates of the mean and variance accumulated during training, so the normalization does not depend on the composition of a test batch. This keeps normalization consistent between training and inference.
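The running-statistics mechanism can be sketched as follows (the momentum value and data distribution are illustrative assumptions; frameworks use the same exponential-moving-average idea):

```python
import numpy as np

momentum = 0.1                              # weight given to the newest batch
running_mean = np.zeros((1, 3, 1, 1))
running_var = np.ones((1, 3, 1, 1))

rng = np.random.default_rng(3)

# Training: normalize with batch statistics, update the running estimates
for _ in range(200):
    x = rng.normal(loc=5.0, scale=2.0, size=(16, 3, 4, 4))
    mu = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    running_mean = (1 - momentum) * running_mean + momentum * mu
    running_var = (1 - momentum) * running_var + momentum * var

# Inference: reuse the aggregated statistics, even for a batch of size 1
x_test = rng.normal(loc=5.0, scale=2.0, size=(1, 3, 4, 4))
x_norm = (x_test - running_mean) / np.sqrt(running_var + 1e-5)
```

This is what switching a PyTorch model between `model.train()` and `model.eval()` toggles for its BatchNorm layers.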
What is Batch Normalization in CNN?
Batch Normalization is a technique used to improve the training and performance of neural networks, particularly CNNs. The article aims to provide an overview of batch normalization in CNNs along with the implementation in PyTorch and TensorFlow.
Table of Contents
- Overview of Batch Normalization
- Need for Batch Normalization in CNN model
- How Does Batch Normalization Work in CNN?
- 1. Normalization within Mini-Batch
- 2. Scaling and Shifting
- 3. Learnable Parameters
- 4. Applying Batch Normalization
- 5. Training and Inference
- Applying Batch Normalization in CNN model using TensorFlow
- Applying Batch Normalization in CNN model using PyTorch
- Advantages of Batch Normalization in CNN