How Does Batch Normalization Work in CNN?
Batch normalization works in convolutional neural networks (CNNs) by normalizing the activations of each layer across the mini-batch during training. The process works as follows:
1. Normalization within Mini-Batch
In a CNN, each layer receives inputs from multiple channels (feature maps) and processes them through convolutional filters. Batch Normalization operates on each feature map separately, normalizing the activations across the mini-batch.
During training, batch normalization (BN) standardizes the activations of each layer by subtracting the mean and dividing by the standard deviation of each mini-batch.
- Mean Calculation: $\mu_B = \frac{1}{m} \sum_{i=1}^{m} x_i$
- Variance Calculation: $\sigma_B^2 = \frac{1}{m} \sum_{i=1}^{m} (x_i - \mu_B)^2$
- Normalization: $\hat{x}_i = \frac{x_i - \mu_B}{\sqrt{\sigma_B^2 + \epsilon}}$, where $\epsilon$ is a small constant added for numerical stability.
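The per-channel statistics and normalization above can be sketched in NumPy (the batch shape and values are illustrative, not from any particular model):

```python
import numpy as np

# Hypothetical mini-batch of activations: 4 images, 3 channels, 5x5 feature maps
rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=3.0, size=(4, 3, 5, 5))

# Per-channel statistics, computed over the batch and spatial axes (N, H, W)
mu = x.mean(axis=(0, 2, 3), keepdims=True)   # mean of each feature map
var = x.var(axis=(0, 2, 3), keepdims=True)   # variance of each feature map

eps = 1e-5                                   # avoids division by zero
x_hat = (x - mu) / np.sqrt(var + eps)        # normalized activations

print(x_hat.mean(axis=(0, 2, 3)))  # ≈ 0 for every channel
print(x_hat.std(axis=(0, 2, 3)))   # ≈ 1 for every channel
```

Note that each channel is normalized with its own mean and variance; the statistics are never mixed across channels.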
2. Scaling and Shifting
After normalization, BN adjusts the normalized activations using learned scaling and shifting parameters, $\gamma$ and $\beta$. These parameters enable the network to adaptively scale and shift the activations, thereby maintaining the network's ability to represent complex patterns in the data.
- Scaling: the normalized activation is multiplied by the learnable parameter $\gamma$.
- Shifting: the learnable parameter $\beta$ is added, giving $y_i = \gamma \hat{x}_i + \beta$.
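A minimal NumPy sketch of the scale-and-shift step (the gamma and beta values here are illustrative; in practice they are initialized to 1 and 0 and then learned):

```python
import numpy as np

# Start from exactly normalized activations, as produced by the previous step
rng = np.random.default_rng(1)
x = rng.normal(size=(4, 3, 5, 5))
mu = x.mean(axis=(0, 2, 3), keepdims=True)
sd = x.std(axis=(0, 2, 3), keepdims=True)
x_hat = (x - mu) / sd

# One learnable scale (gamma) and shift (beta) per channel
gamma = np.array([0.5, 2.0, 1.0]).reshape(1, 3, 1, 1)
beta = np.array([1.0, -1.0, 0.0]).reshape(1, 3, 1, 1)

y = gamma * x_hat + beta

# Each output channel now has mean beta and standard deviation gamma
print(y.mean(axis=(0, 2, 3)))  # ≈ [ 1.0, -1.0, 0.0]
print(y.std(axis=(0, 2, 3)))   # ≈ [ 0.5,  2.0, 1.0]
```

This is why BN does not force activations to stay strictly zero-mean and unit-variance: the network can undo or reshape the normalization wherever that helps.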
3. Learnable Parameters
The parameters $\gamma$ and $\beta$ are learned during training through backpropagation. This allows the network to adaptively adjust the normalization and keep the activations in a range suitable for learning.
4. Applying Batch Normalization
Batch Normalization is typically applied after a convolutional layer in a CNN, before passing the outputs to the next layer. Depending on the network architecture, it may be placed either before or after the activation function.
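As a rough sketch of the common Conv → BN → activation ordering, using NumPy stand-ins (the 1x1 convolution here is a hypothetical simplification chosen for brevity, not a full convolution):

```python
import numpy as np

def conv1x1(x, w):
    # 1x1 convolution: mixes channels at each spatial position
    # x: (N, C_in, H, W), w: (C_out, C_in)
    return np.einsum('nchw,oc->nohw', x, w)

def batch_norm(x, gamma, beta, eps=1e-5):
    # Normalize per channel over (N, H, W), then scale and shift
    mu = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    return gamma * (x - mu) / np.sqrt(var + eps) + beta

def relu(x):
    return np.maximum(x, 0.0)

rng = np.random.default_rng(2)
x = rng.standard_normal((4, 3, 8, 8))      # input batch
w = rng.standard_normal((6, 3))            # 1x1 conv weights: 3 -> 6 channels
gamma = np.ones((1, 6, 1, 1))              # identity scale at initialization
beta = np.zeros((1, 6, 1, 1))              # zero shift at initialization

# Conv -> BN -> activation, the ordering described above
out = relu(batch_norm(conv1x1(x, w), gamma, beta))
```

In PyTorch the equivalent ordering would be `nn.Conv2d` followed by `nn.BatchNorm2d` followed by `nn.ReLU`; the frameworks handle the statistics and learnable parameters internally.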
5. Training and Inference
During training, Batch Normalization computes the mean and variance of each mini-batch. During inference (testing), it instead uses running estimates of the mean and variance accumulated during training, so the normalization does not depend on the composition of a test batch. This keeps normalization consistent between training and inference.
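The running-statistics mechanism can be sketched as follows (the momentum value and data distribution are illustrative assumptions; frameworks use the same exponential-moving-average idea):

```python
import numpy as np

momentum = 0.1                              # weight given to the newest batch
running_mean = np.zeros((1, 3, 1, 1))
running_var = np.ones((1, 3, 1, 1))

rng = np.random.default_rng(3)

# Training: normalize with batch statistics, update the running estimates
for _ in range(200):
    x = rng.normal(loc=5.0, scale=2.0, size=(16, 3, 4, 4))
    mu = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    running_mean = (1 - momentum) * running_mean + momentum * mu
    running_var = (1 - momentum) * running_var + momentum * var

# Inference: reuse the aggregated statistics, even for a batch of size 1
x_test = rng.normal(loc=5.0, scale=2.0, size=(1, 3, 4, 4))
x_norm = (x_test - running_mean) / np.sqrt(running_var + 1e-5)
```

This is what switching a PyTorch model between `model.train()` and `model.eval()` toggles for its BatchNorm layers.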
What is Batch Normalization in CNN?
Batch Normalization is a technique used to improve the training and performance of neural networks, particularly CNNs. The article aims to provide an overview of batch normalization in CNNs along with the implementation in PyTorch and TensorFlow.
Table of Contents
- Overview of Batch Normalization
- Need for Batch Normalization in CNN model
- How Does Batch Normalization Work in CNN?
- 1. Normalization within Mini-Batch
- 2. Scaling and Shifting
- 3. Learnable Parameters
- 4. Applying Batch Normalization
- 5. Training and Inference
- Applying Batch Normalization in CNN model using TensorFlow
- Applying Batch Normalization in CNN model using PyTorch
- Advantages of Batch Normalization in CNN