Fundamentals of Batch Normalization
In this section, we discuss the steps involved in performing batch normalization.
Step 1: Compute the Mean and Variance of Mini-Batches
For a mini-batch of activations [Tex]x_1, x_2, \ldots, x_m[/Tex], the mean [Tex]\mu_B[/Tex] and variance [Tex]\sigma_{B}^{2}[/Tex] of the mini-batch are computed.
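The mean and variance over the mini-batch are given by:
[Tex]\mu_B = \frac{1}{m}\sum_{i=1}^{m} x_i[/Tex]
[Tex]\sigma_{B}^{2} = \frac{1}{m}\sum_{i=1}^{m} (x_i - \mu_B)^2[/Tex]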
Step 2: Normalization
Each activation [Tex]x_i[/Tex] is normalized using the computed mean and variance of the mini-batch.
The normalization process subtracts the mean [Tex]\mu_B[/Tex] from each activation and divides by the square root of the variance [Tex]\sigma_{B}^{2}[/Tex], so that the normalized activations have approximately zero mean and unit variance.
Additionally, a small constant [Tex]\epsilon[/Tex] is added to the denominator for numerical stability, particularly to prevent division by zero.
[Tex]\widehat{x_i} = \frac{x_i - \mu_B}{\sqrt{\sigma_{B}^{2} + \epsilon}}[/Tex]
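For example, if an activation [Tex]x_i = 4[/Tex] comes from a mini-batch with mean [Tex]\mu_B = 2[/Tex] and variance [Tex]\sigma_{B}^{2} = 4[/Tex] (with [Tex]\epsilon[/Tex] negligibly small), the normalized value is [Tex]\widehat{x_i} = (4 - 2)/\sqrt{4} = 1[/Tex].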
Step 3: Scale and Shift the Normalized Activations
The normalized activations [Tex]\widehat{x_i}[/Tex] are then scaled by a learnable parameter [Tex]\gamma[/Tex] and shifted by another learnable parameter [Tex]\beta[/Tex]. These parameters allow the model to learn the optimal scaling and shifting of the normalized activations, giving the network additional flexibility.
[Tex]y_i = \gamma \widehat{x_i} + \beta[/Tex]
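To tie the three steps together, below is a minimal NumPy sketch of the batch normalization forward pass for a fully connected layer. The function name batch_norm_forward, the array shapes, and the default eps value are illustrative assumptions, not taken from any particular library.

import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Minimal batch normalization forward pass over a mini-batch.

    x     : activations of shape (batch_size, num_features)
    gamma : learnable scale, shape (num_features,)
    beta  : learnable shift, shape (num_features,)
    """
    # Step 1: per-feature mean and variance over the mini-batch
    mu = x.mean(axis=0)
    var = x.var(axis=0)

    # Step 2: normalize to (approximately) zero mean and unit variance
    x_hat = (x - mu) / np.sqrt(var + eps)

    # Step 3: scale and shift with the learnable parameters
    return gamma * x_hat + beta

# Example usage: a mini-batch of 4 samples with 3 features each
x = np.random.randn(4, 3)
y = batch_norm_forward(x, gamma=np.ones(3), beta=np.zeros(3))
print(y.mean(axis=0))  # close to 0
print(y.std(axis=0))   # close to 1

With gamma initialized to ones and beta to zeros, the output simply equals the normalized activations; during training, these parameters are updated by gradient descent along with the other weights.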
What is Batch Normalization In Deep Learning?
Internal covariate shift is a major challenge encountered while training deep learning models. Batch normalization was introduced to address this issue. In this article, we learn the fundamentals of batch normalization and why it is needed, and we also see how to perform batch normalization in TensorFlow and PyTorch.
Table of Contents
- What is Batch Normalization?
- Need for Batch Normalization
- Fundamentals of Batch Normalization
- Batch Normalization in TensorFlow
- Batch Normalization in PyTorch
- Benefits of Batch Normalization
- Conclusion