How Does an RBM Work?

RBMs work by learning the probability distribution of the input data through the interactions between the visible and hidden layers. They learn through an iterative process involving two main phases: the positive phase (reconstruction) and the negative phase (learning). The goal is to adjust the weights and biases so as to minimize the difference between the input data and its reconstruction.

Positive Phase (Reconstruction)

  1. Data Point Input: The RBM takes a data point, represented by the activations of the visible units (input features).
  2. Hidden Layer Activation: Based on the weights and biases, the RBM activates hidden neurons using the visible input. Each hidden node $h_j$ computes its probability of being activated given the visible layer input $v$:
    $$P(h_j = 1 \mid v) = \sigma\left(b_j + \sum_i v_i w_{ij}\right)$$
    where,
    • $\sigma$ is the sigmoid function,
    • $b_j$ is the bias of hidden node $j$,
    • $v_i$ is the state of visible node $i$,
    • $w_{ij}$ is the weight between visible node $i$ and hidden node $j$.
  3. Reconstruction: The RBM then reconstructs the visible layer by considering the activity of the hidden layer and the weights between the layers. The probability from which the reconstructed visible state $v'_i$ is sampled is:
    $$P(v_i = 1 \mid h) = \sigma\left(a_i + \sum_j h_j w_{ij}\right)$$
    where,
    • $a_i$ is the bias of visible node $i$.
  4. Sample Visible States: The visible states are then sampled from this probability distribution, yielding a reconstruction of the original input (a NumPy sketch of this phase follows the list).
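To make these steps concrete, here is a minimal NumPy sketch of the positive phase. The layer sizes, weight initialization, and variable names are illustrative assumptions, not details from the text above:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Illustrative sizes: 6 visible units, 4 hidden units.
n_visible, n_hidden = 6, 4
W = rng.normal(0.0, 0.01, size=(n_visible, n_hidden))  # weights w_ij
a = np.zeros(n_visible)                                # visible biases a_i
b = np.zeros(n_hidden)                                 # hidden biases b_j

v = rng.integers(0, 2, size=n_visible).astype(float)   # a binary data point

# Step 2: P(h_j = 1 | v) = sigmoid(b_j + sum_i v_i * w_ij)
p_h = sigmoid(b + v @ W)
h = (rng.random(n_hidden) < p_h).astype(float)         # sample hidden states

# Steps 3-4: P(v_i = 1 | h) = sigmoid(a_i + sum_j h_j * w_ij), then sample
p_v = sigmoid(a + h @ W.T)
v_recon = (rng.random(n_visible) < p_v).astype(float)  # reconstructed input
```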

Negative Phase (Learning)

  1. Reconstructed Input: The reconstructed visible layer activation is then fed back through the network, activating hidden neurons based on the reconstructed data.
  2. Comparison: The RBM compares the activations of the hidden layer in this phase with the activations from the original input (positive phase). This comparison highlights the discrepancies between the input data and the RBM’s reconstruction.
  3. Weight Adjustment: Based on this comparison, the weights and biases are adjusted to reduce the difference between the original data and the reconstruction. The update rules approximate the gradient of the log-likelihood of the data. One common method is Contrastive Divergence (CD), where the updates are given by the equations below (a runnable sketch of one CD step follows this list):
    $$\Delta w_{ij} = \epsilon \left( \langle v_i h_j \rangle_{\text{data}} - \langle v_i h_j \rangle_{\text{recon}} \right)$$
    $$\Delta a_i = \epsilon \left( v_i - v'_i \right)$$
    $$\Delta b_j = \epsilon \left( h_j - h'_j \right)$$
    where,
    • $\epsilon$ is the learning rate,
    • $\langle \cdot \rangle_{\text{data}}$ denotes the expectation under the data distribution,
    • $\langle \cdot \rangle_{\text{recon}}$ denotes the expectation under the reconstruction distribution.
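The following is a minimal sketch of a single CD-1 update for one binary training example, assuming NumPy. The function name `cd1_step` and the default values are illustrative; it uses hidden probabilities rather than samples for the statistics, a common practical choice:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, a, b, epsilon=0.1, rng=None):
    """One Contrastive Divergence (CD-1) update for a single binary input v0."""
    rng = rng or np.random.default_rng(0)

    # Positive phase: hidden statistics driven by the data.
    p_h0 = sigmoid(b + v0 @ W)                       # P(h_j = 1 | v0)
    h0 = (rng.random(b.size) < p_h0).astype(float)   # sampled hidden states

    # Reconstruction: visible statistics driven by the sampled hidden states.
    p_v1 = sigmoid(a + h0 @ W.T)                     # P(v_i = 1 | h0)
    v1 = (rng.random(a.size) < p_v1).astype(float)   # reconstructed visible states

    # Negative phase: hidden statistics driven by the reconstruction.
    p_h1 = sigmoid(b + v1 @ W)

    # Updates: <v h>_data - <v h>_recon for the weights, plus bias differences.
    W += epsilon * (np.outer(v0, p_h0) - np.outer(v1, p_h1))
    a += epsilon * (v0 - v1)
    b += epsilon * (p_h0 - p_h1)
    return W, a, b
```

Repeatedly applying this step over a training set drives $\langle v_i h_j \rangle_{\text{recon}}$ toward $\langle v_i h_j \rangle_{\text{data}}$, which is exactly the condition under which the weight updates vanish.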

Restricted Boltzmann Machine: How It Works

The Restricted Boltzmann Machine (RBM) was introduced by Geoffrey Hinton and Terry Sejnowski in 1985 and has since become foundational in unsupervised machine learning, particularly in the context of deep learning architectures. RBMs are widely used for dimensionality reduction, classification, regression, collaborative filtering, feature learning, and topic modelling.

What is a Restricted Boltzmann Machine?

A Restricted Boltzmann Machine (RBM) is an artificial neural network that uses generative stochastic learning to learn a probability distribution over a collection of inputs. It consists of two layers of nodes: a visible layer and a hidden layer. The visible layer represents the input data, while the hidden layer captures the dependencies between the inputs. Unlike traditional Boltzmann machines, RBMs have no intra-layer connections; connections exist only between nodes in different layers. This restriction simplifies the training process and allows for more efficient learning.

Structure and Restriction of RBMs

An RBM consists of two layers of nodes: a visible layer, which holds the input data, and a hidden layer, which captures dependencies between the inputs. The "restriction" is that no connections exist within a layer, so every edge in the network links a visible node to a hidden node.
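This bipartite structure is conventionally summarized by an energy function over joint configurations (standard in the RBM literature, though not spelled out in the excerpt above), with low-energy configurations being more probable:

$$E(v, h) = -\sum_i a_i v_i - \sum_j b_j h_j - \sum_{i,j} v_i w_{ij} h_j, \qquad P(v, h) = \frac{e^{-E(v, h)}}{Z}$$

where $Z$ is the normalizing partition function. Every interaction term $v_i w_{ij} h_j$ couples one visible unit with one hidden unit; the absence of $v_i v_k$ and $h_j h_k$ terms is precisely the restriction that makes the conditional distributions above factorize.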

Applications of RBMs

• Dimensionality Reduction: RBMs can reduce the number of dimensions in the data, capturing the most relevant features.
• Collaborative Filtering: RBMs are used in recommendation systems to predict user preferences based on previous interactions.
• Feature Learning: RBMs can learn features from the input data that can be used in other machine-learning tasks.
• Image Recognition: RBMs can be used to pre-train layers in deep neural networks for tasks such as image recognition.

Limitations of RBMs

• Slow Training: Training RBMs can be computationally expensive, especially for large datasets.
• Limited Learning Capacity: A single RBM has limited capacity for learning complex relationships; RBMs are often stacked as the initial layers of Deep Belief Networks (DBNs) to overcome this limitation.

Conclusion

Restricted Boltzmann Machines are powerful tools in the realm of unsupervised learning, capable of capturing complex dependencies in data. By understanding their structure and working mechanism, one can leverage RBMs for a variety of applications, from dimensionality reduction to feature learning and beyond. Despite their computational complexity, the ability of RBMs to model high-dimensional data distributions makes them invaluable in the field of machine learning.