Demonstrating PyTorch Learning Rate Scheduling

Applications of PyTorch learning rate schedulers

Colab link: Learning rate scheduler

Importing libraries

Python3

import numpy as np
import pandas as pd
import torch
import torch.nn as nn
import torch.optim as optim
import torch.optim.lr_scheduler as lr_scheduler
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from torch.utils.data import DataLoader, TensorDataset

Loading dataset

You can download the dataset from here.

Python3

df = pd.read_csv("breast-cancer.csv")
df.head()

Output:

         id diagnosis  radius_mean  texture_mean  perimeter_mean  area_mean  \
0    842302         M        17.99         10.38          122.80     1001.0   
1    842517         M        20.57         17.77          132.90     1326.0   
2  84300903         M        19.69         21.25          130.00     1203.0   
3  84348301         M        11.42         20.38           77.58      386.1   
4  84358402         M        20.29         14.34          135.10     1297.0   
   smoothness_mean  compactness_mean  concavity_mean  concave points_mean  \
0          0.11840           0.27760          0.3001              0.14710   
1          0.08474           0.07864          0.0869              0.07017   
2          0.10960           0.15990          0.1974              0.12790   
3          0.14250           0.28390          0.2414              0.10520   
4          0.10030           0.13280          0.1980              0.10430   
   ...  radius_worst  texture_worst  perimeter_worst  area_worst  \
0  ...         25.38          17.33           184.60      2019.0   
1  ...         24.99          23.41           158.80      1956.0   
2  ...         23.57          25.53           152.50      1709.0   
3  ...         14.91          26.50            98.87       567.7   
4  ...         22.54          16.67           152.20      1575.0   
   smoothness_worst  compactness_worst  concavity_worst  concave points_worst  \
0            0.1622             0.6656           0.7119                0.2654   
1            0.1238             0.1866           0.2416                0.1860   
2            0.1444             0.4245           0.4504                0.2430   
3            0.2098             0.8663           0.6869                0.2575   
4            0.1374             0.2050           0.4000                0.1625   
   symmetry_worst  fractal_dimension_worst  
0          0.4601                  0.11890  
1          0.2750                  0.08902  
2          0.3613                  0.08758  
3          0.6638                  0.17300  
4          0.2364                  0.07678  
[5 rows x 32 columns]

Data extraction and encoding

X is a DataFrame containing the features: every column of the original DataFrame df except “diagnosis” and “id”.
y is a Series containing the target variable, taken from the “diagnosis” column of df.
The labels in y are mapped to numerical values: ‘M’ (Malignant) becomes 1 and ‘B’ (Benign) becomes 0.

Python3

X = df.drop(["diagnosis", "id"], axis=1)
y = df["diagnosis"]
y = y.map({"M": 1, "B": 0})

Train-test split and standardisation

The train_test_split function from scikit-learn is used to split the dataset (X and y) into training and testing sets.
X_train and X_test are the training and testing sets of features, respectively.
Y_train and Y_test are the corresponding training and testing sets of target labels.
A StandardScaler instance is created, which is a preprocessing step to standardize the features.
X_train_std is obtained by fitting the scaler on X_train and then transforming it. This ensures that each feature in the training data has a mean of 0 and a standard deviation of 1.
X_test_std is standardized using the parameters learned from the training data (X_train), ensuring consistency in the scaling process.
random_state=2 is set for reproducibility. This ensures that if you run the code multiple times, you get the same train-test split.

Python3

X_train, X_test, Y_train, Y_test = train_test_split(X, y, test_size=0.2, random_state=2)
scaler = StandardScaler()
X_train_std = scaler.fit_transform(X_train)
X_test_std = scaler.transform(X_test)
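
To illustrate why the scaler is fitted on the training split only, here is a minimal sketch on made-up toy numbers (the arrays and the toy_ names are assumptions for illustration, not the breast-cancer data). After fit_transform, each training feature has mean 0 and standard deviation 1. The test split reuses the training statistics, so its own mean is generally not 0.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy feature matrices (hypothetical values, two features each)
toy_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]])
toy_test = np.array([[2.5, 25.0], [5.0, 50.0]])

toy_scaler = StandardScaler()
toy_train_std = toy_scaler.fit_transform(toy_train)  # learns mean and std from the training split
toy_test_std = toy_scaler.transform(toy_test)        # reuses the training mean and std

print("train mean per feature:", toy_train_std.mean(axis=0))
print("train std per feature: ", toy_train_std.std(axis=0))
print("test mean per feature: ", toy_test_std.mean(axis=0))
```

Fitting the scaler on the full dataset instead would let information from the test set leak into preprocessing.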

Tensor dataset and Dataloader

The NumPy array X_train_std and the values of the pandas Series Y_train (accessed with .values) are converted to PyTorch tensors using torch.FloatTensor.
Y_train_tensor is reshaped using .view(-1, 1) so that it matches the shape of the model output. The -1 lets PyTorch infer the number of rows from the length of the array, and 1 indicates a single column.
Similarly, the test set features (X_test_std) and target labels (Y_test) are converted to PyTorch tensors using torch.FloatTensor. The target tensor is also reshaped.
A TensorDataset is created for the training data, combining the features (X_train_std_tensor) and targets (Y_train_tensor) into a single dataset.
A DataLoader then iterates over the dataset in batches of 32, shuffling the data at the start of every epoch (shuffle=True).

Python3

X_train_std_tensor = torch.FloatTensor(X_train_std)
Y_train_tensor = torch.FloatTensor(Y_train.values).view(-1, 1) 
 
X_test_std_tensor = torch.FloatTensor(X_test_std)
Y_test_tensor = torch.FloatTensor(Y_test.values).view(-1, 1) 
 
train_dataset = TensorDataset(X_train_std_tensor, Y_train_tensor)
train_loader = DataLoader(dataset=train_dataset, batch_size=32, shuffle=True)
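
As a quick sanity check of what a DataLoader yields, the sketch below builds one over random stand-in data (100 samples with 30 features, an assumption for illustration; the demo_ names are hypothetical) and collects the shape of every batch. With 100 samples and a batch size of 32, it produces three full batches and a final batch of 4.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Random stand-in data (assumption): 100 samples, 30 features, binary targets
demo_features = torch.randn(100, 30)
demo_targets = torch.randint(0, 2, (100,)).float().view(-1, 1)

demo_dataset = TensorDataset(demo_features, demo_targets)
demo_loader = DataLoader(demo_dataset, batch_size=32, shuffle=True)

# Record (feature shape, target shape) for every batch in one pass
batch_shapes = [(tuple(xb.shape), tuple(yb.shape)) for xb, yb in demo_loader]
for x_shape, y_shape in batch_shapes:
    print(x_shape, y_shape)
```

Each target batch has shape (batch_size, 1), the same shape the model outputs, so it can be passed to the loss function without further reshaping.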

Model creation

Input Layer: 30 features.
Hidden Layers: Two hidden layers with 64 and 32 units, respectively.
Activation Functions: ReLU after each hidden layer, Sigmoid at the output.
Output Layer: Single unit for binary classification.

Python3

model = nn.Sequential(
    nn.Linear(30, 64),  # Input layer with 30 features, hidden layer with 64 units
    nn.ReLU(),
    nn.Linear(64, 32),  # Hidden layer with 32 units
    nn.ReLU(),
    nn.Linear(32, 1),   # Output layer with 1 unit (for binary classification)
    nn.Sigmoid()
)
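
One way to check the architecture described above is to count its trainable parameters and run a dummy batch through it. The sketch below rebuilds the same layer stack under a hypothetical name (demo_model) and feeds it random inputs (an assumption for illustration).

```python
import torch
import torch.nn as nn

# Same layer stack as above, rebuilt under a hypothetical name
demo_model = nn.Sequential(
    nn.Linear(30, 64),
    nn.ReLU(),
    nn.Linear(64, 32),
    nn.ReLU(),
    nn.Linear(32, 1),
    nn.Sigmoid(),
)

# Weights plus biases: (30*64 + 64) + (64*32 + 32) + (32*1 + 1) = 4097
num_params = sum(p.numel() for p in demo_model.parameters())
print("trainable parameters:", num_params)

# A dummy batch of 5 samples yields 5 probabilities, one per sample
dummy_batch = torch.randn(5, 30)
with torch.no_grad():
    dummy_out = demo_model(dummy_batch)
print("output shape:", tuple(dummy_out.shape))
```

Because the last layer is a Sigmoid, every output lies between 0 and 1 and can be read as the predicted probability of the positive class.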

Loss function and optimizer

criterion = nn.BCELoss(): Binary Cross Entropy Loss is chosen as the loss function, suitable for binary classification tasks.
optimizer = optim.Adam(model.parameters(), lr=0.001): Adam optimizer is used for gradient-based optimization with a learning rate of 0.001.

Python3

criterion = nn.BCELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
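
For intuition about what nn.BCELoss computes, the sketch below evaluates the binary cross-entropy formula by hand on three made-up probabilities (hypothetical values) and compares the result with the library call. With the default reduction, the loss is the mean of -[y*log(p) + (1 - y)*log(1 - p)] over all samples.

```python
import torch
import torch.nn as nn

# Made-up predicted probabilities and true labels (hypothetical values)
probs = torch.tensor([[0.9], [0.2], [0.6]])
labels = torch.tensor([[1.0], [0.0], [1.0]])

# Binary cross-entropy written out term by term, averaged over the samples
manual_bce = -(labels * torch.log(probs) + (1 - labels) * torch.log(1 - probs)).mean()

# The same quantity from the library (default reduction is the mean)
library_bce = nn.BCELoss()(probs, labels)

print("manual: ", manual_bce.item())
print("library:", library_bce.item())
```

nn.BCELoss expects probabilities in the range 0 to 1, which is why the model ends with a Sigmoid layer.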

Learning Rate Scheduler

The learning rate is adjusted with the StepLR scheduler, which multiplies it by gamma=0.5 (halving it) every 20 epochs (step_size=20). The model is trained for 50 epochs.

Python3

scheduler = lr_scheduler.StepLR(optimizer, step_size=20, gamma=0.5)
 
num_epochs = 50
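
With these settings, the StepLR schedule follows a simple closed form: lr = initial_lr * gamma ** (epoch // step_size), where epoch counts from 0. The plain-Python sketch below evaluates it for all 50 epochs without using PyTorch (the helper name step_lr is hypothetical).

```python
def step_lr(initial_lr, gamma, step_size, epoch):
    """Learning rate in effect during a 0-based epoch under a StepLR-style schedule."""
    return initial_lr * gamma ** (epoch // step_size)


# Same settings as above: lr=0.001, gamma=0.5, step_size=20, 50 epochs
schedule = [step_lr(0.001, 0.5, 20, epoch) for epoch in range(50)]

# Print the rate just before and just after each drop
for epoch in (0, 19, 20, 39, 40, 49):
    print(f"epoch {epoch + 1:2d}: lr = {schedule[epoch]:.6f}")
```

The rate stays at 0.001 for epochs 1 to 20, drops to 0.0005 for epochs 21 to 40, and ends at 0.00025 for epochs 41 to 50.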

Training Loop

for epoch in range(num_epochs): Iterates over the specified number of epochs (50 in this case).
model.train(): Sets the model to training mode.
for inputs, targets in train_loader: Loops over mini-batches of the training data.
outputs = model(inputs): Forward pass to obtain model predictions.
loss = criterion(outputs, targets): Calculates the binary cross-entropy loss. The targets already have shape (batch_size, 1) from the earlier .view(-1, 1), so they match the model outputs and need no further reshaping.
optimizer.zero_grad(), loss.backward(), optimizer.step(): Clear the old gradients, run the backward pass, and update the weights.
scheduler.step(): Called once per epoch, after the batch loop, to adjust the learning rate.
The printed value is the loss of the last batch in each epoch, so it can fluctuate from one epoch to the next.

Python3

# Training loop
for epoch in range(num_epochs):
    model.train()

    for inputs, targets in train_loader:
        outputs = model(inputs)
        # Targets already have shape (batch_size, 1), matching the outputs
        loss = criterion(outputs, targets)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    # Adjust the learning rate once per epoch
    scheduler.step()

    # Print the loss of the last batch for monitoring
    print(f'Epoch [{epoch + 1}/{num_epochs}], Loss: {loss.item()}')

Output (first 10 of 50 epochs shown):

Epoch [1/50], Loss: 0.5196633338928223
Epoch [2/50], Loss: 0.29342177510261536
Epoch [3/50], Loss: 0.19762122631072998
Epoch [4/50], Loss: 0.19884507358074188
Epoch [5/50], Loss: 0.028389474377036095
Epoch [6/50], Loss: 0.007852290757000446
Epoch [7/50], Loss: 0.040723469108343124
Epoch [8/50], Loss: 0.04233770817518234
Epoch [9/50], Loss: 0.2953278720378876
Epoch [10/50], Loss: 0.020912442356348038
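
The loop above prints only the loss of the final batch in each epoch and does not show the learning rate. As a hedged variation, the sketch below logs the rate in effect (via scheduler.get_last_lr()) together with the sample-weighted mean loss per epoch. It runs on random stand-in data with a single-layer model (both assumptions, chosen to keep the example self-contained), so its numbers are not comparable to the output above.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.optim import lr_scheduler
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Random stand-in data (assumption): 64 samples, 30 features, binary labels
features = torch.randn(64, 30)
labels = (features[:, 0] > 0).float().view(-1, 1)
loader = DataLoader(TensorDataset(features, labels), batch_size=32, shuffle=True)

# Deliberately tiny model, just enough to drive the optimizer and scheduler
model = nn.Sequential(nn.Linear(30, 1), nn.Sigmoid())
criterion = nn.BCELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
scheduler = lr_scheduler.StepLR(optimizer, step_size=20, gamma=0.5)

history = []  # one (epoch, learning rate, mean loss) tuple per epoch
for epoch in range(50):
    model.train()
    current_lr = scheduler.get_last_lr()[0]  # rate used during this epoch
    running_loss, num_samples = 0.0, 0

    for inputs, targets in loader:
        optimizer.zero_grad()
        loss = criterion(model(inputs), targets)
        loss.backward()
        optimizer.step()
        # Weight each batch loss by its size so the epoch mean is exact
        running_loss += loss.item() * inputs.size(0)
        num_samples += inputs.size(0)

    scheduler.step()  # once per epoch, after the optimizer updates
    history.append((epoch + 1, current_lr, running_loss / num_samples))

# Show every tenth epoch
for epoch, lr, mean_loss in history[::10]:
    print(f"Epoch {epoch:2d}  lr={lr:.6f}  mean loss={mean_loss:.4f}")
```

Averaging over all batches gives a smoother curve than the last-batch loss, which makes the effect of each learning rate drop easier to see.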

Evaluation metrics

model.eval(): Sets the model to evaluation mode.
with torch.no_grad(): Temporarily disables gradient computation during evaluation.
test_outputs = model(X_test_std_tensor): Forward pass on the test set.
test_predictions = (test_outputs >= 0.5).float(): Converting model probabilities to binary predictions using a threshold of 0.5.
accuracy = (test_predictions == Y_test_tensor).float().mean().item(): Calculating accuracy based on binary predictions and true labels.

Python3

model.eval()
with torch.no_grad():
    test_outputs = model(X_test_std_tensor)
    test_predictions = (test_outputs >= 0.5).float()  # Convert probabilities to binary predictions
 
    # Evaluation metrics (you can use appropriate metrics based on your problem)
    accuracy = (test_predictions == Y_test_tensor).float().mean().item()
    print(f'Test Accuracy: {accuracy}')

Output:

Test Accuracy: 0.9561403393745422

A test accuracy of approximately 95.6% indicates that the trained network generalizes well to the held-out test set. Accuracy alone does not show how the errors are split between malignant and benign cases.
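
The evaluation code's comment notes that other metrics may suit the problem better. As a minimal sketch, the example below derives accuracy, precision, and recall from thresholded predictions, using six made-up probabilities and labels (hypothetical values, not model output).

```python
import torch

# Made-up probabilities and true labels (hypothetical values)
demo_probs = torch.tensor([0.9, 0.2, 0.6, 0.4, 0.8, 0.1]).view(-1, 1)
demo_labels = torch.tensor([1.0, 0.0, 0.0, 1.0, 1.0, 0.0]).view(-1, 1)

# Same 0.5 threshold as in the evaluation code
demo_preds = (demo_probs >= 0.5).float()

# Confusion-matrix counts, treating class 1 (malignant) as positive
tp = ((demo_preds == 1) & (demo_labels == 1)).sum().item()
fp = ((demo_preds == 1) & (demo_labels == 0)).sum().item()
fn = ((demo_preds == 0) & (demo_labels == 1)).sum().item()
tn = ((demo_preds == 0) & (demo_labels == 0)).sum().item()

accuracy = (tp + tn) / (tp + fp + fn + tn)
precision = tp / (tp + fp)  # share of predicted positives that are correct
recall = tp / (tp + fn)     # share of actual positives that are found

print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"accuracy={accuracy:.3f} precision={precision:.3f} recall={recall:.3f}")
```

For a diagnostic task, recall on the malignant class is often the figure to watch, since a false negative corresponds to a missed malignant case.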

Understanding PyTorch Learning Rate Scheduling

PyTorch is a widely used deep learning framework, known for its dynamic computational graph and approachable interface. One hyperparameter that deserves particular attention during model training is the learning rate, and PyTorch's learning rate schedulers adjust it as training progresses. This article explains the PyTorch learning rate scheduler, covering its syntax, its parameters, and its role in making model training more efficient and effective.
