Colab link: Learning rate scheduler
Importing libraries
Python3
import numpy as np
import pandas as pd
import torch
import torch.nn as nn
import torch.optim as optim
import torch.optim.lr_scheduler as lr_scheduler
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from torch.utils.data import DataLoader, TensorDataset
from sklearn.preprocessing import StandardScaler
Loading dataset
Download the dataset and save it as breast-cancer.csv in the working directory; the code below reads it from that path.
Python3
df = pd.read_csv("breast-cancer.csv")
df.head()
         id  diagnosis  radius_mean  texture_mean  perimeter_mean  area_mean  \
0    842302          M        17.99         10.38          122.80     1001.0
1    842517          M        20.57         17.77          132.90     1326.0
2  84300903          M        19.69         21.25          130.00     1203.0
3  84348301          M        11.42         20.38           77.58      386.1
4  84358402          M        20.29         14.34          135.10     1297.0

   smoothness_mean  compactness_mean  concavity_mean  concave points_mean  \
0          0.11840           0.27760          0.3001              0.14710
1          0.08474           0.07864          0.0869              0.07017
2          0.10960           0.15990          0.1974              0.12790
3          0.14250           0.28390          0.2414              0.10520
4          0.10030           0.13280          0.1980              0.10430

   ...  radius_worst  texture_worst  perimeter_worst  area_worst  \
0  ...         25.38          17.33           184.60      2019.0
1  ...         24.99          23.41           158.80      1956.0
2  ...         23.57          25.53           152.50      1709.0
3  ...         14.91          26.50            98.87       567.7
4  ...         22.54          16.67           152.20      1575.0

   smoothness_worst  compactness_worst  concavity_worst  concave points_worst  \
0            0.1622             0.6656           0.7119                0.2654
1            0.1238             0.1866           0.2416                0.1860
2            0.1444             0.4245           0.4504                0.2430
3            0.2098             0.8663           0.6869                0.2575
4            0.1374             0.2050           0.4000                0.1625

   symmetry_worst  fractal_dimension_worst
0          0.4601                  0.11890
1          0.2750                  0.08902
2          0.3613                  0.08758
3          0.6638                  0.17300
4          0.2364                  0.07678

[5 rows x 32 columns]
Data extraction and encoding
- X is a DataFrame containing the features, that is, every column of the original DataFrame df except 'diagnosis' and 'id'.
- y is a Series containing the target variable 'diagnosis' from df.
- The values of y are mapped to numbers: 'M' (malignant) becomes 1 and 'B' (benign) becomes 0.
Python3
X = df.drop(["diagnosis", "id"], axis=1)
y = df['diagnosis']
y = y.map({'M': 1, 'B': 0})
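One thing worth checking after the mapping: Series.map with a dictionary returns NaN for any value that is not one of the dictionary's keys, so an unexpected label would silently become a missing target. The sketch below shows this on a small made-up Series; the values, including the stray "?", are illustrative and not taken from the dataset.

```python
import pandas as pd

# Hypothetical labels for illustration; "?" stands in for an unexpected value.
labels = pd.Series(["M", "B", "B", "?"])

encoded = labels.map({"M": 1, "B": 0})

# Values missing from the dictionary become NaN, so count them before training.
n_unmapped = int(encoded.isna().sum())
print(encoded.tolist())
print("unmapped labels:", n_unmapped)
```

If the count is not zero, the affected rows need to be inspected or dropped before the targets are converted to tensors.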
Train-test split and standardisation
- The train_test_split function from scikit-learn is used to split the dataset (X and y) into training and testing sets.
- X_train and X_test are the training and testing sets of features, respectively.
- Y_train and Y_test are the corresponding training and testing sets of target labels.
- A StandardScaler instance is created, which is a preprocessing step to standardize the features.
- X_train_std is obtained by fitting the scaler on X_train and then transforming it, so that each feature in the training data has a mean of 0 and a standard deviation of 1.
- X_test_std is standardized using the parameters learned from the training data (X_train), ensuring consistency in the scaling process.
- random_state=2 is set for reproducibility. This ensures that if you run the code multiple times, you get the same train-test split.
Python3
X_train, X_test, Y_train, Y_test = train_test_split(X, y, test_size=0.2, random_state=2)
scaler = StandardScaler()
X_train_std = scaler.fit_transform(X_train)
X_test_std = scaler.transform(X_test)
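The effect of fitting the scaler on the training split only can be checked numerically. The following is a minimal sketch on synthetic data (the array X_demo, its scales, and the name demo_scaler are assumptions made for illustration, not part of the tutorial's dataset): the training features end up with mean 0 and standard deviation 1, while the test features are only approximately standardized because they reuse the training statistics.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the feature matrix: 200 rows, 3 features on different scales.
rng = np.random.default_rng(0)
X_demo = rng.normal(loc=[10.0, 100.0, 0.5], scale=[2.0, 25.0, 0.1], size=(200, 3))
y_demo = rng.integers(0, 2, size=200)

X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.2, random_state=2)

demo_scaler = StandardScaler()
X_tr_std = demo_scaler.fit_transform(X_tr)  # statistics come from the training rows only
X_te_std = demo_scaler.transform(X_te)      # test rows reuse those statistics

print("train mean:", X_tr_std.mean(axis=0).round(6))
print("train std: ", X_tr_std.std(axis=0).round(6))
print("test mean: ", X_te_std.mean(axis=0).round(3))
```

Calling fit_transform on the test split instead would leak information about the test data into preprocessing, which is why the tutorial uses transform there.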
Tensor dataset and Dataloader
- The NumPy array X_train_std and the values of the pandas Series Y_train (accessed through Y_train.values) are converted to PyTorch tensors using torch.FloatTensor.
- Y_train_tensor is reshaped using .view(-1, 1) so that it matches the shape of the model output. The -1 lets PyTorch infer the number of rows from the length of the array, and 1 indicates a single column.
- The test set features (X_test_std) and target labels (Y_test) are converted and reshaped in the same way.
- A TensorDataset is created for the training data, combining the features (X_train_std_tensor) and targets (Y_train_tensor) into a single dataset.
- DataLoader then creates an iterator over the dataset that yields batches of 32 samples and reshuffles the data every epoch (shuffle=True).
Python3
X_train_std_tensor = torch.FloatTensor(X_train_std)
Y_train_tensor = torch.FloatTensor(Y_train.values).view(-1, 1)
X_test_std_tensor = torch.FloatTensor(X_test_std)
Y_test_tensor = torch.FloatTensor(Y_test.values).view(-1, 1)
train_dataset = TensorDataset(X_train_std_tensor, Y_train_tensor)
train_loader = DataLoader(dataset=train_dataset, batch_size=32, shuffle=True)
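To see what the DataLoader actually yields, the sketch below iterates over a loader built from synthetic tensors with the same layout as the tutorial's data (the sample count of 100 and the names features, targets, and demo_loader are assumptions for illustration). Each batch is a pair of tensors with shapes (batch, 30) and (batch, 1), and the last batch is smaller when the dataset size is not a multiple of 32.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic tensors with the same layout as the tutorial's data:
# 100 samples, 30 features, one 0/1 target per sample.
features = torch.randn(100, 30)
targets = torch.randint(0, 2, (100,)).float().view(-1, 1)

demo_loader = DataLoader(TensorDataset(features, targets), batch_size=32, shuffle=True)

# Collect the shapes of every batch produced in one pass over the data.
batch_shapes = [(tuple(xb.shape), tuple(yb.shape)) for xb, yb in demo_loader]
for x_shape, y_shape in batch_shapes:
    print(x_shape, y_shape)
```

If equal-sized batches are required, DataLoader accepts drop_last=True, which discards the incomplete final batch.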
Model creation
- Input Layer: 30 features.
- Hidden Layers: Two hidden layers with 64 and 32 units, respectively.
- Activation Functions: ReLU after each hidden layer, Sigmoid at the output.
- Output Layer: Single unit for binary classification.
Python3
model = nn.Sequential(
    nn.Linear(30, 64),
    nn.ReLU(),
    nn.Linear(64, 32),
    nn.ReLU(),
    nn.Linear(32, 1),
    nn.Sigmoid()
)
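A quick sanity check of this architecture is to count its trainable parameters and run a forward pass on a dummy batch. The sketch below rebuilds the same layers under the name demo_model and feeds it random inputs (both the name and the random batch are assumptions for illustration); the output should have one probability per sample.

```python
import torch
import torch.nn as nn

demo_model = nn.Sequential(
    nn.Linear(30, 64),
    nn.ReLU(),
    nn.Linear(64, 32),
    nn.ReLU(),
    nn.Linear(32, 1),
    nn.Sigmoid(),
)

# Weights plus biases per layer: 30*64 + 64, 64*32 + 32, 32*1 + 1.
n_params = sum(p.numel() for p in demo_model.parameters())

# A random batch of 4 samples stands in for real standardized features.
with torch.no_grad():
    probs = demo_model(torch.randn(4, 30))

print("trainable parameters:", n_params)
print("output shape:", tuple(probs.shape))
```

The three linear layers contribute 1984, 2080, and 33 parameters, for 4097 in total.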
Loss function and optimizer
- criterion = nn.BCELoss(): Binary Cross Entropy Loss is chosen as the loss function, suitable for binary classification tasks.
- optimizer = optim.Adam(model.parameters(), lr=0.001): Adam optimizer is used for gradient-based optimization with a learning rate of 0.001.
Python3
criterion = nn.BCELoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
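For reference, binary cross-entropy averages -(y * log(p) + (1 - y) * log(1 - p)) over the samples, where p is the predicted probability and y the 0/1 label. The sketch below computes this by hand and compares it with nn.BCELoss on a few made-up values (the tensors p and y are illustrative, not model outputs).

```python
import torch
import torch.nn as nn

# Illustrative predicted probabilities and true labels (made up for this example).
p = torch.tensor([[0.9], [0.2], [0.6]])
y = torch.tensor([[1.0], [0.0], [1.0]])

# Binary cross-entropy written out term by term, averaged over the samples.
manual = -(y * torch.log(p) + (1 - y) * torch.log(1 - p)).mean()

# The built-in loss with its default mean reduction.
builtin = nn.BCELoss()(p, y)

print(manual.item(), builtin.item())
```

Because nn.BCELoss expects probabilities, it has to be paired with a model whose last layer is a Sigmoid, as in this tutorial.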
Learning Rate Scheduler
- The learning rate is adjusted with the StepLR scheduler, which multiplies it by gamma=0.5 every step_size=20 epochs.
- num_epochs = 50 sets the total number of training epochs.
Python3
scheduler = lr_scheduler.StepLR(optimizer, step_size=20, gamma=0.5)
num_epochs = 50
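With these settings the learning rate in epoch k (counting from 0) is 0.001 * 0.5 ** (k // 20). The sketch below records the schedule over 50 epochs using a single dummy parameter in place of a real model (dummy, demo_optimizer, and demo_scheduler are names assumed for illustration), so it can be run without any data.

```python
import math
import torch
import torch.optim as optim
import torch.optim.lr_scheduler as lr_scheduler

# A single dummy parameter is enough to build an optimizer for this demonstration.
dummy = torch.nn.Parameter(torch.zeros(1))
dummy.grad = torch.zeros_like(dummy)  # zero gradient, so the parameter never moves

demo_optimizer = optim.Adam([dummy], lr=0.001)
demo_scheduler = lr_scheduler.StepLR(demo_optimizer, step_size=20, gamma=0.5)

lrs = []
for epoch in range(50):
    lrs.append(demo_scheduler.get_last_lr()[0])  # rate in effect during this epoch
    demo_optimizer.step()    # optimizer steps first, as in a real training loop
    demo_scheduler.step()    # then the schedule advances by one epoch

for epoch in (0, 19, 20, 39, 40, 49):
    print(f"epoch {epoch + 1:2d}: lr = {lrs[epoch]:.6f}")
```

The rate stays at 0.001 for epochs 1 to 20, drops to 0.0005 for epochs 21 to 40, and to 0.00025 for epochs 41 to 50.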
Training Loop
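- for epoch in range(num_epochs): iterates over the specified number of epochs (50 in this case).
- model.train(): sets the model to training mode.
- for inputs, targets in train_loader: loops over the mini-batches of the training set.
- outputs = model(inputs): forward pass to obtain the predicted probabilities, with shape (batch_size, 1).
- loss = criterion(outputs, targets): calculates the binary cross-entropy loss. The targets already have shape (batch_size, 1) from the earlier .view(-1, 1), so no further reshaping is needed.
- optimizer.zero_grad(), loss.backward(), optimizer.step(): clear the old gradients, run the backward pass, and update the weights.
- scheduler.step(): called once per epoch, after the batch loop, so that StepLR counts epochs rather than batches.
- The printed value is the loss of the last batch of each epoch, so it can fluctuate noticeably from one epoch to the next.
Python3
for epoch in range(num_epochs):
    model.train()
    for inputs, targets in train_loader:
        # Forward pass: predicted probabilities, shape (batch_size, 1)
        outputs = model(inputs)
        # Targets already have shape (batch_size, 1)
        loss = criterion(outputs, targets)
        # Backward pass and weight update
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    # Advance the learning rate schedule once per epoch
    scheduler.step()
    print(f'Epoch [{epoch + 1}/{num_epochs}], Loss: {loss.item()}')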
Epoch [1/50], Loss: 0.5196633338928223
Epoch [2/50], Loss: 0.29342177510261536
Epoch [3/50], Loss: 0.19762122631072998
Epoch [4/50], Loss: 0.19884507358074188
Epoch [5/50], Loss: 0.028389474377036095
Epoch [6/50], Loss: 0.007852290757000446
Epoch [7/50], Loss: 0.040723469108343124
Epoch [8/50], Loss: 0.04233770817518234
Epoch [9/50], Loss: 0.2953278720378876
Epoch [10/50], Loss: 0.020912442356348038
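The values above are the loss of the final batch in each epoch, which makes the trend harder to read. One possible variation, shown here only as a sketch, averages the loss over all batches of an epoch and logs the learning rate alongside it. It runs on synthetic data with a smaller network for 5 epochs; X_demo, y_demo, net, and the other names are assumptions for illustration and not the tutorial's model or dataset.

```python
import torch
import torch.nn as nn
import torch.optim as optim
import torch.optim.lr_scheduler as lr_scheduler
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic stand-in data: 128 samples, 30 features, label depends on the first feature.
X_demo = torch.randn(128, 30)
y_demo = (X_demo[:, :1] > 0).float()
loader = DataLoader(TensorDataset(X_demo, y_demo), batch_size=32, shuffle=True)

net = nn.Sequential(nn.Linear(30, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())
loss_fn = nn.BCELoss()
opt = optim.Adam(net.parameters(), lr=0.001)
sched = lr_scheduler.StepLR(opt, step_size=20, gamma=0.5)

history = []  # one (mean loss, learning rate) pair per epoch
for epoch in range(5):
    net.train()
    running_loss, n_seen = 0.0, 0
    for xb, yb in loader:
        loss = loss_fn(net(xb), yb)
        opt.zero_grad()
        loss.backward()
        opt.step()
        # Weight each batch by its size so a smaller final batch is not over-counted.
        running_loss += loss.item() * xb.size(0)
        n_seen += xb.size(0)
    lr_used = sched.get_last_lr()[0]  # rate that was active during this epoch
    sched.step()
    history.append((running_loss / n_seen, lr_used))
    print(f"Epoch [{epoch + 1}/5], mean loss: {history[-1][0]:.4f}, lr: {lr_used:.6f}")
```

Logging the learning rate next to the loss makes it easier to see whether a change in the loss curve coincides with a scheduler step.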
Evaluation metrics
- model.eval(): Sets the model to evaluation mode.
- with torch.no_grad():: Temporarily disables gradient computation during evaluation.
- test_outputs = model(X_test_std_tensor): Forward pass on the test set.
- test_predictions = (test_outputs >= 0.5).float(): Converting model probabilities to binary predictions using a threshold of 0.5.
- accuracy = (test_predictions == Y_test_tensor).float().mean().item(): Calculating accuracy based on binary predictions and true labels.
Python3
model.eval()
with torch.no_grad():
    test_outputs = model(X_test_std_tensor)
    test_predictions = (test_outputs >= 0.5).float()
    accuracy = (test_predictions == Y_test_tensor).float().mean().item()
print(f'Test Accuracy: {accuracy}')
Test Accuracy: 0.9561403393745422
A test accuracy of approximately 95.6% indicates that the trained neural network generalizes well to the held-out test set, although a single train-test split gives only a rough estimate of performance.
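Accuracy alone does not show which kind of error the model makes, which matters for a diagnosis task. As a sketch, the same thresholded predictions can be broken down into a confusion matrix, from which precision and recall follow. The probabilities and labels below are made up for illustration and are not the model's outputs.

```python
import torch

# Made-up probabilities and labels for illustration only (1 = malignant, 0 = benign).
demo_probs = torch.tensor([[0.92], [0.35], [0.71], [0.10], [0.48], [0.66]])
demo_labels = torch.tensor([[1.0], [0.0], [0.0], [0.0], [1.0], [1.0]])

# Same 0.5 threshold as in the evaluation code above.
demo_preds = (demo_probs >= 0.5).float()

tp = int(((demo_preds == 1) & (demo_labels == 1)).sum())  # malignant, predicted malignant
fp = int(((demo_preds == 1) & (demo_labels == 0)).sum())  # benign, predicted malignant
fn = int(((demo_preds == 0) & (demo_labels == 1)).sum())  # malignant, predicted benign
tn = int(((demo_preds == 0) & (demo_labels == 0)).sum())  # benign, predicted benign

demo_accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(tp, fp, fn, tn, demo_accuracy, precision, recall)
```

Recall on the malignant class is often the number to watch here, since a false negative means a malignant case predicted as benign.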
Understanding PyTorch Learning Rate Scheduling
PyTorch is a widely used deep learning framework, valued for its dynamic computational graph and approachable interface. One setting that needs particular care during training is the learning rate: if it is too large, optimization can oscillate or diverge, and if it is too small, training becomes slow. PyTorch addresses this with learning rate schedulers, which adjust the learning rate as training progresses. This article explains the PyTorch learning rate scheduler, covering its syntax, its parameters, and its role in making model training more efficient and effective.