Illustration 2
Build a handwritten digit classification model using a custom optimizer
Step 1:
Import the necessary libraries
Python3
import math

import torch
import torch.nn as nn
import matplotlib.pyplot as plt
from torch.optim import Optimizer
from torch.utils.data import DataLoader
from torchvision.datasets import MNIST
from torchvision.transforms import ToTensor
from torch.utils.tensorboard import SummaryWriter

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
Step 2:
Now, we’ll load the MNIST dataset and create a data loader for it.
Python3
# Loading the dataset
dataset = MNIST(root='.', train=True, download=True, transform=ToTensor())
dataloader = DataLoader(dataset, batch_size=32, shuffle=True)
dataloader.dataset
Output:
Dataset MNIST
    Number of datapoints: 60000
    Root location: .
    Split: Train
    StandardTransform
Transform: ToTensor()
Step 3:
Let’s visualize the first batch of our dataset.
Python3
# Display the images of the first batch in a grid, with their labels as titles
for i, batch in enumerate(dataloader):
    figure = plt.figure(figsize=(16, 16))
    img, label = batch
    for j in range(img.shape[0]):
        figure.add_subplot(8, 8, j + 1)
        plt.imshow(img[j].squeeze(), cmap="gray")
        plt.title(label[j].item())
        plt.axis("off")
    plt.show()
    break
Output: a grid of the 32 grayscale digit images in the first batch, each titled with its label.
Step 4:
Next, we’ll define our model architecture: a simple fully connected network with two hidden layers.
Python3
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(28 * 28, 512)
        self.fc2 = nn.Linear(512, 512)
        self.fc3 = nn.Linear(512, 10)

    def forward(self, x):
        x = x.view(-1, 28 * 28)
        x = torch.relu(self.fc1(x))
        x = torch.relu(self.fc2(x))
        x = self.fc3(x)
        return x


# Model
model = Net().to(device)
Step 5:
Next, we’ll define our loss function; in this case, we’ll use the cross-entropy loss.
Python3
# Loss function
loss_fn = nn.CrossEntropyLoss()
Step 6:
Next, we’ll define our custom optimizer. It subclasses torch.optim.Adam, overrides the step() method, and adds an L2-style weight decay term.
Python3
# Define custom optimizer
class MyAdam(torch.optim.Adam):
    def __init__(self, params, lr=1e-3, betas=(0.9, 0.999), weight_decay=0):
        super().__init__(params, lr=lr, betas=betas)
        self.weight_decay = weight_decay

    def step(self):
        for group in self.param_groups:
            for p in group['params']:
                if p.grad is None:
                    continue
                grad = p.grad.data
                if grad.is_sparse:
                    raise RuntimeError("Adam does not support sparse gradients")

                state = self.state[p]

                # State initialization
                if len(state) == 0:
                    state["step"] = 0
                    # Exponential moving average of gradient values
                    state["exp_avg"] = torch.zeros_like(p.data)
                    # Exponential moving average of squared gradient values
                    state["exp_avg_sq"] = torch.zeros_like(p.data)

                exp_avg, exp_avg_sq = state["exp_avg"], state["exp_avg_sq"]
                beta1, beta2 = group["betas"]
                state["step"] += 1

                # L2-style weight decay: add weight_decay * p to the gradient
                if self.weight_decay != 0:
                    grad = grad.add(p.data, alpha=self.weight_decay)

                # Decay the first and second moment running average coefficients
                exp_avg.mul_(beta1).add_(grad, alpha=1 - beta1)
                exp_avg_sq.mul_(beta2).addcmul_(grad, grad, value=1 - beta2)

                denom = exp_avg_sq.sqrt().add_(group["eps"])

                bias_correction1 = 1 - beta1 ** state["step"]
                bias_correction2 = 1 - beta2 ** state["step"]
                step_size = group["lr"] * math.sqrt(bias_correction2) / bias_correction1

                p.data.addcdiv_(exp_avg, denom, value=-step_size)


# Optimizer
optimizer = MyAdam(model.parameters(), weight_decay=0.00001)
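The step() method above follows the standard Adam update: it maintains exponential moving averages of the gradient and of its square, applies bias correction through step_size, and divides by the square root of the second-moment estimate. Weight decay is applied by adding weight_decay * p to the gradient (classic L2 regularization), rather than the decoupled form used by AdamW.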
Step 7:
Now, train the model with the custom optimizer and plot the training loss.
Python3
# Training loop
num_epochs = 10

for i in range(num_epochs):
    for inputs, labels in dataloader:
        inputs, labels = inputs.to(device), labels.to(device)
        outputs = model(inputs)
        loss = loss_fn(outputs, labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        # scheduler.step()
    # Plot and print the loss of the last batch of each epoch
    plt.plot(i, loss.item(), 'ro-')
    print(i, '>> Loss :', loss.item())

plt.title('Losses over iterations')
plt.xlabel('iterations')
plt.ylabel('Losses')
plt.show()
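After training, it can be useful to check whether the model actually learned to classify digits. The short sketch below is not part of the original article; it reuses the training dataloader only for convenience (for a proper evaluation you would build a separate loader over the MNIST test split).

Python3

# A minimal evaluation sketch (assumption: reusing the training dataloader)
correct = 0
total = 0
model.eval()
with torch.no_grad():
    for inputs, labels in dataloader:
        inputs, labels = inputs.to(device), labels.to(device)
        outputs = model(inputs)
        preds = outputs.argmax(dim=1)  # predicted digit for each image
        correct += (preds == labels).sum().item()
        total += labels.size(0)
print('Accuracy:', correct / total)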
Output:
0 >> Loss : nan
1 >> Loss : 1.2611686178923354e-44
2 >> Loss : nan
3 >> Loss : 8.407790785948902e-45
4 >> Loss : nan
5 >> Loss : 1.401298464324817e-45
6 >> Loss : nan
7 >> Loss : 0.0
8 >> Loss : nan
9 >> Loss : 1.401298464324817e-45
Note: Losses will be different for different devices.
Custom Optimizers in PyTorch
In PyTorch, an optimizer is a specific implementation of the optimization algorithm that is used to update the parameters of a neural network. The optimizer updates the parameters in such a way that the loss of the neural network is minimized. PyTorch provides various built-in optimizers such as SGD, Adam, Adagrad, etc. that can be used out of the box. However, in some cases, the built-in optimizers may not be suitable for a particular problem or may not perform well. In such cases, one can create their own custom optimizer.
A custom optimizer in PyTorch is a class that inherits from the torch.optim.Optimizer base class. The custom optimizer should implement the __init__ and step methods. The __init__ method is used to initialize the optimizer’s internal state, and the step method is used to update the parameters of the model.
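To illustrate this structure, here is a minimal sketch of a custom optimizer that subclasses torch.optim.Optimizer and performs plain gradient descent. The class name MySGD and its single lr hyperparameter are illustrative choices, not part of the original article.

Python3

import torch
from torch.optim import Optimizer


class MySGD(Optimizer):
    def __init__(self, params, lr=0.01):
        # Store hyperparameters in 'defaults'; the base class copies them
        # into every parameter group.
        defaults = dict(lr=lr)
        super().__init__(params, defaults)

    @torch.no_grad()
    def step(self, closure=None):
        loss = None
        if closure is not None:
            with torch.enable_grad():
                loss = closure()
        for group in self.param_groups:
            for p in group['params']:
                if p.grad is None:
                    continue
                # Plain gradient descent update: p <- p - lr * grad
                p.add_(p.grad, alpha=-group['lr'])
        return loss

Such a class can then be used like any built-in optimizer, for example optimizer = MySGD(model.parameters(), lr=0.01).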