Visualizing Training Progress Using TensorBoard

To visualize the training process of a deep learning model, we can use the SummaryWriter class from the torch.utils.tensorboard module, which integrates seamlessly with TensorBoard, the visualization toolkit developed by the TensorFlow team.

  • Integration: The SummaryWriter class writes event files in exactly the format TensorBoard expects, so no extra glue code is needed.
  • Logging: Inside the training loop, you use SummaryWriter to log metrics such as loss and accuracy (see the minimal sketch after this list).
  • Visualization: TensorBoard provides interactive, real-time visualizations of the logged metrics, allowing you to monitor training progress dynamically.
  • Monitoring: TensorBoard lets you monitor multiple aspects of training, such as learning curves, model graphs, and histograms of weights, providing insights for optimizing your model.
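
Before walking through the full example, here is the minimal logging pattern in isolation (a hedged sketch: the directory name 'demo_logs' and the synthetic loss values are placeholders, not part of the tutorial's code):

from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter('demo_logs')       # event files are written to ./demo_logs
for step in range(100):
    fake_loss = 1.0 / (step + 1)          # stand-in for a real training loss
    writer.add_scalar('Loss/train', fake_loss, step)
writer.close()                            # flush pending events to disk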

Install the TensorBoard library using the following command:

pip install tensorboard

Step 1: Import Libraries

import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter

Step 2: Define a Simple Neural Network

Let's define SimpleNN, a simple neural network with two fully connected layers, along with a forward method that defines the forward pass of the network.

# Define a simple neural network
class SimpleNN(nn.Module):
    def __init__(self):
        super(SimpleNN, self).__init__()
        self.fc1 = nn.Linear(784, 128)   # 28x28 input pixels -> 128 hidden units
        self.fc2 = nn.Linear(128, 10)    # 128 hidden units -> 10 digit classes

    def forward(self, x):
        x = torch.flatten(x, 1)          # flatten each image to a 784-dim vector
        x = torch.relu(self.fc1(x))
        x = self.fc2(x)
        return x
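
As an optional sanity check (not part of the original tutorial), you can pass a dummy MNIST-shaped batch through the network and confirm that it produces one logit per class:

# Optional: verify the forward pass with a fake batch of 4 images
check_model = SimpleNN()
dummy = torch.randn(4, 1, 28, 28)     # (batch, channels, height, width)
print(check_model(dummy).shape)       # expected: torch.Size([4, 10])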

Step 3: Load MNIST Dataset

Let us load the MNIST dataset for training, apply the preprocessing transforms, and divide it into batches. We also take a subset of the first 1000 samples to keep the run fast.

# Load a smaller subset of MNIST dataset
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])

train_dataset = torchvision.datasets.MNIST(root='./data', train=True, transform=transform, download=True)
small_train_dataset = torch.utils.data.Subset(train_dataset, range(1000))  # Subset of first 1000 samples
train_loader = DataLoader(small_train_dataset, batch_size=64, shuffle=True)
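
If you want to verify the loader before training (an optional check), you can inspect one batch; after the Normalize transform the pixel values should sit roughly in [-1, 1]:

# Optional: peek at one batch to check shapes and value range
images, labels = next(iter(train_loader))
print(images.shape)    # torch.Size([64, 1, 28, 28])
print(labels.shape)    # torch.Size([64])
print(images.min().item(), images.max().item())   # approximately -1.0 and 1.0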

Step 4: Initialize Model, Loss Function, and Optimizer

Now, initialize the model. We will use the cross-entropy loss function and the Adam optimizer to update the model parameters.

# Initialize model, loss function, and optimizer
model = SimpleNN()
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

Step 5: Initialize SummaryWriter for Logging

SummaryWriter writes event files into the given log directory ('logs_small' here); TensorBoard reads these files to render its visualizations.

# Initialize SummaryWriter for logging
writer = SummaryWriter('logs_small')
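
TensorBoard can also render the model graph mentioned earlier. SummaryWriter provides an add_graph method that traces the model with a sample input; a minimal sketch using one batch from our loader:

# Optional: log the model graph so it appears in TensorBoard's "Graphs" tab
sample_images, _ = next(iter(train_loader))
writer.add_graph(model, sample_images)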

Step 6: Training Loop

  • Training Loop: Iterates over epochs and batches, performs the forward pass, computes the loss, runs the backward pass, and updates the model parameters.
  • Log Loss and Accuracy: The training loss is logged after every batch and the accuracy after every epoch.

# Training loop
epochs = 5
for epoch in range(epochs):
    running_loss = 0.0
    correct = 0
    total = 0

    for i, (inputs, labels) in enumerate(train_loader):
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()

        # Calculate accuracy
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

        # Log loss
        writer.add_scalar('Loss/train', loss.item(), epoch * len(train_loader) + i)

    # Log accuracy
    accuracy = 100 * correct / total
    writer.add_scalar('Accuracy/train', accuracy, epoch)

    print(f'Epoch [{epoch+1}/{epochs}], Loss: {running_loss / len(train_loader)}, Accuracy: {accuracy}%')

print('Finished Training')
writer.close()
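
If you also want the weight histograms mentioned at the start, SummaryWriter's add_histogram method can be called once per epoch. A sketch (this would go at the end of the epoch loop above, before the print statement and before writer.close()):

# Optional: log a histogram of every parameter once per epoch
for name, param in model.named_parameters():
    writer.add_histogram(name, param, epoch)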

Complete Code:

import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader
from torch.utils.tensorboard import SummaryWriter

# Define a simple neural network
class SimpleNN(nn.Module):
    def __init__(self):
        super(SimpleNN, self).__init__()
        self.fc1 = nn.Linear(784, 128)   # 28x28 input pixels -> 128 hidden units
        self.fc2 = nn.Linear(128, 10)    # 128 hidden units -> 10 digit classes

    def forward(self, x):
        x = torch.flatten(x, 1)          # flatten each image to a 784-dim vector
        x = torch.relu(self.fc1(x))
        x = self.fc2(x)
        return x

# Load a smaller subset of MNIST dataset
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])

train_dataset = torchvision.datasets.MNIST(root='./data', train=True, transform=transform, download=True)
small_train_dataset = torch.utils.data.Subset(train_dataset, range(1000))  # Subset of first 1000 samples
train_loader = DataLoader(small_train_dataset, batch_size=64, shuffle=True)

# Initialize model, loss function, and optimizer
model = SimpleNN()
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Initialize SummaryWriter for logging
writer = SummaryWriter('logs_small')


# Training loop
epochs = 5
for epoch in range(epochs):
    running_loss = 0.0
    correct = 0
    total = 0

    for i, (inputs, labels) in enumerate(train_loader):
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()

        # Calculate accuracy
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

        # Log loss
        writer.add_scalar('Loss/train', loss.item(), epoch * len(train_loader) + i)

    # Log accuracy
    accuracy = 100 * correct / total
    writer.add_scalar('Accuracy/train', accuracy, epoch)

    print(f'Epoch [{epoch+1}/{epochs}], Loss: {running_loss / len(train_loader)}, Accuracy: {accuracy}%')

print('Finished Training')
writer.close()

Output:

Epoch [1/5], Loss: 1.9477481991052628, Accuracy: 37.9%
Epoch [2/5], Loss: 1.207241453230381, Accuracy: 73.4%
Epoch [3/5], Loss: 0.8120987266302109, Accuracy: 80.9%
Epoch [4/5], Loss: 0.6294657941907644, Accuracy: 84.7%
Epoch [5/5], Loss: 0.5223450381308794, Accuracy: 86.5%
Finished Training

Running TensorBoard

To run TensorBoard, open a terminal, navigate to the directory that contains the log folder created in the previous step (logs_small), and run:

tensorboard --logdir=logs_small
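
If you work in a Jupyter notebook rather than a terminal, the TensorBoard notebook extension provides the same dashboard inline (assuming the extension is available in your environment):

%load_ext tensorboard
%tensorboard --logdir logs_small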

Output:

Figure: accuracy during training (logged once per epoch).

Figure: loss during training at each step.

To access TensorBoard, open a browser and go to the address TensorBoard prints on startup (normally http://localhost:6006/).

TensorBoard provides a web-based dashboard with tabs and visualizations representing various training aspects. Scalar metrics visualize values like loss or accuracy over epochs, offering different perspectives on training dynamics. Additionally, TensorBoard can display histograms, embeddings, and specialized visualizations based on logged information.
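
As one illustration of the embedding projector (a hedged sketch, not part of the tutorial's code; it must run before writer.close() is called), you could log the 128-dimensional hidden activations of a batch of images and browse them in TensorBoard's Projector tab:

# Optional sketch: log hidden-layer activations as embeddings
images, labels = next(iter(train_loader))
with torch.no_grad():
    features = torch.relu(model.fc1(torch.flatten(images, 1)))   # 128-d activations
writer.add_embedding(features, metadata=labels.tolist(), label_img=images)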

In this blog, we have covered how to visualize the training progress of a deep learning model in PyTorch using TensorBoard.


