Stepwise Guide to Save and Load Models in PyTorch

Now, we will see how to create a Model using the PyTorch.

Creating Model in PyTorch

To save and load the model, we will first create a Deep-Learning Model for the image classification. This model will classify the images of the handwritten digits from the MNIST Dataset. The below code implements the Convolutional Neural Network for image classification. The data is loaded and transformed into PyTorch Sensors, which are like containers to store the data.

The following code shows the creation of the PyTorch Model.

Importing Necessary Libraries

Python3




import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
from torch.utils.data import DataLoader


Data Transformation

The given code defines a transformation pipeline using torchvision.transforms.Compose for preprocessing image data before feeding it into a PyTorch model.

  • transforms.ToTensor(): Converts the input image (assumed to be in PIL Image format) to a PyTorch tensor. It converts the image data type to torch.FloatTensor and scales the pixel values to the range [0.0, 1.0].
  • transforms.Normalize((0.5,), (0.5,)): Normalizes the tensor image with mean and standard deviation. The provided mean and standard deviation values (0.5,) and (0.5,) respectively are used to normalize each channel of the input tensor. This transformation normalizes the tensor values to be in the range [-1.0, 1.0].

Python3




# Define transformation to apply to the data
data_transform = transforms.Compose([
    transforms.ToTensor(),  # Convert images to PyTorch tensors
    transforms.Normalize((0.5,), (0.5,))  # Normalize the pixel values to range [-1, 1]
])
 
# Download MNIST dataset and apply the transformation
train_dataset = torchvision.datasets.MNIST(root='./data', train=True, transform=data_transform, download=True)
test_dataset = torchvision.datasets.MNIST(root='./data', train=False, transform=data_transform, download=True)
 
 
# Define data loaders to load the data in batches during training and testing
train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=64, shuffle=False)


Defining neural network architecture

  1. Class Definition: The code defines a class SimpleCNN that inherits from nn.Module, which is the base class for all neural network modules in PyTorch. This class represents a simple convolutional neural network (CNN).
  2. Initialization: In the __init__ method, the code defines the layers of the CNN. It includes two convolutional layers (conv1_layer and conv2_layer) with specified kernel sizes and padding, and two fully connected layers (fc1_layer and fc2_layer) with specified input and output sizes.
  3. Forward Pass: The forward method defines the forward pass of the network. It applies a ReLU activation function after each convolutional layer and uses max pooling with a kernel size of 2 and stride of 2 to downsample the feature maps. The output of the second convolutional layer is flattened before being passed to the fully connected layers.
  4. View Operation: The view operation reshapes the output of the second convolutional layer to be compatible with the input size of the first fully connected layer. The -1 argument in view indicates that the size of that dimension should be inferred based on the other dimensions.
  5. Model Instance: Finally, an instance of the SimpleCNN class is created and assigned to the variable cnn_model. This instance represents the actual neural network that can be trained and used for inference.

Python3




# Here we are adding convolution layer and fully connected layers in neural network
class SimpleCNN(nn.Module):
    def __init__(self):
        super(SimpleCNN, self).__init__()
        self.conv1_layer = nn.Conv2d(1, 16, kernel_size=3, padding=1)
        self.conv2_layer = nn.Conv2d(16, 32, kernel_size=3, padding=1)
        self.fc1_layer = nn.Linear(32 * 7 * 7, 128)
        self.fc2_layer = nn.Linear(128, 10)
 
    # Adding ReLU Activation function Max Pooling Layer
    def forward(self, inputs):
        new_input = torch.relu(self.conv1_layer(inputs))
        new_input = torch.max_pool2d(new_input, kernel_size=2, stride=2)
        new_input = torch.relu(self.conv2_layer(new_input))
        new_input = torch.max_pool2d(new_input, kernel_size=2, stride=2)
        new_input = new_input.view(-1, 32 * 7 * 7)
        new_input = torch.relu(self.fc1_layer(new_input))
        new_input = self.fc2_layer(new_input)
        return new_input
 
 
# Creating Model Instance
cnn_model = SimpleCNN()


Loss Function and Optimizer

  1. Loss Function: nn.CrossEntropyLoss() is used as the loss function. This loss function is commonly used for classification problems with multiple classes. It calculates the cross-entropy loss between the predicted class probabilities and the actual class labels.
  2. Optimizer: optim.Adam is used as the optimizer. Adam is a popular optimization algorithm that computes adaptive learning rates for each parameter. It is well-suited for training deep neural networks. The optimizer is initialized with the parameters of the cnn_model and a learning rate of 0.001.

Python3




# Define loss function and optimizer
loss_func = nn.CrossEntropyLoss()
optimizer = optim.Adam(cnn_model.parameters(), lr=0.001)


Training the model

The code implements the following steps:

  1. Outer Loop (Epochs): The code iterates over 5 epochs using a for loop. An epoch is a single pass through the entire dataset.
  2. Inner Loop (Batches): Within each epoch, the code iterates over batches of data using train_loader, which presumably contains batches of input data (inputs) and their corresponding labels (labels).
  3. Zero Gradients: Before the backward pass (loss.backward()), optimizer.zero_grad() is called to zero out the gradients of the model parameters. This is necessary because PyTorch accumulates gradients by default.
  4. Forward and Backward Pass:
    • outputs = cnn_model(inputs) performs the forward pass, where the model processes the input data to generate predictions (outputs).
    • loss = loss_func(outputs, labels) calculates the loss between the predicted outputs and the actual labels.
    • loss.backward() computes the gradients of the loss with respect to the model parameters, enabling backpropagation.
    • optimizer.step() updates the model parameters based on the computed gradients, using the optimization algorithm (Adam in this case) to adjust the weights.
  5. Loss Calculation: Within the inner loop, running_loss accumulates the total loss across batches. At the end of each epoch, the average loss per batch is printed to monitor the training progress.

Python3




# Train model
for epoch in range(5):  # Train for 5 epochs
    running_loss = 0.0
    for inputs, labels in train_loader:
        optimizer.zero_grad()  # Zero the gradients
        outputs = cnn_model(inputs)  # Forward pass
        loss = loss_func(outputs, labels)  # Calculate the loss
        loss.backward()  # Backward pass
        optimizer.step()  # Update weights
 
 
        running_loss += loss.item()
    print(f"Epoch {epoch+1}, Loss: {running_loss/len(train_loader)}")


Output:

Epoch 1, Loss: 0.22154594235159933
Epoch 2, Loss: 0.05766747533348697
Epoch 3, Loss: 0.04144403319505514
Epoch 4, Loss: 0.029859573355312946
Epoch 5, Loss: 0.024109310584392515

Testing The Model

Python




# Test model
correct_predictions = 0
total_samples = 0
with torch.no_grad():
    for inputs, labels in test_loader:
        outputs = cnn_model(inputs)
        _, predicted_labels = torch.max(outputs.data, 1)
        total_samples += labels.size(0)
        correct_predictions += (predicted_labels == labels).sum().item()
 
print(f"Accuracy of test set: {100 * correct_predictions / total_samples}%")


Output:

Accuracy of test set: 99.16%

Save and Load Models in PyTorch

It often happens that we need to use the already-trained models to perform some operations in our development environment. In this case, would you create the model again and again? Or, you will save the model somewhere else and load it as per the requirement. You would definitely choose the second option. So in this article, we will see how to implement the concept of saving and loading the models using PyTorch.

Table of Content

  • What is PyTorch?
  • Stepwise Guide to Save and Load Models in PyTorch
  • Saving and Loading Model
  • Frequently Asked Questions

Similar Reads

What is PyTorch?

PyTorch is an open-source Machine Learning Library that works on the dynamic computation graph. In the static computation approach, the models are predefined before the execution. But in dynamic computation which PyTorch follows, the structure of the graph in the Neural Network can change during the execution based on the input data. Hence, It allows to creation and training the Neural Networks to extract hidden patterns from the data....

Stepwise Guide to Save and Load Models in PyTorch

Now, we will see how to create a Model using the PyTorch....

Saving and Loading Model

...

Conclusion

...

Frequently Asked Questions

...

Contact Us