Architecture of LeNet-5



The architecture of LeNet-5 contains 7 layers, not counting the input layer. Here is a detailed breakdown of the LeNet-5 architecture:

1. Input Layer

  • Input Size: 32×32 pixels.
  • The input is larger than the largest character in the database, which is at most 20×20 pixels, centered in a 28×28 field. The larger input size ensures that distinctive features such as stroke endpoints or corners can appear in the center of the receptive field of the highest-level feature detectors.
  • Normalization: Input pixel values are normalized so that the background (white) corresponds to a value of −0.1 and the foreground (black) corresponds to a value of 1.175. This normalization makes the mean input roughly 0 and the variance roughly 1, which accelerates the learning process.
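This normalization is a simple affine map; a minimal sketch, assuming pixels are encoded as ink intensities in [0, 1] (0 = white background, 1 = full black ink — a hypothetical encoding, not specified in the article):

```python
def normalize(pixels):
    """Map background ink level 0 to -0.1 and foreground level 1 to 1.175."""
    # -0.1 + 1.275 * 0 = -0.1 (background); -0.1 + 1.275 * 1 = 1.175 (foreground)
    return [-0.1 + 1.275 * p for p in pixels]

print(normalize([0.0, 1.0]))  # [-0.1, 1.175]
```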

2. Layer C1 (Convolutional Layer)

  • Feature Maps: 6 feature maps.
  • Connections: Each unit is connected to a 5×5 neighborhood in the input. The feature maps are 28×28 rather than 32×32 so that the 5×5 receptive fields never fall outside the input boundary.
  • Parameters: 156 trainable parameters and 122,304 connections.
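These counts follow from weight sharing: each of the 6 maps has one 5×5 kernel plus a bias, and that small set of parameters is reused at every output position. A quick check of the arithmetic:

```python
kernel = 5 * 5
maps, out_h, out_w = 6, 28, 28

params = maps * (kernel + 1)          # one shared 5x5 kernel + bias per map
connections = params * out_h * out_w  # shared weights reused at every position

print(params, connections)  # 156 122304
```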

3. Layer S2 (Subsampling Layer)

  • Feature Maps: 6 feature maps.
  • Size: 14×14 (each unit connected to a 2×2 neighborhood in C1).
  • Operation: Each unit sums its four inputs, multiplies the sum by a trainable coefficient, adds a trainable bias, and passes the result through a sigmoid function.
  • Parameters: 12 trainable parameters and 5,880 connections.
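The S2 operation and its bookkeeping can be sketched in plain Python; the coefficient and bias values below are illustrative placeholders, not trained values:

```python
import math

def s2_unit(patch_2x2, coeff, bias):
    """One subsampling unit: sum a 2x2 neighborhood, scale, shift, squash."""
    s = sum(patch_2x2) * coeff + bias
    return 1.0 / (1.0 + math.exp(-s))   # sigmoid squashing

# Bookkeeping: 2 trainable parameters per map (coefficient + bias);
# each output unit has 4 inputs + 1 bias connection.
params = 6 * 2
connections = 6 * 14 * 14 * (4 + 1)
print(params, connections)  # 12 5880
```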

Partial Connectivity: C3 is not fully connected to S2, which limits the number of connections and breaks symmetry, forcing feature maps to learn different, complementary features.

4. Layer C3 (Convolutional Layer)

  • Feature Maps: 16 feature maps.
  • Connections: Each unit is connected to several 5×5 neighborhoods at identical locations in a subset of S2’s feature maps.
  • Parameters: 1,516 trainable parameters and 151,600 connections; the partial connectivity to S2 keeps both counts manageable while forcing the maps to learn different features.
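In the original connection scheme, 6 of C3's maps each read from 3 of S2's maps, 9 read from 4, and the last reads from all 6. Assuming one bias per map and 10×10 output maps, this reproduces the stated counts:

```python
# Fan-in of each C3 map under the paper's connection table.
fan_ins = [3] * 6 + [4] * 9 + [6] * 1

kernel = 5 * 5
params = sum(f * kernel + 1 for f in fan_ins)  # one bias per map
connections = params * 10 * 10                 # 10x10 output positions

print(params, connections)  # 1516 151600
```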

5. Layer S4 (Subsampling Layer)

  • Feature Maps: 16 feature maps.
  • Size: 5×5 (each unit connected to a 2×2 neighborhood in C3).
  • Parameters: 32 trainable parameters and 2,000 connections.

6. Layer C5 (Convolutional Layer)

  • Feature Maps: 120 feature maps.
  • Size: 1×1 (each unit is connected to a 5×5 neighborhood on all 16 of S4’s feature maps; because S4 is itself 5×5, C5 amounts to a fully connected layer).
  • Parameters: 48,120 trainable parameters and 48,120 connections.
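Because the output is 1×1, every weight is used exactly once, so the connection count equals the parameter count. Checking the arithmetic:

```python
# 120 maps, each seeing all 16 of S4's 5x5 maps, plus one bias each.
params = 120 * (16 * 5 * 5 + 1)   # 401 weights + bias per map
connections = params * 1 * 1      # 1x1 output: connections == params
print(params, connections)  # 48120 48120
```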

7. Layer F6 (Fully Connected Layer)

  • Units: 84 units.
  • Connections: Each unit is fully connected to C5, resulting in 10,164 trainable parameters.
  • Activation: Uses a scaled hyperbolic tangent function [Tex]f(a) = A\tanh(Sa)[/Tex], where A = 1.7159 and S = 2/3.
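These particular constants make f(±1) ≈ ±1, so unit targets sit in the near-linear part of the curve. A quick sketch:

```python
import math

def f(a, A=1.7159, S=2.0 / 3.0):
    """Scaled hyperbolic tangent used by F6: f(a) = A * tanh(S * a)."""
    return A * math.tanh(S * a)

print(round(f(1.0), 3))  # approximately 1.0
```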

8. Output Layer

In the output layer of LeNet-5, each class is represented by a Euclidean Radial Basis Function (RBF) unit. The output of each RBF unit [Tex]y_i[/Tex] is computed as:

[Tex]y_i = \sum_{j} (x_j - w_{ij})^2[/Tex]

In this equation:

  • [Tex]x_j[/Tex] represents the inputs to the RBF unit (the activations of F6).
  • [Tex]w_{ij}[/Tex] represents the components of the unit’s parameter vector.
  • The summation runs over all inputs to the RBF unit.

In essence, the output of each RBF unit is the squared Euclidean distance between its input vector and its parameter vector. The larger the distance between the input pattern and the parameter vector, the larger the RBF output. This output can be interpreted as a penalty term measuring how well the input pattern fits the model of the class associated with the RBF unit.
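A minimal sketch of this computation, using toy 3-dimensional vectors in place of F6's 84 activations; classification picks the class whose unit produces the smallest output:

```python
def rbf_output(x, w):
    """Squared Euclidean distance between input x and parameter vector w."""
    return sum((xj - wij) ** 2 for xj, wij in zip(x, w))

x = [1.0, -1.0, 1.0]
# Hypothetical per-class parameter vectors (the real ones encode bitmaps).
prototypes = {0: [1.0, -1.0, 1.0], 1: [-1.0, 1.0, -1.0]}
scores = {c: rbf_output(x, w) for c, w in prototypes.items()}
print(min(scores, key=scores.get))  # 0  (closest prototype wins)
```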

Background of LeNet-5

In the late 1990s, Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner created a convolutional neural network (CNN) architecture called LeNet. The LeNet-5 variant was developed to recognize handwritten and machine-printed characters, an application that showcased the potential of deep learning in practice. This article explores the LeNet-5 architecture, examining each component and its contribution to the network as a whole.

Introduction to LeNet-5

LeNet-5 is a convolutional neural network (CNN) architecture that introduced several features and innovations that have since become standard in deep learning. It demonstrated the effectiveness of CNNs for image recognition tasks and established key concepts such as convolution, pooling, and hierarchical feature extraction that underpin modern deep learning models.


Detailed Explanation of the Layers

  • Convolutional Layers (Cx): These layers apply convolution operations to the input, using multiple filters to extract different features. The filters slide over the input image, computing the dot product between the filter weights and the input pixels. This process captures spatial hierarchies of features, such as edges and textures.
  • Subsampling Layers (Sx): These layers perform pooling operations (average pooling in the case of LeNet-5) to reduce the spatial dimensions of the feature maps. This helps to control overfitting, reduce the computational load, and make the representation more compact.
  • Fully Connected Layers (Fx): These layers are densely connected: each neuron is connected to every neuron in the previous layer, allowing the network to combine features learned in earlier layers to make final predictions.
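Putting the layer types together, the feature-map sizes can be traced end to end, assuming 5×5 "valid" convolutions (no padding) and 2×2 stride-2 subsampling:

```python
def conv(n, k=5):
    """Output size of a 'valid' k x k convolution on an n x n map."""
    return n - k + 1

def pool(n):
    """Output size of 2x2 stride-2 subsampling."""
    return n // 2

sizes = [32]                       # input
sizes.append(conv(sizes[-1]))      # C1: 28
sizes.append(pool(sizes[-1]))      # S2: 14
sizes.append(conv(sizes[-1]))      # C3: 10
sizes.append(pool(sizes[-1]))      # S4: 5
sizes.append(conv(sizes[-1]))      # C5: 1
print(sizes)  # [32, 28, 14, 10, 5, 1]
```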
