Architecture of LeNet-5



The architecture of LeNet-5 contains 7 layers, not counting the input layer. Here is a detailed breakdown of the LeNet-5 architecture:

1. Input Layer

  • Input Size: 32×32 pixels.
  • The input is larger than the largest character in the database, which is at most 20×20 pixels, centered in a 28×28 field. The larger input size ensures that distinctive features such as stroke endpoints or corners can appear in the center of the receptive field of the highest-level feature detectors.
  • Normalization: Input pixel values are normalized so that the background (white) corresponds to a value of −0.1 and the foreground (black) corresponds to a value of 1.175. This normalization makes the mean input roughly 0 and the variance roughly 1, which accelerates the learning process.
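This normalization is a simple affine map; a minimal sketch, assuming pixels are encoded as ink intensities in [0, 1] (0 = white background, 1 = full black ink — a hypothetical encoding, not specified in the article):

```python
def normalize(pixels):
    """Map background ink level 0 to -0.1 and foreground level 1 to 1.175."""
    # -0.1 + 1.275 * 0 = -0.1 (background); -0.1 + 1.275 * 1 = 1.175 (foreground)
    return [-0.1 + 1.275 * p for p in pixels]

print(normalize([0.0, 1.0]))  # [-0.1, 1.175]
```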

2. Layer C1 (Convolutional Layer)

  • Feature Maps: 6 feature maps.
  • Connections: Each unit is connected to a 5×5 neighborhood in the input. The feature maps are 28×28 rather than 32×32 so that the 5×5 receptive fields never fall outside the input boundary.
  • Parameters: 156 trainable parameters and 122,304 connections.
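These counts follow from weight sharing: each of the 6 maps has one 5×5 kernel plus a bias, and that small set of parameters is reused at every output position. A quick check of the arithmetic:

```python
kernel = 5 * 5
maps, out_h, out_w = 6, 28, 28

params = maps * (kernel + 1)          # one shared 5x5 kernel + bias per map
connections = params * out_h * out_w  # shared weights reused at every position

print(params, connections)  # 156 122304
```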

3. Layer S2 (Subsampling Layer)

  • Feature Maps: 6 feature maps.
  • Size: 14×14 (each unit connected to a 2×2 neighborhood in C1).
  • Operation: Each unit sums its four inputs, multiplies the sum by a trainable coefficient, adds a trainable bias, and passes the result through a sigmoid function.
  • Parameters: 12 trainable parameters and 5,880 connections.
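The S2 operation and its bookkeeping can be sketched in plain Python; the coefficient and bias values below are illustrative placeholders, not trained values:

```python
import math

def s2_unit(patch_2x2, coeff, bias):
    """One subsampling unit: sum a 2x2 neighborhood, scale, shift, squash."""
    s = sum(patch_2x2) * coeff + bias
    return 1.0 / (1.0 + math.exp(-s))   # sigmoid squashing

# Bookkeeping: 2 trainable parameters per map (coefficient + bias);
# each output unit has 4 inputs + 1 bias connection.
params = 6 * 2
connections = 6 * 14 * 14 * (4 + 1)
print(params, connections)  # 12 5880
```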

Partial Connectivity: C3 is not fully connected to S2, which limits the number of connections and breaks symmetry, forcing feature maps to learn different, complementary features.

4. Layer C3 (Convolutional Layer)

  • Feature Maps: 16 feature maps.
  • Connections: Each unit is connected to several 5×5 neighborhoods at identical locations in a subset of S2’s feature maps.
  • Parameters: 1,516 trainable parameters and 151,600 connections; the partial connectivity to S2 keeps both counts manageable while forcing the maps to learn different features.
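In the original connection scheme, 6 of C3's maps each read from 3 of S2's maps, 9 read from 4, and the last reads from all 6. Assuming one bias per map and 10×10 output maps, this reproduces the stated counts:

```python
# Fan-in of each C3 map under the paper's connection table.
fan_ins = [3] * 6 + [4] * 9 + [6] * 1

kernel = 5 * 5
params = sum(f * kernel + 1 for f in fan_ins)  # one bias per map
connections = params * 10 * 10                 # 10x10 output positions

print(params, connections)  # 1516 151600
```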

5. Layer S4 (Subsampling Layer)

  • Feature Maps: 16 feature maps.
  • Size: 5×5 (each unit connected to a 2×2 neighborhood in C3).
  • Parameters: 32 trainable parameters and 2,000 connections.

6. Layer C5 (Convolutional Layer)

  • Feature Maps: 120 feature maps.
  • Size: 1×1 (each unit is connected to a 5×5 neighborhood on all 16 of S4’s feature maps; because S4 is itself 5×5, C5 amounts to a fully connected layer).
  • Parameters: 48,120 trainable parameters and 48,120 connections.
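Because the output is 1×1, every weight is used exactly once, so the connection count equals the parameter count. Checking the arithmetic:

```python
# 120 maps, each seeing all 16 of S4's 5x5 maps, plus one bias each.
params = 120 * (16 * 5 * 5 + 1)   # 401 weights + bias per map
connections = params * 1 * 1      # 1x1 output: connections == params
print(params, connections)  # 48120 48120
```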

7. Layer F6 (Fully Connected Layer)

  • Units: 84 units.
  • Connections: Each unit is fully connected to C5, resulting in 10,164 trainable parameters.
  • Activation: Uses a scaled hyperbolic tangent function [Tex]f(a) = A\tanh(Sa)[/Tex], where A = 1.7159 and S = 2/3.
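These particular constants make f(±1) ≈ ±1, so unit targets sit in the near-linear part of the curve. A quick sketch:

```python
import math

def f(a, A=1.7159, S=2.0 / 3.0):
    """Scaled hyperbolic tangent used by F6: f(a) = A * tanh(S * a)."""
    return A * math.tanh(S * a)

print(round(f(1.0), 3))  # approximately 1.0
```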

8. Output Layer

In the output layer of LeNet-5, each class is represented by a Euclidean Radial Basis Function (RBF) unit. The output of each RBF unit [Tex]y_i[/Tex] is computed as:

[Tex]y_i = \sum_{j} (x_j - w_{ij})^2[/Tex]

In this equation:

  • [Tex]x_j[/Tex] represents the inputs to the RBF unit (the activations of F6).
  • [Tex]w_{ij}[/Tex] represents the components of the unit’s parameter vector.
  • The summation runs over all inputs to the RBF unit.

In essence, the output of each RBF unit is the squared Euclidean distance between its input vector and its parameter vector. The larger the distance between the input pattern and the parameter vector, the larger the RBF output. This output can be interpreted as a penalty term measuring how well the input pattern fits the model of the class associated with the RBF unit.
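A minimal sketch of this computation, using toy 3-dimensional vectors in place of F6's 84 activations; classification picks the class whose unit produces the smallest output:

```python
def rbf_output(x, w):
    """Squared Euclidean distance between input x and parameter vector w."""
    return sum((xj - wij) ** 2 for xj, wij in zip(x, w))

x = [1.0, -1.0, 1.0]
# Hypothetical per-class parameter vectors (the real ones encode bitmaps).
prototypes = {0: [1.0, -1.0, 1.0], 1: [-1.0, 1.0, -1.0]}
scores = {c: rbf_output(x, w) for c, w in prototypes.items()}
print(min(scores, key=scores.get))  # 0  (closest prototype wins)
```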

Background of LeNet-5

In the late 1990s, Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner created a convolutional neural network (CNN) architecture called LeNet. The LeNet-5 variant was developed to recognize handwritten and machine-printed characters, an application that showcased the potential of deep learning in practice. This article explores the LeNet-5 architecture, examining each component and its contribution to the network as a whole.

Introduction to LeNet-5

LeNet-5 is a convolutional neural network (CNN) architecture that introduced several features and innovations that have since become standard in deep learning. It demonstrated the effectiveness of CNNs for image recognition tasks and established key concepts such as convolution, pooling, and hierarchical feature extraction that underpin modern deep learning models.


Detailed Explanation of the Layers

  • Convolutional Layers (Cx): These layers apply convolution operations to the input, using multiple filters to extract different features. The filters slide over the input image, computing the dot product between the filter weights and the input pixels. This process captures spatial hierarchies of features, such as edges and textures.
  • Subsampling Layers (Sx): These layers perform pooling operations (average pooling in the case of LeNet-5) to reduce the spatial dimensions of the feature maps. This helps to control overfitting, reduce the computational load, and make the representation more compact.
  • Fully Connected Layers (Fx): These layers are densely connected: each neuron is connected to every neuron in the previous layer, allowing the network to combine features learned in earlier layers to make final predictions.
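Putting the layer types together, the feature-map sizes can be traced end to end, assuming 5×5 "valid" convolutions (no padding) and 2×2 stride-2 subsampling:

```python
def conv(n, k=5):
    """Output size of a 'valid' k x k convolution on an n x n map."""
    return n - k + 1

def pool(n):
    """Output size of 2x2 stride-2 subsampling."""
    return n // 2

sizes = [32]                       # input
sizes.append(conv(sizes[-1]))      # C1: 28
sizes.append(pool(sizes[-1]))      # S2: 14
sizes.append(conv(sizes[-1]))      # C3: 10
sizes.append(pool(sizes[-1]))      # S4: 5
sizes.append(conv(sizes[-1]))      # C5: 1
print(sizes)  # [32, 28, 14, 10, 5, 1]
```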
