Implementing Faces Dataset Decompositions
1. Import necessary libraries:
Python3
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_lfw_people
from sklearn.decomposition import PCA
The necessary libraries are imported in this step: NumPy for numerical operations, Matplotlib for charting, and Scikit-Learn for PCA implementation and access to the Faces dataset.
2. Load the Faces dataset:
Python3
faces_data = fetch_lfw_people(min_faces_per_person=70, resize=0.4)
The code uses Scikit-Learn's fetch_lfw_people function to load the Labeled Faces in the Wild (LFW) dataset, downloading it on first use. Setting min_faces_per_person=70 keeps only the people who have at least 70 images, and resize=0.4 scales each image to 40% of its original height and width.
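As a rough check on what resize=0.4 means for the number of features, the sketch below works through the arithmetic. It assumes that a cropped LFW image is 125×94 pixels before resizing and that the scaled sizes are truncated to whole pixels; both are assumptions about the loader's defaults and are not stated in the text above.

```python
# Assumed size of one cropped LFW image before resizing (height, width).
default_h, default_w = 125, 94
resize = 0.4

# Assumption: the scaled dimensions are truncated to whole pixels.
h, w = int(default_h * resize), int(default_w * resize)
n_features_expected = h * w  # one feature per pixel

print(h, w, n_features_expected)  # 50 37 1850
```

If the assumptions hold, each face becomes a vector of 1850 pixel intensities, which is the n_features value computed in the next step.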
3. Preprocess the data:
Python3
X = faces_data.data
n_samples, n_features = X.shape
In this stage, the feature matrix X is extracted from the dataset. Each row of X is one face image flattened into a vector of pixel intensities, so the number of samples (n_samples) and the number of features (n_features) can be read directly from its shape.
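The relationship between the image array and the feature matrix can be illustrated with a small stand-in. The sketch below uses a synthetic array in place of faces_data.images and assumes the feature matrix is simply each image flattened row by row; the names images, X_demo, and first_image are made up for this example.

```python
import numpy as np

# Hypothetical stand-in for faces_data.images: 6 tiny "images" of 4x3 pixels.
rng = np.random.default_rng(0)
images = rng.random((6, 4, 3))

# Flatten each image into one row, giving an (n_samples, n_features) matrix.
X_demo = images.reshape(len(images), -1)
n_samples_demo, n_features_demo = X_demo.shape

# Reshaping a row with the image height and width recovers the image.
h, w = images.shape[1], images.shape[2]
first_image = X_demo[0].reshape(h, w)

print(X_demo.shape)  # (6, 12)
```

The same reshape with the real image height and width is what later steps use to turn vectors back into pictures.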
4. Apply PCA for decomposition:
Python3
n_components = 150
pca = PCA(n_components=n_components, svd_solver='randomized', whiten=True).fit(X)
The code sets the number of components (n_components) to 150 and fits PCA to the data with the fit method. For efficiency, it uses the randomized SVD solver, and whiten=True rescales the projected components so that each has unit variance.
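To make the effect of whitening concrete, here is a minimal NumPy sketch of PCA through an exact SVD on synthetic data. It is a conceptual illustration, not Scikit-Learn's implementation, and it does not use the randomized solver; the data and all variable names are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic data whose columns have very different scales (illustration only).
X_demo = rng.normal(size=(200, 6)) * np.array([10.0, 5.0, 2.0, 1.0, 0.5, 0.1])

k = 3
mean = X_demo.mean(axis=0)
Xc = X_demo - mean                                  # centre the data
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)   # exact SVD

components = Vt[:k]                                 # top-k principal directions
explained_variance = S[:k] ** 2 / (len(X_demo) - 1)

scores = Xc @ components.T                          # plain PCA projection
scores_whitened = scores / np.sqrt(explained_variance)

print(scores.var(axis=0, ddof=1))           # differs a lot between components
print(scores_whitened.var(axis=0, ddof=1))  # approximately 1 for every component
```

Without whitening, the first component dominates the scale of the projected data; after whitening, all components contribute on an equal footing.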
5. Visualize eigenfaces:
Python3
eigenfaces = pca.components_.reshape(
    (n_components, faces_data.images.shape[1], faces_data.images.shape[2]))
In this stage, each principal component, which is a vector of length n_features, is reshaped back to the image height and width. The resulting images are called eigenfaces, and they represent the directions of highest variance in the original face images.
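Two properties of the components are easy to check on a small example: each row has one entry per pixel, so it reshapes to an image, and the rows are orthonormal. The sketch below uses synthetic data and a made-up 8×6 image size instead of the faces.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
h, w = 8, 6                               # hypothetical image size
X_demo = rng.random((60, h * w))          # 60 synthetic "images", flattened

pca_demo = PCA(n_components=5).fit(X_demo)

# Each row of components_ has h * w entries, so it reshapes to an image.
eigen_images = pca_demo.components_.reshape((5, h, w))

# The rows are orthonormal: their Gram matrix is the identity.
gram = pca_demo.components_ @ pca_demo.components_.T
print(eigen_images.shape, np.allclose(gram, np.eye(5)))
```

The components are also ordered by explained variance, which is why plotting the first few eigenfaces shows the most dominant patterns.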
6. Plot the first 10 eigenfaces:
Python3
plt.figure(figsize=(10, 3))
for i in range(10):
    plt.subplot(2, 5, i + 1)
    plt.imshow(eigenfaces[i], cmap='gray')
    plt.title(f"Eigenface {i + 1}")
plt.show()
Output:
The code uses Matplotlib to plot the first ten eigenfaces, visualizing them in a 2×5 grid.
7. Reconstruct faces using a subset of principal components:
Python3
n_faces = 5
random_faces_indices = np.random.randint(0, n_samples, n_faces)
random_faces = X[random_faces_indices]
Five faces are chosen at random from the dataset in this section to illustrate the reconstruction procedure.
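Note that np.random.randint samples with replacement, so the same face can occasionally be picked twice. If distinct faces are wanted, one possible alternative is to sample without replacement, as in the sketch below; the dataset size and the seed are placeholder values chosen for illustration.

```python
import numpy as np

n_samples_demo = 20   # hypothetical dataset size
n_faces_demo = 5

rng = np.random.default_rng(42)  # seeded so the selection is reproducible
# replace=False guarantees that no index is drawn twice.
indices = rng.choice(n_samples_demo, size=n_faces_demo, replace=False)

print(indices)
```

The resulting index array can be used in the same way as random_faces_indices to select rows of X.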
8. Transform faces into principal components:
Python3
faces_pca = pca.transform(random_faces)
Using the previously fitted PCA model, the chosen faces are projected into the space of principal components.
9. Reconstruct faces from principal components:
Python3
faces_reconstructed = pca.inverse_transform(faces_pca)
The inverse_transform method maps the principal-component representation back to pixel space, giving an approximate reconstruction of each face.
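What transform and inverse_transform compute can be written out by hand. The sketch below, on synthetic data, reproduces both steps for a whitened PCA using the fitted attributes mean_, components_, and explained_variance_. It is meant as an illustration of the arithmetic, not as a description of the library's internal code.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X_demo = rng.random((80, 30))                      # synthetic data

pca_demo = PCA(n_components=10, whiten=True).fit(X_demo)
scale = np.sqrt(pca_demo.explained_variance_)

# Forward: centre, project onto the components, then divide by the scale.
Z_manual = (X_demo[:5] - pca_demo.mean_) @ pca_demo.components_.T / scale
Z = pca_demo.transform(X_demo[:5])

# Backward: undo the scaling, map back to feature space, add the mean.
X_manual = (Z * scale) @ pca_demo.components_ + pca_demo.mean_
X_back = pca_demo.inverse_transform(Z)

print(np.allclose(Z, Z_manual), np.allclose(X_back, X_manual))
```

Because only 10 of the 30 possible directions are kept here, X_back is close to, but not identical to, the original rows. The same holds for faces reconstructed from 150 components.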
10. Visualize original and reconstructed faces:
Python3
plt.figure(figsize=(10, 3))
for i in range(n_faces):
    # Top row: original faces
    plt.subplot(2, n_faces, i + 1)
    plt.imshow(random_faces[i].reshape(
        faces_data.images.shape[1], faces_data.images.shape[2]), cmap='gray')
    plt.title("Original")
    # Bottom row: reconstructed faces
    plt.subplot(2, n_faces, i + 1 + n_faces)
    plt.imshow(faces_reconstructed[i].reshape(
        faces_data.images.shape[1], faces_data.images.shape[2]), cmap='gray')
    plt.title("Reconstructed")
plt.show()
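Output: a figure with the five original faces in the top row and their PCA reconstructions in the bottom row.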
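The visual comparison can be complemented with a number. The sketch below measures the mean squared reconstruction error for several values of n_components. It uses synthetic random data rather than the faces, and the component counts are arbitrary choices for illustration, so only the trend carries over, not the values.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X_demo = rng.random((100, 40))                     # synthetic data

errors = {}
for k in (2, 10, 30):
    pca_k = PCA(n_components=k).fit(X_demo)
    X_rec = pca_k.inverse_transform(pca_k.transform(X_demo))
    errors[k] = float(np.mean((X_demo - X_rec) ** 2))  # mean squared error

for k, err in errors.items():
    print(k, round(err, 5))
```

Keeping more components lowers the error on the data the model was fitted to, at the cost of a less compact representation.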
Similarly, we can perform Non-Negative Matrix Factorization (NMF).
Non-Negative Matrix Factorization (NMF)
Non-Negative Matrix Factorization (NMF) is a mathematical technique used in machine learning and data analysis for dimensionality reduction and feature extraction. It is particularly useful when the data involved has non-negative values, such as images, audio spectrograms, or text data represented as term-document matrices.
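The idea can be sketched on a small synthetic matrix: NMF approximates a non-negative matrix X as a product W × H in which both factors are also non-negative. The data, the choice of 4 components, and the init, max_iter, and random_state settings below are assumptions made for this illustration only.

```python
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
# Synthetic non-negative data built from 4 hidden non-negative parts.
W_true = rng.random((50, 4))
H_true = rng.random((4, 20))
X_demo = W_true @ H_true

nmf_demo = NMF(n_components=4, init="nndsvda", max_iter=1000, random_state=0)
W = nmf_demo.fit_transform(X_demo)   # per-sample weights, shape (50, 4)
H = nmf_demo.components_             # learned parts, shape (4, 20)

# Both factors are non-negative, and their product approximates X_demo.
rel_err = np.linalg.norm(X_demo - W @ H) / np.linalg.norm(X_demo)
print(W.min() >= 0, H.min() >= 0, round(float(rel_err), 3))
```

Unlike PCA components, which contain both positive and negative values, the rows of H can only add intensity, which is why NMF components of images tend to look like additive parts.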
The following code snippet demonstrates how NMF can be used for facial image decomposition and reconstruction. The visualizations help in understanding the learned facial features and how well the NMF model reconstructs faces from the reduced feature space. Adjusting parameters such as the number of components (n_components) can affect the quality of the reconstruction.
Python3
from sklearn.decomposition import NMF

nmf = NMF(n_components=n_components, tol=5e-3)
nmf.fit(X)  # original non-negative dataset

# Visualize the NMF components as images
nmf_faces = nmf.components_.reshape(
    (n_components, faces_data.images.shape[1], faces_data.images.shape[2]))

# Plot the first 10 faces
plt.figure(figsize=(10, 3))
for i in range(10):
    plt.subplot(2, 5, i + 1)
    plt.imshow(nmf_faces[i], cmap='gray')
    plt.title(f"NMF face {i + 1}")
plt.show()

# Reconstruct faces
n_faces = 5
random_faces_indices = np.random.randint(0, n_samples, n_faces)
random_faces = X[random_faces_indices]

# Transform faces
faces_nmf = nmf.transform(random_faces)

# Reconstruct faces
faces_reconstructed = nmf.inverse_transform(faces_nmf)

# Visualize original and reconstructed faces
plt.figure(figsize=(10, 3))
for i in range(n_faces):
    plt.subplot(2, n_faces, i + 1)
    plt.imshow(random_faces[i].reshape(
        faces_data.images.shape[1], faces_data.images.shape[2]), cmap='gray')
    plt.title("Original")
    plt.subplot(2, n_faces, i + 1 + n_faces)
    plt.imshow(faces_reconstructed[i].reshape(
        faces_data.images.shape[1], faces_data.images.shape[2]), cmap='gray')
    plt.title("Reconstructed")
plt.show()
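Output: two figures, the first showing the first ten NMF components in a 2×5 grid, and the second showing the five original faces in the top row with their NMF reconstructions in the bottom row.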
Faces dataset decompositions in Scikit Learn
The Faces dataset is a collection of labeled face images that can be loaded through Scikit-Learn, a widely used machine learning toolkit. It is frequently used for face recognition, facial expression analysis, and other computer vision applications. The dataset comes from the Labeled Faces in the Wild (LFW) benchmark.