SVD and Eigenfaces
Eigenface is a term first introduced by Sirovich and Kirby in 1987. It refers to a set of basis features, obtained by principal component analysis (PCA) building on the singular value decomposition (SVD), that projects the high-dimensional space of face images down to a lower dimension. A set of $r$ basis features, called eigenpictures (in this case, eigenfaces), can then be linearly combined to approximate any of the $n$ face images, where $r \ll n$.
Eigenfaces are the fundamental concept behind facial recognition. Later on, various preprocessing algorithms, such as gamma correction for illumination preprocessing, were built on top of them to improve accuracy.
SVD
Singular value decomposition finds three factors $U$, $\Sigma$, and $V^T$ of a real matrix $M$, such that $M = U \Sigma V^T$, where $U$ and $V$ are orthogonal matrices and $\Sigma$ is a diagonal matrix whose diagonal entries are called singular values.
The singular values in the $\Sigma$ matrix are in descending order, i.e., if a matrix $\Sigma$ has diagonal entries $\sigma_1, \sigma_2, \ldots, \sigma_n$ from left to right, then

$$\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_n \ge 0.$$
For an $m \times n$ matrix $M$ with $n$ linearly independent columns (i.e., rank $n$), the dimensions of its full singular value decomposition matrices are:

$$\underbrace{M}_{m \times n} = \underbrace{U}_{m \times m}\,\underbrace{\Sigma}_{m \times n}\,\underbrace{V^T}_{n \times n}$$

However, even if $m \gg n$, there are only $n$ non-trivial singular values $\sigma_i$ in $\Sigma$, so the SVD of $M$ is equivalent to keeping only the first $n$ rows of $\Sigma$ and the first $n$ columns of $U$, since all other entries of the product of these three matrices contribute only $0$'s. Such an SVD is called the economy-sized decomposition.
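As a quick check of this bookkeeping, here is a minimal numpy sketch (a random matrix stands in for $M$; note that `np.linalg.svd` returns the singular values as a 1-D array in descending order):

```python
import numpy as np

# A tall matrix with many more rows than columns (m >> n)
m, n = 1000, 50
M = np.random.randn(m, n)

# Full SVD: U is m x m, S holds the n singular values, Vt is n x n
U, S, Vt = np.linalg.svd(M, full_matrices=True)
print(U.shape, S.shape, Vt.shape)      # (1000, 1000) (50,) (50, 50)

# Economy-sized SVD: only the first n columns of U are kept
Uh, Sh, Vth = np.linalg.svd(M, full_matrices=False)
print(Uh.shape, Sh.shape, Vth.shape)   # (1000, 50) (50,) (50, 50)

# Both versions reconstruct M up to floating-point error
assert np.allclose(Uh @ np.diag(Sh) @ Vth, M)
```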
In real-life scenarios, an image of a person's face can have millions of pixels given the development of photographic equipment. However, in that two-million-dimensional vector space, some directions contain much less information than others, so we would not want to analyze, compute, or train algorithms with every bit of information in the full space; otherwise the cost in time and memory grows with the amount of computation.
Singular value decomposition can be used to compress images by truncating the SVD matrices to lower dimensions. Since the components carrying the most important information come first, we can keep only the first $r$ rows and columns, i.e., truncate the SVD matrices to rank $r$, to compress an image while discarding its less important information.
I will use a picture of my cat as an example of such an approximation.
The components in this image include the cat ("mooncake"), its leash, part of a tree trunk, and snow. Now we can use SVD to compress this picture and see whether those pieces of information are preserved. SVD can be computed with the linear algebra package (`linalg`) under `numpy` in Python.
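Here is a minimal sketch of the computation, assuming the photo is saved as `mooncake.jpg` (a hypothetical filename):

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.image import imread

# Load the photo and average the RGB channels to get a grayscale matrix
A = imread('mooncake.jpg')             # hypothetical filename
X = np.mean(A, axis=-1)

# Economy-sized SVD of the image matrix
U, S, Vt = np.linalg.svd(X, full_matrices=False)

# Rank-r truncation: keep only the first r singular values/vectors
r = 100
Xr = U[:, :r] @ np.diag(S[:r]) @ Vt[:r, :]

plt.imshow(Xr, cmap='gray')
plt.title(f'rank-{r} approximation')
plt.axis('off')
plt.show()
```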
We can first plot the singular values to see how much information each one carries. As shown in figures 5 and 6, the first few dozen columns of the SVD matrices give us the majority of the information in the original picture.
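Continuing the sketch above, the spectrum can be plotted from the array `S` returned by the decomposition:

```python
# S is the singular-value array from the previous sketch.
# On a log scale, a steep early drop means the leading components
# carry most of the information.
plt.figure()
plt.semilogy(S)
plt.xlabel('component index')
plt.ylabel('singular value')

# Cumulative fraction of the total captured by the first k values
plt.figure()
plt.plot(np.cumsum(S) / np.sum(S))
plt.xlabel('number of singular values kept')
plt.ylabel('cumulative fraction')
plt.show()
```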
As shown in figure 6, when we truncate the SVD matrices by keeping only the first $r$ singular values, a very small $r$ reproduces an image of color blocks, in which we can identify an animal, maybe a rabbit, and an object on the right.
Moving on, the information we identified in the original picture can already be found with the first 20 components, and by keeping 100 singular values, the compressed image is quite clear.
Using Yale Face Database B
There are 38 human subjects in total in the Yale Face Database B and the Extended Yale Face Database B. The first 36 subjects can be used as a training set, and the last two faces can be used as a test set.
Every face image has $192 \times 168$ pixels, and the images are aligned so that eyebrows, eyes, noses, and mouths sit at similar positions across the pictures of the dataset. Each face image is then stretched into a column vector, and all face vectors are horizontally concatenated to construct a face matrix $X$.
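A sketch of this construction follows; random arrays stand in for the real aligned grayscale images, and the image count is a placeholder:

```python
import numpy as np

m, n = 192, 168                  # image height and width
n_images = 2304                  # hypothetical number of training images
# Placeholders standing in for the real aligned face images
faces = [np.random.rand(m, n) for _ in range(n_images)]

# Stretch each image into a column vector, then concatenate horizontally
X = np.column_stack([face.ravel() for face in faces])
print(X.shape)                   # (32256, 2304)
```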
To perform PCA, an average face needs to be calculated by taking the mean along every pixel dimension. The average face can be reshaped into an image after computation.
Then subtract the mean from the face matrix $X$ to recenter the data around the origin. We can name the resulting matrix $A$.
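In the same sketch, the mean face and the recentered matrix $A$ are:

```python
# Average face: the mean across all images (columns of X)
x_bar = np.mean(X, axis=1)

# The average face can be reshaped back into an image for display
avg_face = x_bar.reshape(m, n)

# Subtract the mean from every column to recenter the data
A = X - x_bar[:, np.newaxis]
```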
After finding the singular value decomposition of the matrix $A$, the column vectors of $U$ are eigenvectors of the covariance matrix of $A$ (proportional to $AA^T$); these are called eigenfaces.
The first 8 eigenfaces can be found by reshaping the first 8 columns of $U$ (`U[:, 0:8]`) into $192 \times 168$ matrices:
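Continuing the sketch, the eigenfaces come straight out of the SVD of $A$:

```python
import matplotlib.pyplot as plt

# Economy-sized SVD of the mean-subtracted face matrix
U, S, Vt = np.linalg.svd(A, full_matrices=False)

# Reshape the first 8 columns of U back into images
fig, axes = plt.subplots(2, 4, figsize=(9, 5))
for k, ax in enumerate(axes.ravel()):
    ax.imshow(U[:, k].reshape(m, n), cmap='gray')
    ax.set_title(f'eigenface {k + 1}')
    ax.axis('off')
plt.show()
```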
As shown in figure 10, the first principal component gives the general shape of a human face. As defined in PCA, the principal components are ordered in descending order of how much variance each one captures, which is why eigenface 1 looks like every human face at once: only the elliptical shapes of the face and eyes, plus hints of a nose and mouth. The columns after it capture finer features such as lips, eyebrows, shadows, and the muscles of the face.
We can project new faces that were not used in the training dataset onto the first $r$ eigenfaces to store the data with a lower volume of memory, computing the coefficients $\alpha = U_r^T (x - \bar{x})$, where $U_r$ is the matrix of the first $r$ columns of $U$, $x$ is the new face vector, and $\bar{x}$ is the average face. We can then use those $r$ eigenfaces as the basis of a linear combination, $\hat{x} = \bar{x} + U_r \alpha$, to approximate these new faces.
As we can see, for a moderately large $r$ the approximated picture is already good enough to reproduce the face of the test subject, so the computation cost is reduced by a large amount compared to the $192 \times 168 = 32{,}256$-dimensional vector space of the original images.
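A sketch of the projection and reconstruction, where the rank $r$ and the test vector are placeholders:

```python
# x_test: a flattened test face that was not used to compute U
x_test = np.random.rand(m * n)          # placeholder for a real test image
r = 400                                 # hypothetical truncation rank

alpha = U[:, :r].T @ (x_test - x_bar)   # r coefficients: the stored data
x_approx = x_bar + U[:, :r] @ alpha     # approximation of the test face
print(alpha.shape, x_approx.shape)      # (400,) (32256,)
```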
To differentiate people, for example persons 37 and 38, who were not in the training set, we can project them onto a plane spanned by two principal components, because there are only two people to choose from.
After plotting the singular values, figure 14 tells us by color that the first four singular values are much larger than the rest of them; thus, the first four columns of the matrix $U$ capture far more variation than the remaining columns. But that variation, dominated by what all faces share, results in a worse split between points of different classes (faces).
Hence, we can project the mean-subtracted 37th and 38th faces, after stretching them into column vectors, onto the plane constructed by eigenfaces 5 and 6, i.e., columns number 4 and 5 in $U$, since column numbering starts from $0$ in Python.
The coordinates of the points are calculated by $\alpha_j = u_j^T (x - \bar{x})$ for $j = 5, 6$, where $u_j$ is the $j$-th column of $U$, $x$ is a stretched face vector, and $\bar{x}$ is the average face.
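A sketch of this two-component projection; `P37` and `P38` are placeholders for the flattened test images of subjects 37 and 38, one image per column:

```python
# Placeholders standing in for the real flattened test images
P37 = np.random.rand(m * n, 64)
P38 = np.random.rand(m * n, 64)

pcs = [4, 5]    # 0-based column indices of eigenfaces 5 and 6
coords37 = U[:, pcs].T @ (P37 - x_bar[:, np.newaxis])
coords38 = U[:, pcs].T @ (P38 - x_bar[:, np.newaxis])

plt.scatter(coords37[0], coords37[1], marker='^', label='subject 37')
plt.scatter(coords38[0], coords38[1], marker='o', label='subject 38')
plt.xlabel('PC 5')
plt.ylabel('PC 6')
plt.legend()
plt.show()
```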
Here is the resulting plot.
After projection, figure 15 tells us that most of the data for subject 37 lie in the positive region of PC6, while subject 38's data are more negative along PC6 and have a larger standard deviation in principal component 6. From here, we can train algorithms to recognize these faces with more principal components included.
Conclusion
This is an overview of what an eigenface is, how SVD helps to construct it, and how it can be applied in image processing and face recognition. As I mentioned, this is only the fundamental step toward industrial applications, and various algorithms for preprocessing images are still being developed.
References:
Hu, C., Lu, X., Ye, M., & Zeng, W. (2017). Singular value decomposition and local near neighbors for face recognition under varying illumination. Pattern Recognition, 64, 60–83. https://doi.org/10.1016/j.patcog.2016.10.029
Brunton, S. (2020, January 19). Singular Value Decomposition [Video playlist]. YouTube. https://www.youtube.com/playlist?list=PLMrJAkhIeNNSVjnsviglFoY2nXildDCcv
Georghiades, A. S., Belhumeur, P. N., & Kriegman, D. J. (2001). From few to many: Illumination cone models for face recognition under variable lighting and pose. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(6), 643–660.