Abstract
Non-negative matrix factorization (NMF) has been widely employed in computer vision and pattern recognition because the learned bases can be interpreted as a natural parts-based representation of the input space, which is consistent with the psychological intuition of combining parts to form a whole. In this paper, we propose a novel constrained non-negative matrix factorization algorithm, called graph regularized discriminative non-negative matrix factorization (GDNMF), which incorporates into the NMF model both the intrinsic geometrical structure and the discriminative information that have been largely ignored in prior work. Specifically, the graph Laplacian and the supervised label information are jointly utilized to learn the projection matrix in the new model. We further provide the corresponding multiplicative update solutions for the optimization framework, together with a convergence proof. A series of experiments on several benchmark face datasets demonstrates the efficacy of the proposed GDNMF.
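For intuition, below is a minimal NumPy sketch of plain graph-regularized NMF multiplicative updates in the spirit of Cai et al. (2011). It is illustrative only: it omits the supervised (discriminative) term and projection matrix of GDNMF, whose exact updates are given in (19)–(21) of the paper, and the function name, toy data, and parameter choices are placeholders.

```python
import numpy as np

def gnmf(X, S, k, lam=0.1, n_iter=200, eps=1e-9, seed=0):
    """Graph-regularized NMF sketch (not the GDNMF updates).

    Minimizes ||X - WH||_F^2 + lam * tr(H L H^T), with L = D - S,
    via multiplicative updates.

    X : (m, n) non-negative data matrix (columns are samples)
    S : (n, n) symmetric non-negative affinity matrix of the sample graph
    k : number of basis vectors
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, k))
    H = rng.random((k, n))
    D = np.diag(S.sum(axis=1))  # degree matrix of the graph
    for _ in range(n_iter):
        # standard NMF update for the basis matrix W
        W *= (X @ H.T) / (W @ H @ H.T + eps)
        # graph-regularized update for the coefficient matrix H
        H *= (W.T @ X + lam * H @ S) / (W.T @ W @ H + lam * H @ D + eps)
    return W, H

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.random((100, 40))                    # toy non-negative data
    S = rng.random((40, 40)); S = (S + S.T) / 2  # toy symmetric affinity graph
    W, H = gnmf(X, S, k=10)
    print("reconstruction error:", np.linalg.norm(X - W @ H))
```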
References
Belhumeur PN, Hespanha JP, Kriegman DJ (1997) Eigenfaces vs. fisherfaces: recognition using class specific linear projection. IEEE Trans Pattern Anal Mach Intell 19(7):711–720
Belkin M, Niyogi P (2001) Laplacian eigenmaps and spectral techniques for embedding and clustering. Adv Neural Inform Process Syst 14:585–591
Belkin M, Niyogi P (2003) Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput 15(6):1373–1396
Berry M, Browne M, Langville A, Pauca V, Plemmons R (2007) Algorithms and applications for approximate nonnegative matrix factorization. Comput Stat Data Anal 52(1):155–173
Brunet J-P, Tamayo P, Golub TR, Mesirov JP (2004) Metagenes and molecular pattern discovery using matrix factorization. Proc Natl Acad Sci 101(12):4164–4169
Cai D, He X, Han J, Huang T (2011) Graph regularized nonnegative matrix factorization for data representation. IEEE Trans Pattern Anal Mach Intell 33(8):1548–1560
Chung FRK (1997) Spectral graph theory (Conference Board of the Mathematical Sciences, Regional Conference Series in Mathematics, No. 92), American Mathematical Society
Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J Roy Stat Soc Ser B (Methodological) 39(1):1–38
Donoho DL, Stodden VC (2004) When does non-negative matrix factorization give a correct decomposition into parts? In: Advances in neural information processing systems 16: proceedings of the 2003 conference. MIT Press
Graham DB, Allinson NM (1998) Characterising virtual eigensignatures for general purpose face recognition. In: Face recognition. Springer, pp 446–456
Guan N, Tao D, Luo Z, Yuan B (2011) Manifold regularized discriminative nonnegative matrix factorization with fast gradient descent. IEEE Trans Image Process 20(7):2030–2048
Hadsell R, Chopra S, LeCun Y (2006) Dimensionality reduction by learning an invariant mapping. In: IEEE conference on computer vision and pattern recognition, vol 2, pp 1735–1742
Lee D, Seung H (1999) Learning the parts of objects by non-negative matrix factorization. Nature 401(6755):788–791
Lee D, Seung H (2001) Algorithms for non-negative matrix factorization. Adv Neural Inform Process Syst 13:556–562
Li S, Hou X, Zhang H, Cheng Q (2001) Learning spatially localized, parts-based representation. In: IEEE conference on computer vision and pattern recognition, vol 1, pp I–207
Logothetis N, Sheinberg D (1996) Visual object recognition. Ann Rev Neurosci 19(1):577–621
Palmer S (1977) Hierarchical structure in perceptual representation. Cognitive Psychol 9(4):441–474
Roweis S, Saul L (2000) Nonlinear dimensionality reduction by locally linear embedding. Science 290(5500):2323–2326
Samaria FS, Harter AC (1994) Parameterisation of a stochastic model for human face identification. In: Proceedings of the second IEEE workshop on applications of computer vision, pp 138–142
Sim T, Baker S, Bsat M (2003) The CMU pose, illumination, and expression database. IEEE Trans Pattern Anal Mach Intell 25(12):1615–1618
Tenenbaum J, De Silva V, Langford J (2000) A global geometric framework for nonlinear dimensionality reduction. Science 290(5500):2319–2323
Wachsmuth E, Oram M, Perrett D (1994) Recognition of objects and their component parts: responses of single units in the temporal cortex of the macaque. Cereb Cortex 4(5):509–522
Xu W, Liu X, Gong Y (2003) Document clustering based on non-negative matrix factorization. In: Proceedings of the 26th annual international ACM SIGIR conference on research and development in information retrieval, pp 267–273
Zhang Q, Li B (2010) Discriminative K-SVD for dictionary learning in face recognition. In: IEEE conference on computer vision and pattern recognition, pp 2691–2698
Acknowledgements
This work is supported in part by the National Basic Research Program of China (973 program) under Grant 2009CB320901, the National Natural Science Foundation of China under Grant 61272247, the National High Technology Research and Development Program of China (863 program) under Grant 2008AA02Z310, the European Union Seventh Framework Programme under Grant 247619, the Shanghai Committee of Science and Technology under Grant 08411951200, and the Innovation Ability Special Fund of Shanghai Jiao Tong University under Grant Z030026.
Appendix (Proof of Theorem)
To prove the Theorem, we need to show that \(O_1\) is non-increasing under the updating steps in (19), (20) and (21). When we update W, we fix H and A, so only the first term of \(O_1\) is involved; similarly, when we update A, we fix W and H, so only the third term of \(O_1\) is involved. Therefore, the update formulas for W and A in GDNMF are exactly the same as in the original NMF, and the convergence proof of NMF can be applied to show that \(O_1\) is non-increasing under the update steps in (20) and (21). The details can be found in [14].
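For reference, a multiplicative update of this type from [14], written in the generic notation \(X \approx WH\) of the Frobenius-norm NMF objective, reads

\[
W_{ia} \leftarrow W_{ia}\,\frac{(XH^{\top})_{ia}}{(WHH^{\top})_{ia}},
\]

and its non-increasing property is established in [14]. The updates (20) and (21) for W and A in GDNMF share this generic form; the coefficient matrix H instead follows the modified update (19), which is analyzed below.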
Hence, we only need to prove that \(O_1\) is non-increasing under the updating step in (19). We follow a procedure similar to that in [14]. Our proof makes use of an auxiliary function similar to the one used in the Expectation-Maximization (EM) algorithm [8]. We first give the definition of the auxiliary function.
Definition
\(G(h, h')\) is an auxiliary function of \(F(h)\) if the conditions \(G(h, h') \geq F(h)\) and \(G(h, h) = F(h)\) are satisfied.
This auxiliary function is important because of the following lemma.
Lemma 1
If G is an auxiliary function of F, then F is non-increasing under the update in (23), \(h^{(t+1)} = \arg\min_{h} G(h, h^{(t)})\).
Proof
\(F(h^{(t+1)}) \leq G(h^{(t+1)}, h^{(t)}) \leq G(h^{(t)}, h^{(t)}) = F(h^{(t)})\), where the first inequality uses \(G(h, h') \geq F(h)\) and the second uses the fact that \(h^{(t+1)}\) minimizes \(G(\cdot, h^{(t)})\). □
Now, we show that the update step for H in (19) is exactly the update in (23) with a proper auxiliary function.
Considering any element \(h_{ab}\) in H, we use \(F_{ab}\) to denote the part of \(O_1\) which is only relevant to \(h_{ab}\). It is easy to obtain the following derivatives.
Since our update is essentially element-wise, it is enough to show that each \(F_{ab}\) is non-increasing under the update step of (19). Consequently, we introduce the following lemma.
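To make the role of the auxiliary function concrete, recall the standard construction in [14]: a quadratic auxiliary function of the generic form below, with a suitably chosen positive coefficient \(K(h_{ab}^{(t)})\) that dominates the true second derivative, is minimized in closed form, and the minimizer is a rescaling of \(h_{ab}^{(t)}\). This is only the generic template, not the specific expression (26) used for GDNMF:

\[
G(h, h_{ab}^{(t)}) = F_{ab}(h_{ab}^{(t)}) + F'_{ab}(h_{ab}^{(t)})\,(h - h_{ab}^{(t)}) + \frac{K(h_{ab}^{(t)})}{2}\,(h - h_{ab}^{(t)})^{2},
\qquad
h^{(t+1)} = h_{ab}^{(t)} - \frac{F'_{ab}(h_{ab}^{(t)})}{K(h_{ab}^{(t)})}.
\]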
Lemma 2
Function
is an auxiliary function of \(F_{ab}\).
Proof
We only need to prove that \(G(h,h_{ab}^{(t)})\geq F_{ab}(h)\), since \(G(h,h) = F_{ab}(h)\) is obvious. To this end, we first consider the Taylor series expansion of \(F_{ab}(h)\).
Comparing (27) with (26), we find that \(G(h,h_{ab}^{(t)})\geq F_{ab}(h)\) is equivalent to
In fact, we have
and
Thus, (28) holds and \(G(h,h_{ab}^{(t)})\geq F_{ab}(h)\). □ We can now prove the convergence stated in the Theorem.
Proof of Theorem
Replacing \(G(h,h_{ab}^{(t)})\) in (23) by (26) results in the following update rule:
Since (26) is an auxiliary function, \(F_{ab}\) is non-increasing under this update rule. □