Abstract
Non-negative matrix factorization (NMF) has been widely employed in computer vision and pattern recognition because the learned bases can be interpreted as a natural parts-based representation of the input space, which is consistent with the psychological intuition of combining parts to form a whole. In this paper, we propose a novel constrained non-negative matrix factorization algorithm, called graph regularized discriminative non-negative matrix factorization (GDNMF), which incorporates into the NMF model both the intrinsic geometrical structure and the discriminative information that have been largely ignored in prior work. Specifically, the graph Laplacian and the supervised label information are jointly utilized to learn the projection matrix in the new model. We further provide the corresponding multiplicative update solutions for the optimization framework, together with a convergence proof. A series of experiments on several benchmark face datasets demonstrates the efficacy of the proposed GDNMF.
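For intuition, below is a minimal NumPy sketch of plain graph-regularized NMF multiplicative updates in the spirit of Cai et al. (2011). It is illustrative only: it omits the supervised (discriminative) term and projection matrix of GDNMF, whose exact updates are given in (19)–(21) of the paper, and the function name, toy data, and parameter choices are placeholders.

```python
import numpy as np

def gnmf(X, S, k, lam=0.1, n_iter=200, eps=1e-9, seed=0):
    """Graph-regularized NMF sketch (not the GDNMF updates).

    Minimizes ||X - WH||_F^2 + lam * tr(H L H^T), with L = D - S,
    via multiplicative updates.

    X : (m, n) non-negative data matrix (columns are samples)
    S : (n, n) symmetric non-negative affinity matrix of the sample graph
    k : number of basis vectors
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    W = rng.random((m, k))
    H = rng.random((k, n))
    D = np.diag(S.sum(axis=1))  # degree matrix of the graph
    for _ in range(n_iter):
        # standard NMF update for the basis matrix W
        W *= (X @ H.T) / (W @ H @ H.T + eps)
        # graph-regularized update for the coefficient matrix H
        H *= (W.T @ X + lam * H @ S) / (W.T @ W @ H + lam * H @ D + eps)
    return W, H

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.random((100, 40))                    # toy non-negative data
    S = rng.random((40, 40)); S = (S + S.T) / 2  # toy symmetric affinity graph
    W, H = gnmf(X, S, k=10)
    print("reconstruction error:", np.linalg.norm(X - W @ H))
```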
References
Belhumeur PN, Hespanha JP, Kriegman DJ (1997) Eigenfaces vs. fisherfaces: recognition using class specific linear projection. IEEE Trans Pattern Anal Mach Intell 19(7):711–720
Belkin M, Niyogi P (2001) Laplacian eigenmaps and spectral techniques for embedding and clustering. Adv Neural Inform Process Syst 14:585–591
Belkin M, Niyogi P (2003) Laplacian eigenmaps for dimensionality reduction and data representation. Neural Comput 15(6):1373–1396
Berry M, Browne M, Langville A, Pauca V, Plemmons R (2007) Algorithms and applications for approximate nonnegative matrix factorization. Comput Stat Data Anal 52(1):155–173
Brunet J-P, Tamayo P, Golub TR, Mesirov JP (2004) Metagenes and molecular pattern discovery using matrix factorization. Proc Natl Acad Sci 101(12):4164–4169
Cai D, He X, Han J, Huang T (2011) Graph regularized nonnegative matrix factorization for data representation. IEEE Trans Pattern Anal Mach Intell 33(8):1548–1560
Chung FRK (1997) Spectral graph theory (Conference Board of the Mathematical Sciences, Regional Conference Series in Mathematics, No. 92), American Mathematical Society
Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J Roy Stat Soc Ser B (Methodological) 39(1):1–38
Donoho DL, Stodden VC (2004) When does non-negative matrix factorization give a correct decomposition into parts? In: Advances in neural information processing systems 16: proceedings of the 2003 conference. MIT Press
Graham DB, Allinson NM (1998) Characterising virtual eigensignatures for general purpose face recognition. In: Face recognition. Springer, pp 446–456
Guan N, Tao D, Luo Z, Yuan B (2011) Manifold regularized discriminative nonnegative matrix factorization with fast gradient descent. IEEE Trans Image Process 20(7):2030–2048
Hadsell R, Chopra S, LeCun Y (2006) Dimensionality reduction by learning an invariant mapping. In: IEEE conference on computer vision and pattern recognition, vol 2, pp 1735–1742
Lee D, Seung H (1999) Learning the parts of objects by non-negative matrix factorization. Nature 401(6755):788–791
Lee D, Seung H (2001) Algorithms for non-negative matrix factorization. Adv Neural Inform Process Syst 13:556–562
Li S, Hou X, Zhang H, Cheng Q (2001) Learning spatially localized, parts-based representation. In: IEEE conference on computer vision and pattern recognition, vol 1, pp I–207
Logothetis N, Sheinberg D (1996) Visual object recognition. Ann Rev Neurosci 19(1):577–621
Palmer S (1977) Hierarchical structure in perceptual representation. Cognitive Psychol 9(4):441–474
Roweis S, Saul L (2000) Nonlinear dimensionality reduction by locally linear embedding. Science 290(5500):2323–2326
Samaria FS, Harter AC (1994) Parameterisation of a stochastic model for human face identification. In: Proceedings of the second IEEE workshop on applications of computer vision, pp 138–142
Sim T, Baker S, Bsat M (2003) The CMU pose, illumination, and expression database. IEEE Trans Pattern Anal Mach Intell 25(12):1615–1618
Tenenbaum J, De Silva V, Langford J (2000) A global geometric framework for nonlinear dimensionality reduction. Science 290(5500):2319–2323
Wachsmuth E, Oram M, Perrett D (1994) Recognition of objects and their component parts: responses of single units in the temporal cortex of the macaque. Cereb Cortex 4(5):509–522
Xu W, Liu X, Gong Y (2003) Document clustering based on non-negative matrix factorization. In: Proceedings of the 26th annual international ACM SIGIR conference on research and development in information retrieval, pp 267–273
Zhang Q, Li B (2010) Discriminative K-SVD for dictionary learning in face recognition. In: IEEE conference on computer vision and pattern recognition, pp 2691–2698
Acknowledgements
This work is supported in part by the National Basic Research Program of China (973 program) under Grant 2009CB320901, the National Natural Science Foundation of China under Grant 61272247, the National High Technology Research and Development Program of China (863 program) under Grant 2008AA02Z310, the European Union Seventh Framework Programme under Grant 247619, the Shanghai Committee of Science and Technology under Grant 08411951200, and the Innovation Ability Special Fund of Shanghai Jiao Tong University under Grant Z030026.
Appendix (Proof of Theorem)
To prove the Theorem, we need to show that \(O_1\) is non-increasing under the updating steps in (19), (20) and (21). When we update W, we fix H and A, so only the first term of \(O_1\) is involved; similarly, when we update A, we fix W and H, so only the third term of \(O_1\) is involved. Therefore, the update formulas for W and A in GDNMF are exactly the same as in the original NMF, and the convergence proof of NMF can be applied to show that \(O_1\) is non-increasing under the update steps in (20) and (21). The details can be found in [14].
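For reference, a multiplicative update of this type from [14], written in the generic notation \(X \approx WH\) of the Frobenius-norm NMF objective, reads

\[
W_{ia} \leftarrow W_{ia}\,\frac{(XH^{\top})_{ia}}{(WHH^{\top})_{ia}},
\]

and its non-increasing property is established in [14]. The updates (20) and (21) for W and A in GDNMF share this generic form; the coefficient matrix H instead follows the modified update (19), which is analyzed below.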
Hence, we only need to prove that \(O_1\) is non-increasing under the updating step in (19). We follow a procedure similar to that in [14]. Our proof makes use of an auxiliary function similar to the one used in the Expectation-Maximization (EM) algorithm [8]. We first give the definition of the auxiliary function.
Definition
\(G(h, h')\) is an auxiliary function of \(F(h)\) if the conditions \(G(h, h') \geq F(h)\) and \(G(h, h) = F(h)\) are satisfied.
This auxiliary function is important because of the following lemma.
Lemma 1
If G is an auxiliary function of F, then F is non-increasing under the update in (23), \(h^{(t+1)} = \arg\min_{h} G(h, h^{(t)})\).
Proof
\(F(h^{(t+1)}) \leq G(h^{(t+1)}, h^{(t)}) \leq G(h^{(t)}, h^{(t)}) = F(h^{(t)})\), where the first inequality uses \(G(h, h') \geq F(h)\) and the second uses the fact that \(h^{(t+1)}\) minimizes \(G(\cdot, h^{(t)})\). □
Now, we show that the update step for H in (19) is exactly the update in (23) with a proper auxiliary function.
Considering any element \(h_{ab}\) in H, we use \(F_{ab}\) to denote the part of \(O_1\) which is only relevant to \(h_{ab}\). It is easy to obtain the following derivatives.
Since our update is essentially element-wise, it is enough to show that each \(F_{ab}\) is non-increasing under the update step of (19). Consequently, we introduce the following lemma.
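To make the role of the auxiliary function concrete, recall the standard construction in [14]: a quadratic auxiliary function of the generic form below, with a suitably chosen positive coefficient \(K(h_{ab}^{(t)})\) that dominates the true second derivative, is minimized in closed form, and the minimizer is a rescaling of \(h_{ab}^{(t)}\). This is only the generic template, not the specific expression (26) used for GDNMF:

\[
G(h, h_{ab}^{(t)}) = F_{ab}(h_{ab}^{(t)}) + F'_{ab}(h_{ab}^{(t)})\,(h - h_{ab}^{(t)}) + \frac{K(h_{ab}^{(t)})}{2}\,(h - h_{ab}^{(t)})^{2},
\qquad
h^{(t+1)} = h_{ab}^{(t)} - \frac{F'_{ab}(h_{ab}^{(t)})}{K(h_{ab}^{(t)})}.
\]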
Lemma 2
Function
is an auxiliary function of \(F_{ab}\).
Proof
We only need to prove that \(G(h,h_{ab}^{(t)})\geq F_{ab}(h)\), since \(G(h,h) = F_{ab}(h)\) is obvious. To this end, we first consider the Taylor series expansion of \(F_{ab}(h)\).
Comparing (27) with (26), we find that \(G(h,h_{ab}^{(t)})\geq F_{ab}(h)\) is equivalent to
In fact, we have
and
Thus, (28) holds and \(G(h,h_{ab}^{(t)})\geq F_{ab}(h)\). □ We can now prove the convergence stated in the Theorem.
Proof of Theorem
Replacing \(G(h,h_{ab}^{(t)})\) in (23) by (26) results in the following update rule:
Since (26) is an auxiliary function, \(F_{ab}\) is non-increasing under this update rule. □