Abstract
In this paper, we consider the symmetric multi-type non-negative matrix tri-factorization problem (SNMTF), which attempts to factorize several symmetric non-negative matrices simultaneously. This can be considered as a generalization of the classical non-negative matrix tri-factorization problem and includes a non-convex objective function which is a multivariate sixth degree polynomial and a has convex feasibility set. It has a special importance in data science, since it serves as a mathematical model for the fusion of different data sources in data clustering. We develop four methods to solve the SNMTF. They are based on four theoretical approaches known from the literature: the fixed point method (FPM), the block-coordinate descent with projected gradient (BCD), the gradient method with exact line search (GM-ELS) and the adaptive moment estimation method (ADAM). For each of these methods we offer a software implementation: for the former two methods we use Matlab and for the latter Python with the TensorFlow library. We test these methods on three data-sets: the synthetic data-set we generated, while the others represent real-life similarities between different objects. Extensive numerical results show that with sufficient computing time all four methods perform satisfactorily and ADAM most often yields the best mean square error (MSE). However, if the computation time is limited, FPM gives the best MSE because it shows the fastest convergence at the beginning. All data-sets and codes are publicly available on our GitLab profile.
Similar content being viewed by others
Data availability
All the data-sets used in this paper and the Matlab and Python codes for the algorithms presented in this paper are available on our GitLab portal [22].
References
Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G.S., Davis, A., Dean, J., Devin, M., Ghemawat, S., Goodfellow, I., Harp, A., Irving, G., Isard, M., Jia, Y., Jozefowicz, R., Kaiser, L., Kudlur, M., Levenberg, J., Mané, D., Monga, R., Moore, S., Murray, D., Olah, C., Schuster, M., Shlens, J., Steiner, B., Sutskever, I., Talwar, K., Tucker, P., Vanhoucke, V., Vasudevan, V., Viégas, F., Vinyals, O., Warden, P., Wattenberg, M., Wicke, M., Yu, Y., Zheng, X.: TensorFlow: Large-scale machine learning on heterogeneous systems (2015). http://tensorflow.org/. Software available from tensorflow.org
Altschul, S.F., Gish, W., Miller, W., Myers, E.W., Lipman, D.J.: Basic local alignment search tool. J. Mol. Biol. 215(3), 403–410 (1990)
Asadi, S., Povh, J.: A block coordinate descent-based projected gradient algorithm for orthogonal non-negative matrix factorization. Mathematics 9(5), 540 (2021)
Atwood, G.R., Foster, W.W.: Transformation of bounded variables in simplex optimization techniques. Ind. Eng. Chem. Process Des. Dev. 12(4), 485–486 (1973)
Bergstra, J., Bengio, Y.: Random search for hyper-parameter optimization. J. Mach. Learn. Res. 13, 281–305 (2012)
Bertsekas, D.: Nonlinear Programming. Athena scientific optimization and computation series. Athena Scientific (2016). https://books.google.si/books?id=TwOujgEACAAJ
Boutsidis, C., Gallopoulos, E.: Svd based initialization: a head start for nonnegative matrix factorization. Pattern Recogn. 41(4), 1350–1362 (2008)
Cichocki, A., Zdunek, R., Phan, A.H., Amari, S.I.: Nonnegative Matrix and Tensor Factorizations: Applications to Exploratory Multi-Way Data Analysis and Blind Source Separation. Wiley, London (2009)
Davis, D., Drusvyatskiy, D., Kakade, S., Lee, J.D.: Stochastic subgradient method converges on tame functions. Found. Comput. Math. 20(1), 119–154 (2020)
Del Buono, N., Pio, G.: Non-negative matrix tri-factorization for co-clustering: an analysis of the block matrix. Inf. Sci. 301, 13–26 (2015). https://doi.org/10.1016/j.ins.2014.12.058
Dickinson, P.J., Gijben, L.: On the computational complexity of membership problems for the completely positive cone and its dual. Comput. Optim. Appl. 57(2), 403–415 (2014)
Ding, C., Li, T., Peng, W., Park, H.: Orthogonal nonnegative matrix t-factorizations for clustering. In: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 126–135. ACM (2006)
Feng, S., Krim, H., Kogan, I.: 3D face recognition using euclidean integral invariants signature. In: IEEE/SP 14th Workshop on Statistical Signal Processing, 2007. SSP ’07, pp. 156–160 (2007)
Furini, F., Traversi, E., Belotti, P., Frangioni, A., Gleixner, A., Gould, N., Liberti, L., Lodi, A., Misener, R., Mittelmann, H., et al.: Qplib: a library of quadratic programming instances. Math. Program. Comput. 11(2), 237–265 (2019)
Gillis, N.: The why and how of nonnegative matrix factorization. In: Suykens, J., Signoretto, M., Argyriou, A. (eds.) Regularization, Optimization, Kernels, and Support Vector Machines, pp. 257–291. Chapman & Hall/CRC, New York (2015)
Gligorijević, V., Janjić, V., Pržulj, N.: Integration of molecular network data reconstructs gene ontology. Bioinformatics 30(17), i594–i600 (2014)
Gligorijević, V., Malod-Dognin, N., Pržulj, N.: Fuse: multiple network alignment via data fusion. Bioinformatics 32(8), 1195–1203 (2015)
Gligorijević, V., Malod-Dognin, N., Pržulj, N.: Integrative methods for analyzing big data in precision medicine. Proteomics 16(5), 741–758 (2016)
Gligorijević, V., Malod-Dognin, N., Pržulj, N.: Patient-specific data fusion for cancer stratification and personalised treatment. In: Proceedings of the Pacific Symposium Biocomputing, pp. 321–332. World Scientific (2016)
Ho, N.D.: Nonnegative matrix factorization algorithms and applications. Ph.D. thesis, Université catholique de Louvain (2008)
Hofmann, T., Buhmann, J.M.: Pairwise data clustering by deterministic annealing. IEEE Trans. Pattern Anal. Mach. Intell. 19(1), 1–14 (1997)
Hrga, T., Hribar, R., Povh, J.: Symmetric NMTF (2020). https://repo.ijs.si/hribarr/symmetric-nmtf
Huang, K., Sidiropoulos, N.D., Swami, A.: Non-negative matrix factorization revisited: uniqueness and algorithm for symmetric decomposition. IEEE Trans. Signal Process. 62(1), 211–224 (2013)
Jain, A.K., Zongker, D.: Representation and recognition of handwritten digits using deformable templates. IEEE Trans. Pattern Anal. Mach. Intell. 19(12), 1386–1391 (1997)
Kim, H., Park, H.: Nonnegative matrix factorization based on alternating nonnegativity constrained least squares and active set method. SIAM J. Matrix Anal. Appl. 30(2), 713–730 (2008)
Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
Lee, D.D., Seung, H.S.: Algorithms for non-negative matrix factorization. In: Advances in Neural Information Processing Systems, pp. 556–562 (2001)
Lin, C.J.: Projected gradient methods for nonnegative matrix factorization. Neural Comput. 19(10), 2756–2779 (2007)
Liu, K., Wang, H.: High-order co-clustering via strictly orthogonal and symmetric l1-norm nonnegative matrix tri-factorization. In: Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI-18, pp. 2454–2460. International Joint Conferences on Artificial Intelligence Organization (2018). https://doi.org/10.24963/ijcai.2018/340
Lu, S., Hong, M., Wang, Z.: A nonconvex splitting method for symmetric nonnegative matrix factorization: convergence analysis and optimality. IEEE Trans. Signal Process. 65(12), 3120–3135 (2017)
Ma, X., Dong, D.: Evolutionary nonnegative matrix factorization algorithms for community detection in dynamic networks. IEEE Trans. Knowl. Data Eng. 29(05), 1045–1058 (2017). https://doi.org/10.1109/TKDE.2017.2657752
Malick, J., Povh, J., Rendl, F., Wiegele, A.: Regularization methods for semidefinite programming. SIAM J. Optim. 20(1), 336–356 (2009)
Malod-Dognin, N., Petschnigg, J., Windels, S.F., Povh, J., Hemingway, H., Ketteler, R., Pržulj, N.: Towards a data-integrated cell. Nat. Commun. 10(1), 1–13 (2019)
MATLAB: 9.6.0.1072779 (R2019a). The MathWorks Inc., Natick, Massachusetts (2019)
Mirzal, A.: A convergent algorithm for orthogonal nonnegative matrix factorization. J. Comput. Appl. Math. 260, 149–166 (2014)
Mirzal, A.: A convergent algorithm for bi-orthogonal nonnegative matrix tri-factorization. arXiv preprint arXiv:1710.11478 (2017)
Obayashi, T., Kagaya, Y., Aoki, Y., Tadaka, S., Kinoshita, K.: Coxpresdb v7: a gene coexpression database for 11 animal species supported by 23 coexpression platforms for technical evaluation and evolutionary inference. Nucleic Acids Res. 47(D1), D55–D62 (2019)
Oughtred, R., Stark, C., Breitkreutz, B.J., Rust, J., Boucher, L., Chang, C., Kolas, N., Odonnell, L., Leung, G., McAdam, R., et al.: The biogrid interaction database: 2019 update. Nucleic Acids Res. 47(D1), D529–D541 (2019)
Park, S., Hwang, T.H.: Bayesian semi-nonnegative tri-matrix factorization to identify pathways associated with cancer types. arXiv preprint arXiv:1712.00520 (2017)
Park, S., Kar, N., Cheong, J.H., Hwang, T.H.: Bayesian semi-nonnegative matrix tri-factorization to identify pathways associated with cancer phenotypes. Pacific Symp. Biocomput. 2020, 427–438 (2020). https://doi.org/10.1142/9789811215636_0038
Philips, S., Pitton, J., Atlas, L.: Perceptual feature identification for active sonar echoes. Oceans 2006, 1–6 (2006)
Povh, J., Rendl, F., Wiegele, A.: A boundary point method to solve semidefinite programs. Computing 78(3), 277–286 (2006)
Pržulj, N., Malod-Dognin, N.: Network analytics in the age of big data. Science 353(6295), 123–124 (2016)
Reddi, S.J., Kale, S., Kumar, S.: On the convergence of Adam and beyond. arXiv preprint arXiv:1904.09237 (2019)
Rumelhart, D.E., Hinton, G.E., Williams, R.J.: Learning representations by back-propagating errors. Nature 323(6088), 533–536 (1986)
Saito, S., Hirata, Y., Sasahara, K., Suzuki, H.: Tracking time evolution of collective attention clusters in twitter: time evolving nonnegative matrix factorisation. PLOS ONE 10(9), 1–17 (2015). https://doi.org/10.1371/journal.pone.0139085
Schleif, F., Gisbrecht, A.: Data analysis of (non-)metric proximities at linear costs. In: E.R. Hancock, M. Pelillo (eds.) Similarity-Based Pattern Recognition-Second International Workshop, SIMBAD 2013, York, UK, July 3-5, 2013. Proceedings, Lecture Notes in Computer Science, vol. 7953, pp. 59–74. Springer (2013). https://doi.org/10.1007/978-3-642-39140-8_4
Someya, H., Yamamura, M.: A robust real-coded evolutionary algorithm with toroidal search space conversion. Soft. Comput. 9(4), 254–269 (2005)
Stanfill, C., Waltz, D.: Toward memory-based reasoning. ACM Commun. 29(12), 1213–1228 (1986)
Szklarczyk, D., Gable, A.L., Lyon, D., Junge, A., Wyder, S., Huerta-Cepas, J., Simonovic, M., Doncheva, N.T., Morris, J.H., Bork, P., et al.: String v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 47(D1), D607–D613 (2019)
Tseng, P.: Convergence of a block coordinate descent method for nondifferentiable minimization. J. Optim. Theory Appl. 109(3), 475–494 (2001)
Tsutsui, S.: Multi-parent recombination in genetic algorithms with search space boundary extension by mirroring. In: International Conference on Parallel Problem Solving from Nature, pp. 428–437. Springer (1998)
Vavasis, S.A.: On the complexity of nonnegative matrix factorization. SIAM J. Optim. 20(3), 1364–1377 (2009)
Čopar, A., Zupan, B., Žitnik, M.: Fast optimization of non-negative matrix tri-factorization. PLoS ONE 14(6), 1–15 (2019). https://doi.org/10.1371/journal.pone.0217994
Wang, F., Li, T., Zhang, C.: Semi-supervised clustering via matrix factorization. In: Proceedings of the 2008 SIAM International Conference on Data Mining, pp. 1–12. SIAM (2008)
Wang, F., Tong, H., Lin, C.: Towards evolutionary nonnegative matrix factorization. In: AAAI-11 / IAAI-11-Proceedings of the 25th AAAI Conference on Artificial Intelligence and the 23rd Innovative Applications of Artificial Intelligence Conference, pp. 501–506 (2011)
Wang, H., Huang, H., Ding, C.: Simultaneous clustering of multi-type relational data via symmetric nonnegative matrix tri-factorization. In: Proceedings of the 20th ACM international conference on Information and knowledge management, pp. 279–284. ACM (2011)
Wang, H., Huang, H., Ding, C., Nie, F.: Predicting protein-protein interactions from multimodal biological data sources via nonnegative matrix tri-factorization. J. Comput. Biol. 20(4), 344–358 (2013)
Wright, S.J.: Coordinate descent algorithms. Math. Program. 151(1), 3–34 (2015)
Yu, W., Wang, W., Jiao, P., Li, X.: Evolutionary clustering via graph regularized nonnegative matrix factorization for exploring temporal networks. Knowl.-Based Syst. 167, 1–10 (2019). https://doi.org/10.1016/j.knosys.2019.01.024
Žitnik, M., Janjić, V., Larminie, C., Zupan, B., Pržulj, N.: Discovering disease-disease associations by fusing systems-level molecular data. Sci. Rep. 3, 3202 (2013)
Acknowledgements
The authors acknowledge the financial support from the Slovenian Research Agency (research core funding No. P2-0098, and projects No. J1-8155, No. PR-07606, No. N1-0071), from the European Research Council (ERC) Consolidator Grant (grant number 770827) and from the Spanish State Research Agency AEI 10.13039/501100011033 (grant number PID2019-105500GB-I00).
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflicts of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Appendix
Appendix
1.1 A calculation of gradient
Here we offer a derivation of (24) and (25) presented in Sect. 4.1. We use notation \([X]_{\mu \nu }\) for component of matrix X in row \(\mu \) and column \(\nu \) and \(\delta _{ij}\) for the Kronecker delta. Objective function being differentiated is
where \(Z_i\) is defined in (23). Let us differentiate \(\mathrm {SE}\) with respect to a single component of matrix \({\tilde{S}}_i\).
This result can be written more compactly as
which proves (24).
Using the same steps for \({\tilde{G}}\), we obtain (25).
1.2 B Calculation of polynomial used in the exact line search
Here we show exactly how the polynomial (30) used by GM-ELS is calculated. This polynomial tells how \(\mathrm {SE}\) changes with respect to step size t when making a move in direction of gradient. For sake of brevity we use notation \(X^2=X\odot X\).
To calculate coefficients \(c_j\) we first express how matrix \(Z_i\) changes with respect to t.
where matrices \(A_{ij}\) can be calculated using simple expansion steps and are equal to
Knowing matrices \(A_{ij}\), the element-wise square of \(Z_i(t)\) is
with
equation (35) is an application of equivalence of polynomial multiplication and discrete convolution. Therefore, the sum is over all r values that lead to legal indices for both \(A_{ir}\) and \(A_{i,j-r}\) which are all r between \(\max (0, j-6)\) and \(\min (j, 6)\). Remembering
we can now express
Calculation of coefficients \(c_j\) requires far higher number of matrix multiplications compared to the number needed for gradient calculation. When implemented in TensorFlow calculation of \(c_j\) is approximately 7 times more expensive than gradient calculation.
Rights and permissions
About this article
Cite this article
Hribar, R., Hrga, T., Papa, G. et al. Four algorithms to solve symmetric multi-type non-negative matrix tri-factorization problem. J Glob Optim 82, 283–312 (2022). https://doi.org/10.1007/s10898-021-01074-3
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10898-021-01074-3