Four algorithms to solve symmetric multi-type non-negative matrix tri-factorization problem

Published in: Journal of Global Optimization

Abstract

In this paper, we consider the symmetric multi-type non-negative matrix tri-factorization problem (SNMTF), which attempts to factorize several symmetric non-negative matrices simultaneously. This can be considered a generalization of the classical non-negative matrix tri-factorization problem; it has a non-convex objective function, a multivariate polynomial of degree six, and a convex feasible set. It is of special importance in data science, since it serves as a mathematical model for the fusion of different data sources in data clustering. We develop four methods to solve the SNMTF, based on four theoretical approaches known from the literature: the fixed point method (FPM), block-coordinate descent with projected gradient (BCD), the gradient method with exact line search (GM-ELS) and the adaptive moment estimation method (ADAM). For each of these methods we offer a software implementation: for the former two we use Matlab, and for the latter two Python with the TensorFlow library. We test the methods on three data-sets: one synthetic data-set that we generated and two data-sets representing real-life similarities between different objects. Extensive numerical results show that, given sufficient computing time, all four methods perform satisfactorily, and ADAM most often yields the best mean square error (MSE). However, if the computation time is limited, FPM gives the best MSE because it converges fastest at the beginning. All data-sets and codes are publicly available on our GitLab profile.
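
As a minimal, illustrative sketch of the objective described above (the variable names and the absence of the paper's constraints and parametrization are our simplifications, not the authors' implementation), the squared error being minimized is:

```python
import numpy as np

def snmtf_objective(R, G, S):
    """Squared error of the symmetric multi-type tri-factorization:
    SE = sum_i || R_i - G S_i G^T ||_F^2, with G and each S_i non-negative
    and each R_i symmetric. Names are illustrative."""
    return sum(np.linalg.norm(Ri - G @ Si @ G.T, 'fro') ** 2
               for Ri, Si in zip(R, S))
```

An exact factorization gives SE = 0; the four algorithms compared in the paper differ in how they drive this quantity down subject to non-negativity.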


Data availability

All data-sets used in this paper, together with the Matlab and Python codes for the presented algorithms, are available on our GitLab portal [22].

References

  1. Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G.S., Davis, A., Dean, J., Devin, M., Ghemawat, S., Goodfellow, I., Harp, A., Irving, G., Isard, M., Jia, Y., Jozefowicz, R., Kaiser, L., Kudlur, M., Levenberg, J., Mané, D., Monga, R., Moore, S., Murray, D., Olah, C., Schuster, M., Shlens, J., Steiner, B., Sutskever, I., Talwar, K., Tucker, P., Vanhoucke, V., Vasudevan, V., Viégas, F., Vinyals, O., Warden, P., Wattenberg, M., Wicke, M., Yu, Y., Zheng, X.: TensorFlow: Large-scale machine learning on heterogeneous systems (2015). http://tensorflow.org/. Software available from tensorflow.org

  2. Altschul, S.F., Gish, W., Miller, W., Myers, E.W., Lipman, D.J.: Basic local alignment search tool. J. Mol. Biol. 215(3), 403–410 (1990)

  3. Asadi, S., Povh, J.: A block coordinate descent-based projected gradient algorithm for orthogonal non-negative matrix factorization. Mathematics 9(5), 540 (2021)

  4. Atwood, G.R., Foster, W.W.: Transformation of bounded variables in simplex optimization techniques. Ind. Eng. Chem. Process Des. Dev. 12(4), 485–486 (1973)

  5. Bergstra, J., Bengio, Y.: Random search for hyper-parameter optimization. J. Mach. Learn. Res. 13, 281–305 (2012)

  6. Bertsekas, D.: Nonlinear Programming. Athena scientific optimization and computation series. Athena Scientific (2016). https://books.google.si/books?id=TwOujgEACAAJ

  7. Boutsidis, C., Gallopoulos, E.: SVD-based initialization: a head start for nonnegative matrix factorization. Pattern Recogn. 41(4), 1350–1362 (2008)

  8. Cichocki, A., Zdunek, R., Phan, A.H., Amari, S.I.: Nonnegative Matrix and Tensor Factorizations: Applications to Exploratory Multi-Way Data Analysis and Blind Source Separation. Wiley, London (2009)

  9. Davis, D., Drusvyatskiy, D., Kakade, S., Lee, J.D.: Stochastic subgradient method converges on tame functions. Found. Comput. Math. 20(1), 119–154 (2020)

  10. Del Buono, N., Pio, G.: Non-negative matrix tri-factorization for co-clustering: an analysis of the block matrix. Inf. Sci. 301, 13–26 (2015). https://doi.org/10.1016/j.ins.2014.12.058

  11. Dickinson, P.J., Gijben, L.: On the computational complexity of membership problems for the completely positive cone and its dual. Comput. Optim. Appl. 57(2), 403–415 (2014)

  12. Ding, C., Li, T., Peng, W., Park, H.: Orthogonal nonnegative matrix t-factorizations for clustering. In: Proceedings of the 12th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 126–135. ACM (2006)

  13. Feng, S., Krim, H., Kogan, I.: 3D face recognition using euclidean integral invariants signature. In: IEEE/SP 14th Workshop on Statistical Signal Processing, 2007. SSP ’07, pp. 156–160 (2007)

  14. Furini, F., Traversi, E., Belotti, P., Frangioni, A., Gleixner, A., Gould, N., Liberti, L., Lodi, A., Misener, R., Mittelmann, H., et al.: QPLIB: a library of quadratic programming instances. Math. Program. Comput. 11(2), 237–265 (2019)

  15. Gillis, N.: The why and how of nonnegative matrix factorization. In: Suykens, J., Signoretto, M., Argyriou, A. (eds.) Regularization, Optimization, Kernels, and Support Vector Machines, pp. 257–291. Chapman & Hall/CRC, New York (2015)

  16. Gligorijević, V., Janjić, V., Pržulj, N.: Integration of molecular network data reconstructs gene ontology. Bioinformatics 30(17), i594–i600 (2014)

  17. Gligorijević, V., Malod-Dognin, N., Pržulj, N.: Fuse: multiple network alignment via data fusion. Bioinformatics 32(8), 1195–1203 (2015)

  18. Gligorijević, V., Malod-Dognin, N., Pržulj, N.: Integrative methods for analyzing big data in precision medicine. Proteomics 16(5), 741–758 (2016)

  19. Gligorijević, V., Malod-Dognin, N., Pržulj, N.: Patient-specific data fusion for cancer stratification and personalised treatment. In: Proceedings of the Pacific Symposium Biocomputing, pp. 321–332. World Scientific (2016)

  20. Ho, N.D.: Nonnegative matrix factorization algorithms and applications. Ph.D. thesis, Université catholique de Louvain (2008)

  21. Hofmann, T., Buhmann, J.M.: Pairwise data clustering by deterministic annealing. IEEE Trans. Pattern Anal. Mach. Intell. 19(1), 1–14 (1997)

  22. Hrga, T., Hribar, R., Povh, J.: Symmetric NMTF (2020). https://repo.ijs.si/hribarr/symmetric-nmtf

  23. Huang, K., Sidiropoulos, N.D., Swami, A.: Non-negative matrix factorization revisited: uniqueness and algorithm for symmetric decomposition. IEEE Trans. Signal Process. 62(1), 211–224 (2013)

  24. Jain, A.K., Zongker, D.: Representation and recognition of handwritten digits using deformable templates. IEEE Trans. Pattern Anal. Mach. Intell. 19(12), 1386–1391 (1997)

  25. Kim, H., Park, H.: Nonnegative matrix factorization based on alternating nonnegativity constrained least squares and active set method. SIAM J. Matrix Anal. Appl. 30(2), 713–730 (2008)

  26. Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)

  27. Lee, D.D., Seung, H.S.: Algorithms for non-negative matrix factorization. In: Advances in Neural Information Processing Systems, pp. 556–562 (2001)

  28. Lin, C.J.: Projected gradient methods for nonnegative matrix factorization. Neural Comput. 19(10), 2756–2779 (2007)

  29. Liu, K., Wang, H.: High-order co-clustering via strictly orthogonal and symmetric l1-norm nonnegative matrix tri-factorization. In: Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, IJCAI-18, pp. 2454–2460. International Joint Conferences on Artificial Intelligence Organization (2018). https://doi.org/10.24963/ijcai.2018/340

  30. Lu, S., Hong, M., Wang, Z.: A nonconvex splitting method for symmetric nonnegative matrix factorization: convergence analysis and optimality. IEEE Trans. Signal Process. 65(12), 3120–3135 (2017)

  31. Ma, X., Dong, D.: Evolutionary nonnegative matrix factorization algorithms for community detection in dynamic networks. IEEE Trans. Knowl. Data Eng. 29(05), 1045–1058 (2017). https://doi.org/10.1109/TKDE.2017.2657752

  32. Malick, J., Povh, J., Rendl, F., Wiegele, A.: Regularization methods for semidefinite programming. SIAM J. Optim. 20(1), 336–356 (2009)

  33. Malod-Dognin, N., Petschnigg, J., Windels, S.F., Povh, J., Hemingway, H., Ketteler, R., Pržulj, N.: Towards a data-integrated cell. Nat. Commun. 10(1), 1–13 (2019)

  34. MATLAB: 9.6.0.1072779 (R2019a). The MathWorks Inc., Natick, Massachusetts (2019)

  35. Mirzal, A.: A convergent algorithm for orthogonal nonnegative matrix factorization. J. Comput. Appl. Math. 260, 149–166 (2014)

  36. Mirzal, A.: A convergent algorithm for bi-orthogonal nonnegative matrix tri-factorization. arXiv preprint arXiv:1710.11478 (2017)

  37. Obayashi, T., Kagaya, Y., Aoki, Y., Tadaka, S., Kinoshita, K.: Coxpresdb v7: a gene coexpression database for 11 animal species supported by 23 coexpression platforms for technical evaluation and evolutionary inference. Nucleic Acids Res. 47(D1), D55–D62 (2019)

  38. Oughtred, R., Stark, C., Breitkreutz, B.J., Rust, J., Boucher, L., Chang, C., Kolas, N., Odonnell, L., Leung, G., McAdam, R., et al.: The biogrid interaction database: 2019 update. Nucleic Acids Res. 47(D1), D529–D541 (2019)

  39. Park, S., Hwang, T.H.: Bayesian semi-nonnegative tri-matrix factorization to identify pathways associated with cancer types. arXiv preprint arXiv:1712.00520 (2017)

  40. Park, S., Kar, N., Cheong, J.H., Hwang, T.H.: Bayesian semi-nonnegative matrix tri-factorization to identify pathways associated with cancer phenotypes. Pacific Symp. Biocomput. 2020, 427–438 (2020). https://doi.org/10.1142/9789811215636_0038

  41. Philips, S., Pitton, J., Atlas, L.: Perceptual feature identification for active sonar echoes. Oceans 2006, 1–6 (2006)

  42. Povh, J., Rendl, F., Wiegele, A.: A boundary point method to solve semidefinite programs. Computing 78(3), 277–286 (2006)

  43. Pržulj, N., Malod-Dognin, N.: Network analytics in the age of big data. Science 353(6295), 123–124 (2016)

  44. Reddi, S.J., Kale, S., Kumar, S.: On the convergence of Adam and beyond. arXiv preprint arXiv:1904.09237 (2019)

  45. Rumelhart, D.E., Hinton, G.E., Williams, R.J.: Learning representations by back-propagating errors. Nature 323(6088), 533–536 (1986)

  46. Saito, S., Hirata, Y., Sasahara, K., Suzuki, H.: Tracking time evolution of collective attention clusters in twitter: time evolving nonnegative matrix factorisation. PLOS ONE 10(9), 1–17 (2015). https://doi.org/10.1371/journal.pone.0139085

  47. Schleif, F., Gisbrecht, A.: Data analysis of (non-)metric proximities at linear costs. In: E.R. Hancock, M. Pelillo (eds.) Similarity-Based Pattern Recognition-Second International Workshop, SIMBAD 2013, York, UK, July 3-5, 2013. Proceedings, Lecture Notes in Computer Science, vol. 7953, pp. 59–74. Springer (2013). https://doi.org/10.1007/978-3-642-39140-8_4

  48. Someya, H., Yamamura, M.: A robust real-coded evolutionary algorithm with toroidal search space conversion. Soft. Comput. 9(4), 254–269 (2005)

  49. Stanfill, C., Waltz, D.: Toward memory-based reasoning. Commun. ACM 29(12), 1213–1228 (1986)

  50. Szklarczyk, D., Gable, A.L., Lyon, D., Junge, A., Wyder, S., Huerta-Cepas, J., Simonovic, M., Doncheva, N.T., Morris, J.H., Bork, P., et al.: STRING v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 47(D1), D607–D613 (2019)

  51. Tseng, P.: Convergence of a block coordinate descent method for nondifferentiable minimization. J. Optim. Theory Appl. 109(3), 475–494 (2001)

  52. Tsutsui, S.: Multi-parent recombination in genetic algorithms with search space boundary extension by mirroring. In: International Conference on Parallel Problem Solving from Nature, pp. 428–437. Springer (1998)

  53. Vavasis, S.A.: On the complexity of nonnegative matrix factorization. SIAM J. Optim. 20(3), 1364–1377 (2009)

  54. Čopar, A., Zupan, B., Žitnik, M.: Fast optimization of non-negative matrix tri-factorization. PLoS ONE 14(6), 1–15 (2019). https://doi.org/10.1371/journal.pone.0217994

  55. Wang, F., Li, T., Zhang, C.: Semi-supervised clustering via matrix factorization. In: Proceedings of the 2008 SIAM International Conference on Data Mining, pp. 1–12. SIAM (2008)

  56. Wang, F., Tong, H., Lin, C.: Towards evolutionary nonnegative matrix factorization. In: AAAI-11 / IAAI-11-Proceedings of the 25th AAAI Conference on Artificial Intelligence and the 23rd Innovative Applications of Artificial Intelligence Conference, pp. 501–506 (2011)

  57. Wang, H., Huang, H., Ding, C.: Simultaneous clustering of multi-type relational data via symmetric nonnegative matrix tri-factorization. In: Proceedings of the 20th ACM international conference on Information and knowledge management, pp. 279–284. ACM (2011)

  58. Wang, H., Huang, H., Ding, C., Nie, F.: Predicting protein-protein interactions from multimodal biological data sources via nonnegative matrix tri-factorization. J. Comput. Biol. 20(4), 344–358 (2013)

  59. Wright, S.J.: Coordinate descent algorithms. Math. Program. 151(1), 3–34 (2015)

  60. Yu, W., Wang, W., Jiao, P., Li, X.: Evolutionary clustering via graph regularized nonnegative matrix factorization for exploring temporal networks. Knowl.-Based Syst. 167, 1–10 (2019). https://doi.org/10.1016/j.knosys.2019.01.024

  61. Žitnik, M., Janjić, V., Larminie, C., Zupan, B., Pržulj, N.: Discovering disease-disease associations by fusing systems-level molecular data. Sci. Rep. 3, 3202 (2013)

Acknowledgements

The authors acknowledge the financial support from the Slovenian Research Agency (research core funding No. P2-0098, and projects No. J1-8155, No. PR-07606, No. N1-0071), from the European Research Council (ERC) Consolidator Grant (grant number 770827) and from the Spanish State Research Agency AEI 10.13039/501100011033 (grant number PID2019-105500GB-I00).

Author information

Correspondence to Rok Hribar.

Ethics declarations

Conflicts of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

A Calculation of the gradient

Here we offer a derivation of (24) and (25) presented in Sect. 4.1. We write \([X]_{\mu \nu }\) for the component of matrix \(X\) in row \(\mu \) and column \(\nu \), and \(\delta _{ij}\) for the Kronecker delta. The objective function being differentiated is

$$\begin{aligned}&\mathrm {SE}= \sum _{i=1}^N \Vert R_i - f({\tilde{G}})f({\tilde{S}}_i)f({\tilde{G}})^\top \Vert ^2 = \sum _{i=1}^N \Vert Z_i\Vert ^2 \\&\quad = \sum _{i=1}^N \sum _{\mu ,\nu } \left[ Z_i\right] _{\mu \nu }^2, \\ \end{aligned}$$

where \(Z_i\) is defined in (23). Let us differentiate \(\mathrm {SE}\) with respect to a single component of the matrix \({\tilde{S}}_i\):

$$\begin{aligned} \frac{\partial \mathrm {SE}}{\partial [{\tilde{S}}_i]_{\rho \sigma }}&= \sum _{i=1}^N \sum _{\mu ,\nu }\frac{\partial [Z_i]_{\mu \nu }^2}{\partial [{\tilde{S}}_i]_{\rho \sigma }} = -2\sum _{i=1}^N \sum _{\mu ,\nu }[Z_i]_{\mu \nu }\frac{\partial \left[ f({\tilde{G}})f({\tilde{S}}_i)f({\tilde{G}})^\top \right] _{\mu \nu }}{\partial [{\tilde{S}}_i]_{\rho \sigma }} \\&= -2\sum _{i=1}^N\sum _{\mu ,\nu }[Z_i]_{\mu \nu }\frac{\partial }{\partial [{\tilde{S}}_i]_{\rho \sigma }} \sum _{p,r}f\left( [{\tilde{G}}]_{\mu p}\right) f\left( [{\tilde{S}}_i]_{pr}\right) f\left( [{\tilde{G}}]_{\nu r}\right) \\&= -2\sum _{i=1}^N\sum _{\mu ,\nu }[Z_i]_{\mu \nu } \sum _{p,r}f\left( [{\tilde{G}}]_{\mu p}\right) f\left( [{\tilde{G}}]_{\nu r}\right) f'\left( [{\tilde{S}}_i]_{pr}\right) \delta _{\rho p}\delta _{\sigma r} \\&= -2\sum _{i=1}^N\sum _{\mu ,\nu }[Z_i]_{\mu \nu } f\left( [{\tilde{G}}]_{\mu \rho }\right) f\left( [{\tilde{G}}]_{\nu \sigma }\right) f'\left( [{\tilde{S}}_i]_{\rho \sigma }\right) \\&= -2\sum _{i=1}^Nf'\left( [{\tilde{S}}_i]_{\rho \sigma }\right) \left[ f({\tilde{G}})^TZ_if({\tilde{G}})\right] _{\rho \sigma } \end{aligned}$$

This result can be written more compactly as

$$\begin{aligned} \varDelta S_i~=~\nabla _{{\tilde{S}}_i}\mathrm {SE}= -2\sum _{i=1}^Nf'({\tilde{S}}_i)\odot \left( f({\tilde{G}})^TZ_if({\tilde{G}})\right) , \end{aligned}$$

which proves (24).

Using the same steps for \({\tilde{G}}\), we obtain (25).
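
As a sanity check of (24), the formula can be evaluated slice by slice in a few lines of NumPy. This sketch is our own, not the authors' code, and it assumes that the transformation \(f\) is the element-wise square \(f(X)=X\odot X\) (consistent with the notation \(X^2=X\odot X\) used in Appendix B), so that \(f'(X)=2X\):

```python
import numpy as np

def f(X):       # element-wise square, matching the notation X^2 = X ⊙ X
    return X * X

def fprime(X):  # element-wise derivative of f
    return 2.0 * X

def grad_S(R, G, S):
    """Per-matrix gradient of SE w.r.t. S_i, following (24):
    -2 f'(S_i) ⊙ (f(G)^T Z_i f(G)), with Z_i = R_i - f(G) f(S_i) f(G)^T."""
    FG = f(G)
    grads = []
    for Ri, Si in zip(R, S):
        Zi = Ri - FG @ f(Si) @ FG.T
        grads.append(-2.0 * fprime(Si) * (FG.T @ Zi @ FG))
    return grads
```

At an exact factorization every \(Z_i\) vanishes, so the gradient is zero, which gives a quick correctness test; away from a solution the formula agrees with finite differences.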

B Calculation of the polynomial used in the exact line search

Here we show exactly how the polynomial (30) used by GM-ELS is calculated. This polynomial describes how \(\mathrm {SE}\) changes with the step size \(t\) when moving in the direction of the gradient. For the sake of brevity we use the notation \(X^2=X\odot X\).

$$\begin{aligned} p(t)=\sum _{i=1}^N\Vert R_i-({\tilde{G}}+t\varDelta {\tilde{G}})^2({\tilde{S}}_i+t\varDelta {\tilde{S}}_i)^2({\tilde{G}}+t\varDelta {\tilde{G}})^{2\top }\Vert ^2 =\sum _{j=0}^{12} c_jt^j \end{aligned}$$

To calculate the coefficients \(c_j\), we first express how the matrix \(Z_i\) changes with respect to \(t\):

$$\begin{aligned} Z_i(t)=R_i-({\tilde{G}}+t\varDelta {\tilde{G}})^2({\tilde{S}}_i+t\varDelta {\tilde{S}}_i)^2({\tilde{G}}+t\varDelta {\tilde{G}})^{2\top }=\sum _{j=0}^6 A_{ij}t^j, \end{aligned}$$

where the matrices \(A_{ij}\) can be calculated by simple expansion and are equal to

$$\begin{aligned} A_{i0}&=R_i-{\tilde{G}}^2{\tilde{S}}_i^2{\tilde{G}}^{2\top }\\ A_{i1}&=-2{\tilde{G}}^2{\tilde{S}}_i^2({\tilde{G}}\odot \varDelta {\tilde{G}})^\top -2{\tilde{G}}^2({\tilde{S}}_i\odot \varDelta {\tilde{S}}_i){\tilde{G}}^{2\top }-2({\tilde{G}}\odot \varDelta {\tilde{G}}){\tilde{S}}_i^2{\tilde{G}}^{2\top }\\ A_{i2}&=-{\tilde{G}}^2{\tilde{S}}_i^2\varDelta {\tilde{G}}^{2\top }-4{\tilde{G}}^2({\tilde{S}}_i\odot \varDelta {\tilde{S}}_i)({\tilde{G}}\odot \varDelta {\tilde{G}})^\top -{\tilde{G}}^2\varDelta {\tilde{S}}_i^2{\tilde{G}}^{2\top }-\varDelta {\tilde{G}}^2{\tilde{S}}_i^2{\tilde{G}}^{2\top }\\&\quad -4({\tilde{G}}\odot \varDelta {\tilde{G}})({\tilde{S}}_i\odot \varDelta {\tilde{S}}_i){\tilde{G}}^{2\top }-4({\tilde{G}}\odot \varDelta {\tilde{G}}){\tilde{S}}_i^2({\tilde{G}}\odot \varDelta {\tilde{G}})^\top \\ A_{i3}&=-2{\tilde{G}}^2({\tilde{S}}_i\odot \varDelta {\tilde{S}}_i)\varDelta {\tilde{G}}^{2\top }-2{\tilde{G}}^2\varDelta {\tilde{S}}_i^2({\tilde{G}}\odot \varDelta {\tilde{G}})^\top -2({\tilde{G}}\odot \varDelta {\tilde{G}}){\tilde{S}}_i^2\varDelta {\tilde{G}}^{2\top }\\&\quad -8({\tilde{G}}\odot \varDelta {\tilde{G}})({\tilde{S}}_i\odot \varDelta {\tilde{S}}_i)({\tilde{G}}\odot \varDelta {\tilde{G}})^\top -2({\tilde{G}}\odot \varDelta {\tilde{G}})\varDelta {\tilde{S}}_i^2{\tilde{G}}^{2\top }\\&\quad -2\varDelta {\tilde{G}}^2{\tilde{S}}_i^2({\tilde{G}}\odot \varDelta {\tilde{G}})^\top -2\varDelta {\tilde{G}}^2({\tilde{S}}_i\odot \varDelta {\tilde{S}}_i){\tilde{G}}^{2\top }\\ A_{i4}&=-{\tilde{G}}^2\varDelta {\tilde{S}}_i^2\varDelta {\tilde{G}}^{2\top }-4({\tilde{G}}\odot \varDelta {\tilde{G}})({\tilde{S}}_i\odot \varDelta {\tilde{S}}_i)\varDelta {\tilde{G}}^{2\top }-4({\tilde{G}}\odot \varDelta {\tilde{G}})\varDelta {\tilde{S}}_i^2({\tilde{G}}\odot \varDelta {\tilde{G}})^\top \\&\quad -\varDelta {\tilde{G}}^2{\tilde{S}}_i^2\varDelta {\tilde{G}}^{2\top }-4\varDelta {\tilde{G}}^2({\tilde{S}}_i\odot \varDelta {\tilde{S}}_i)({\tilde{G}}\odot 
\varDelta {\tilde{G}})^\top -\varDelta {\tilde{G}}^2\varDelta {\tilde{S}}_i^2{\tilde{G}}^{2\top }\\ A_{i5}&=-2({\tilde{G}}\odot \varDelta {\tilde{G}})\varDelta {\tilde{S}}_i^2\varDelta {\tilde{G}}^{2\top }-2\varDelta {\tilde{G}}^2({\tilde{S}}_i\odot \varDelta {\tilde{S}}_i)\varDelta {\tilde{G}}^{2\top }-2\varDelta {\tilde{G}}^2\varDelta {\tilde{S}}_i^2({\tilde{G}}\odot \varDelta {\tilde{G}})^\top \\ A_{i6}&=-\varDelta {\tilde{G}}^2\varDelta {\tilde{S}}_i^2\varDelta {\tilde{G}}^{2\top }. \end{aligned}$$

Knowing the matrices \(A_{ij}\), the element-wise square of \(Z_i(t)\) is

$$\begin{aligned} Z_i^2(t)=\sum _{j=0}^{12}B_{ij}t^j \end{aligned}$$

with

$$\begin{aligned} B_{ij}=\sum _rA_{ir}\odot A_{i,j-r}. \end{aligned}$$
(35)

Equation (35) is an application of the equivalence between polynomial multiplication and discrete convolution. The sum therefore runs over all values of \(r\) that give legal indices for both \(A_{ir}\) and \(A_{i,j-r}\), i.e., all \(r\) between \(\max (0, j-6)\) and \(\min (j, 6)\). Recalling that

$$\begin{aligned} p(t)=\sum _{i=1}^N\sum _{\mu ,\nu }\left[ Z_i^2(t)\right] _{\mu \nu }=\sum _{j=0}^{12} c_jt^j \end{aligned}$$

we can now express

$$\begin{aligned} c_j=\sum _{i=1}^N\sum _{\mu ,\nu }\left[ B_{ij}\right] _{\mu \nu }. \end{aligned}$$
(36)
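
The convolution in (35) and the summation in (36) can be combined into one routine. The following NumPy sketch is ours (the paper's implementation is in TensorFlow); it takes, for each \(i\), the seven matrices \(A_{i0},\dots ,A_{i6}\) and returns the thirteen coefficients \(c_0,\dots ,c_{12}\):

```python
import numpy as np

def line_search_coeffs(A_list):
    """Coefficients c_0..c_12 of p(t) from Eqs. (35)-(36): for each i,
    B_ij = sum_r A_ir ⊙ A_i,j-r (a discrete convolution, since squaring a
    polynomial convolves its coefficients), then c_j = sum_i sum_entries B_ij.
    A_list[i] holds the seven matrices A_i0..A_i6."""
    d = 6
    c = np.zeros(2 * d + 1)
    for A in A_list:
        for j in range(2 * d + 1):
            for r in range(max(0, j - d), min(j, d) + 1):
                c[j] += np.sum(A[r] * A[j - r])
    return c
```

The range of \(r\) in the inner loop is exactly the \(\max (0, j-6) \le r \le \min (j, 6)\) window described above.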

The calculation of the coefficients \(c_j\) requires far more matrix multiplications than the gradient calculation. When implemented in TensorFlow, computing the \(c_j\) is approximately seven times more expensive than computing the gradient.
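
Once the \(c_j\) are known, the exact line search reduces to minimizing a univariate degree-12 polynomial. As an illustrative sketch (our own simplified selection rule, not necessarily the paper's exact one), the step can be found among the nonnegative stationary points of \(p\):

```python
import numpy as np

def exact_line_search(c):
    """Given coefficients c[0..12] of p(t) = sum_j c_j t^j (Eq. (30)),
    return a step t >= 0 minimizing p among t = 0 and the positive real
    stationary points p'(t) = 0."""
    # coefficients of p'(t), highest degree first, as np.roots expects
    dp = np.array([j * c[j] for j in range(len(c) - 1, 0, -1)], dtype=float)
    roots = np.roots(dp)
    cands = [0.0] + [r.real for r in roots
                     if abs(r.imag) < 1e-9 and r.real > 0]
    p = np.polynomial.Polynomial(c)
    return min(cands, key=p)
```

For example, for \(p(t) = (t-2)^2\), i.e. coefficients \([4, -4, 1]\), the routine returns the minimizer \(t = 2\).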

Cite this article

Hribar, R., Hrga, T., Papa, G. et al. Four algorithms to solve symmetric multi-type non-negative matrix tri-factorization problem. J Glob Optim 82, 283–312 (2022). https://doi.org/10.1007/s10898-021-01074-3
