Abstract
Linear least squares is one of the most widely used regression methods in many scientific fields. The simplicity of the model allows it to be used when data are scarce, and it appeals to practitioners who need to gain insight into a problem by inspecting the values of the learnt parameters. In this paper we propose a variant of the linear least squares model that allows practitioners to partition the input features into groups of variables that are required to contribute similarly to the final result. We formally show that the new formulation is not convex and provide two alternative methods to deal with the problem: a non-exact method based on an alternating least squares approach, and an exact method based on a reformulation of the problem into an exponential number of sub-problems, the minimum of which is guaranteed to be the optimal solution. We formally show the correctness of the exact method and compare the two approaches, showing that the exact method provides better results in a fraction of the time required by the alternating least squares method (assuming that the number of partitions is small).
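The abstract does not spell out the objective, so the following is only a sketch of how an exact method of this kind could look under an assumed formulation: each feature m in group k receives a weight beta[k] * alpha[m], with alpha non-negative and summing to one within each group. Under that assumption, all weights in a group share the sign of beta[k], so one can enumerate the 2^K sign patterns, solve a convex non-negative least squares problem for each, and keep the best. The function name `partitioned_ls_exact`, the `groups` argument, and the choice of `scipy.optimize.nnls` as the sub-problem solver are illustrative assumptions, not the authors' implementation.

```python
import itertools

import numpy as np
from scipy.optimize import nnls


def partitioned_ls_exact(X, y, groups):
    """Brute-force sketch: one NNLS sub-problem per sign pattern of the K groups.

    Assumed model (not stated in the abstract): weight of feature m in
    group k is beta[k] * alpha[m], alpha >= 0, alpha sums to 1 per group.
    """
    n_features = X.shape[1]

    # Map every column index to the group it belongs to.
    group_of = np.empty(n_features, dtype=int)
    for k, cols in enumerate(groups):
        group_of[cols] = k

    best = None
    for signs in itertools.product((1.0, -1.0), repeat=len(groups)):
        # Flip the columns of each group by the sign assumed for its beta,
        # then require non-negative coefficients: a convex sub-problem.
        col_signs = np.asarray(signs)[group_of]
        u, rnorm = nnls(X * col_signs, y)
        if best is None or rnorm < best[0]:
            best = (rnorm, np.asarray(signs), u)

    rnorm, signs, u = best

    # Split the non-negative coefficients back into alpha (normalised
    # within each group) and beta (signed group total).
    alpha = np.empty(n_features)
    beta = np.empty(len(groups))
    for k, cols in enumerate(groups):
        total = u[cols].sum()
        beta[k] = signs[k] * total
        alpha[cols] = u[cols] / total if total > 0 else 1.0 / len(cols)
    return alpha, beta, rnorm
```

The loop solves 2^K sub-problems, which is why an approach like this is only attractive when the number of groups K is small, in line with the caveat at the end of the abstract.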
Notes
1. In this informal argument we are assuming that each convex problem requires about the same amount of time to be solved. While this is not guaranteed, we believe that it is very unlikely that deviations from this assumption would lead to situations very different from the ones outlined in the argument.
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Esposito, R., Cerrato, M., Locatelli, M. (2019). Partitioned Least Squares. In: Alviano, M., Greco, G., Scarcello, F. (eds) AI*IA 2019 – Advances in Artificial Intelligence. AI*IA 2019. Lecture Notes in Computer Science, vol 11946. Springer, Cham. https://doi.org/10.1007/978-3-030-35166-3_13
DOI: https://doi.org/10.1007/978-3-030-35166-3_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-35165-6
Online ISBN: 978-3-030-35166-3
eBook Packages: Computer Science, Computer Science (R0)