Abstract
In this paper, several metric learning algorithms are used to predict molecular substructures from mass spectral features: Discriminative Component Analysis (DCA), Large Margin Nearest Neighbor (LMNN), Information-Theoretic Metric Learning (ITML), Principal Component Analysis (PCA), Multidimensional Scaling (MDS) and Isometric Mapping (ISOMAP). The experimental results show that the metric learning algorithms achieve better prediction performance than algorithms based on the Euclidean distance. Compared with the other metric learning algorithms, LMNN performs best in the prediction of eleven substructures.
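The contrast between a Euclidean distance and a learned distance can be illustrated with a small sketch. It is not the authors' implementation: algorithms such as LMNN and ITML learn a Mahalanobis-type distance, which can be written as the Euclidean norm of L(x - y) for a linear map L, whereas here L is simply hand-picked to stand in for a learned one. The toy feature vectors, labels, and the names `learned_distance`, `knn_predict`, and `L_HAND` are assumptions made for illustration only.

```python
import math
from collections import Counter


def transform(L, x):
    """Apply the linear map L (a list of rows) to the vector x."""
    return [sum(l_ij * x_j for l_ij, x_j in zip(row, x)) for row in L]


def learned_distance(L, x, y):
    """Mahalanobis-type distance ||L(x - y)||_2.

    With L equal to the identity matrix this is the Euclidean distance.
    """
    diff = [a - b for a, b in zip(x, y)]
    return math.sqrt(sum(v * v for v in transform(L, diff)))


def knn_predict(L, train_X, train_y, query, k=3):
    """Majority vote among the k training points closest to `query` under L."""
    ranked = sorted(
        zip(train_X, train_y),
        key=lambda pair: learned_distance(L, pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Invented toy data: feature 0 separates the classes (substructure absent = 0,
# present = 1), while feature 1 is large-scale noise unrelated to the label.
train_X = [[0.0, 5.0], [0.1, -4.0], [0.2, -3.0],
           [1.0, 4.5], [0.9, 4.0], [1.1, 0.0]]
train_y = [0, 0, 0, 1, 1, 1]
query = [0.15, 4.8]  # belongs with class 0 according to feature 0

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]  # plain Euclidean distance
L_HAND = [[1.0, 0.0], [0.0, 0.01]]   # hand-picked stand-in for a learned L

pred_euclid = knn_predict(IDENTITY, train_X, train_y, query)
pred_metric = knn_predict(L_HAND, train_X, train_y, query)
print("Euclidean k-NN prediction:", pred_euclid)
print("Rescaled-metric k-NN prediction:", pred_metric)
```

On this toy data the Euclidean classifier is misled by the noisy feature and predicts class 1, while the rescaled metric predicts class 0. A real metric learning algorithm would estimate L from labeled spectra rather than fix it by hand.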
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Zhang, ZS., Cao, LL., Zhang, J. (2014). Prediction of Molecular Substructure Using Mass Spectral Data Based on Metric Learning. In: Huang, DS., Han, K., Gromiha, M. (eds) Intelligent Computing in Bioinformatics. ICIC 2014. Lecture Notes in Computer Science, vol 8590. Springer, Cham. https://doi.org/10.1007/978-3-319-09330-7_30
DOI: https://doi.org/10.1007/978-3-319-09330-7_30
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-09329-1
Online ISBN: 978-3-319-09330-7
eBook Packages: Computer Science (R0)