DOI: 10.1145/3416921.3416931
research-article

Utilizing cost-sensitive machine learning classifiers to identify compounds that inhibit Alzheimer's APP translation

Published: 24 September 2020

Abstract

Virtual screening of bioassay data can help identify compounds that restrict the production of amyloid beta peptides (Aβ), observed in Alzheimer's patients, by inhibiting the translation of amyloid precursor protein (APP). Machine learning classifiers can be applied to the dataset to find those compounds. The proportion of active molecules that inhibit APP translation, however, is very small compared with their inactive counterparts. This imbalance between the two classes is handled by introducing cost-sensitivity, which reweights the training instances according to the misclassification cost assigned to each class. The paper reports the performance of cost-sensitive classifiers (Random Forest, Naive Bayes, and Logistic Regression) in separating the minority (active) molecules from the majority (inactive) class. Sensitivity, specificity, false negative rate, ROC area, and accuracy are evaluated while the false positive rate is held at 20.6%. The aim of the study is to identify the most reliable classifier for the bioassay data and to explore the ideal misclassification cost. Random Forest was the most robust model compared with Naive Bayes and Logistic Regression. Moreover, each classifier had a different optimal misclassification cost.
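
The reweighting idea and the evaluation metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `cost_weights` and `confusion_metrics`, the toy labels, and the cost values (4.0 for a missed active, 1.0 for a false alarm) are assumptions chosen only for illustration.

```python
def cost_weights(labels, costs):
    """Weight each instance by the misclassification cost of its class,
    rescaled so that the weights sum to the number of instances."""
    raw = [costs[y] for y in labels]
    scale = len(labels) / sum(raw)
    return [w * scale for w in raw]


def confusion_metrics(y_true, y_pred, positive=1):
    """Compute the metrics listed in the abstract from a binary confusion matrix."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return {
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "fn_rate": fn / (tp + fn),       # 1 - sensitivity
        "fp_rate": fp / (tn + fp),       # 1 - specificity
        "accuracy": (tp + tn) / len(pairs),
    }


# Hypothetical toy data: 2 active (1) and 8 inactive (0) molecules.
labels = [1, 1] + [0] * 8

# Assumed costs: missing an active is 4x as costly as a false alarm.
weights = cost_weights(labels, {1: 4.0, 0: 1.0})

# Hypothetical predictions from some classifier trained with those weights.
predictions = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
metrics = confusion_metrics(labels, predictions)

print(weights[:3])
print(metrics)
```

Under these assumed costs, the two active instances together carry as much total weight as the eight inactive ones, which is how a cost ratio can offset class imbalance. Many learners accept such per-instance weights at training time; the study's goal of finding the ideal misclassification cost would correspond to varying the cost ratio and comparing the resulting metrics at a fixed false positive rate.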


Cited By

  • (2024) Balancing Imbalanced Toxicity Models: Using MolBERT with Focal Loss. AI in Drug Discovery, pp. 82-97. DOI: 10.1007/978-3-031-72381-0_8. Online publication date: 19-Sep-2024

    Published In

    ICCBDC '20: Proceedings of the 2020 4th International Conference on Cloud and Big Data Computing
    August 2020
    130 pages
    ISBN: 9781450375382
    DOI: 10.1145/3416921

    In-Cooperation

    • Oxford Brookes University
    • Staffordshire University
    • University of Liverpool

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Alzheimer's Disease
    2. Classification
    3. Cost Sensitivity
    4. Logistic Regression
    5. Naive Bayes
    6. Primary Screen Bioassay
    7. Random Forest

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    ICCBDC '20
