Research article. DOI: 10.1145/2389707.2389709

Moving from descriptive to causal analytics: case study of discovering knowledge from US Health Indicators Warehouse

Published: 29 October 2012

Abstract

The knowledge management community has introduced a multitude of methods for knowledge discovery on large datasets. In the context of public health intelligence, we integrate several of these methods into an analyst's workflow that proceeds from the data-centric descriptive level of analysis to the model-centric causal level of reasoning. We present several case studies of the proposed workflow as applied to the US Health Indicators Warehouse (HIW), a medium-scale public dataset of community health information collected by the US federal government. In these case studies, we demonstrate a series of visual analytics efforts targeted at the HIW, including visual analysis of correlation matrices, multivariate outlier analysis, multiple linear regression of Medicare costs, confirmatory factor analysis, and a hybrid scatterplot and heatmap visualization of the distributions of a group of health indicators. We conclude by sketching a preliminary framework for examining causal dependence hypotheses in future data science research in public health.
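As a rough illustration of the descriptive end of such a workflow, the sketch below computes a Pearson correlation matrix over a few indicators and lists the strongly correlated pairs that an analyst might examine further. It is a minimal sketch and not the authors' implementation: the indicator names, the values, and the 0.8 threshold are all made up for illustration and are not taken from the HIW.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """Nested dict: matrix[a][b] is the correlation of columns a and b."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names}
            for a in names}


def strong_pairs(matrix, threshold=0.8):
    """Unordered indicator pairs whose |r| meets the (assumed) threshold."""
    names = list(matrix)
    return [(a, b, matrix[a][b])
            for i, a in enumerate(names)
            for b in names[i + 1:]
            if abs(matrix[a][b]) >= threshold]


# Hypothetical indicators for five regions; values are invented.
indicators = {
    "smoking_rate": [12.0, 15.0, 18.0, 21.0, 24.0],
    "medicare_cost": [8.1, 8.9, 10.2, 10.8, 12.0],
    "physicians_per_100k": [90.0, 70.0, 85.0, 60.0, 75.0],
}

matrix = correlation_matrix(indicators)
for a, b, r in strong_pairs(matrix):
    print(f"{a} ~ {b}: r = {r:.3f}")
```

A strong pairwise correlation found this way is only a descriptive observation; whether it reflects a causal dependence is the kind of question the model-centric stages of the workflow are meant to address.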

    Published In

    SHB '12: Proceedings of the 2012 International Workshop on Smart Health and Wellbeing
    October 2012
    72 pages
    ISBN: 9781450317122
    DOI: 10.1145/2389707

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tag

    1. analysis

    Conference

    CIKM'12