DOI: 10.1145/2389707.2389709
research-article

Moving from descriptive to causal analytics: case study of discovering knowledge from US Health Indicators Warehouse

Published: 29 October 2012

Abstract

The knowledge management community has introduced a multitude of methods for knowledge discovery on large datasets. In the context of public health intelligence, we integrated some of these methods into an analyst's workflow that proceeds from the data-centric descriptive level of analysis to the model-centric causal level of reasoning. We present several case studies of the proposed workflow as applied to the US Health Indicators Warehouse (HIW), a medium-scale public dataset of community health information collected by the US federal government. In these case studies, we demonstrate a series of visual analytics efforts targeted at the HIW, including visual analysis with correlation matrices, multivariate outlier analysis, multiple linear regression of Medicare costs, confirmatory factor analysis, and hybrid scatterplot and heatmap visualization of the distributions of a group of health indicators. We conclude by sketching a preliminary framework for examining causal dependence hypotheses in future data science research in public health.
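
As a rough illustration of the descriptive end of this workflow, the following Python sketch computes a correlation matrix, flags multivariate outliers by classical Mahalanobis distance, and fits a multiple linear regression by ordinary least squares. It is not the authors' implementation: the function names, the synthetic data, and the cutoff value are all assumptions made for illustration, and no HIW data is used.

```python
import numpy as np


def correlation_matrix(X):
    """Pearson correlation between the columns (indicators) of X."""
    return np.corrcoef(X, rowvar=False)


def mahalanobis_sq(X):
    """Squared Mahalanobis distance of each row from the sample mean.

    Uses the classical (non-robust) mean and covariance estimates,
    which is a simplifying assumption of this sketch.
    """
    centered = X - X.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
    return np.einsum("ij,jk,ik->i", centered, cov_inv, centered)


def flag_outliers(X, threshold):
    """Indices of rows whose squared distance exceeds the threshold."""
    return np.flatnonzero(mahalanobis_sq(X) > threshold)


def fit_linear_regression(X, y):
    """Least-squares fit of y on X; returns [intercept, slope_1, ...]."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


if __name__ == "__main__":
    # Synthetic stand-in for three health indicators (hypothetical data).
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))
    X = np.vstack([X, [10.0, 10.0, 10.0]])  # one planted outlier
    y = 2.0 + 3.0 * X[:, 0] - X[:, 1] + rng.normal(scale=0.1, size=len(X))

    print(correlation_matrix(X))
    # 9.35 is roughly the 97.5% chi-square quantile for 3 variables.
    print(flag_outliers(X, threshold=9.35))
    print(fit_linear_regression(X, y))
```

The classical covariance estimate can be distorted by the very outliers it is meant to detect, so a robust estimator may be preferable on real data.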


Cited By

  • (2012) SHB 2012. Proceedings of the 21st ACM international conference on Information and knowledge management, 2762-2763. DOI: 10.1145/2396761.2398756. Online publication date: 29-Oct-2012

Information

Published In

SHB '12: Proceedings of the 2012 international workshop on Smart health and wellbeing
October 2012
72 pages
ISBN:9781450317122
DOI:10.1145/2389707
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tag

  1. analysis

Conference

CIKM'12