Short paper · DOI: 10.1145/2110363.2110452

Combining NLP with evidence-based methods to find text metrics related to perceived and actual text difficulty

Published: 28 January 2012

Abstract

Measuring text difficulty is prevalent in health informatics since it is useful for information personalization and optimization. Unfortunately, it is uncertain how best to compute difficulty so that it relates to reader understanding. We aim to create computational, evidence-based metrics of perceived and actual text difficulty. We start with a corpus analysis to identify candidate metrics which are further tested in user studies. Our corpus contains blogs and journal articles (N=1,073) representing easy and difficult text. Using natural language processing, we calculated base grammatical and semantic metrics, constructed new composite metrics (noun phrase complexity and semantic familiarity), and measured the commonly used Flesch-Kincaid grade level. The metrics differed significantly between document types. Nouns were more prevalent but less familiar in difficult text; verbs and function words were more prevalent in easy text. Noun phrase complexity was lower, semantic familiarity was higher and grade levels were lower in easy text. Then, all metrics were tested for their relation to perceived and actual difficulty using follow-up analyses of two user studies conducted earlier. Base metrics and noun phrase complexity correlated significantly with perceived difficulty and could help explain actual difficulty.
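For readers unfamiliar with the Flesch-Kincaid grade level used as the baseline above, the following is a minimal sketch of how it can be computed. The formula constants are the standard published ones; the sentence splitter, the word tokenizer, and the `count_syllables` heuristic are simplifying assumptions for illustration and are not the authors' NLP pipeline.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): count vowel groups, drop a silent final 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level:
    0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    """
    # Naive sentence and word segmentation; a real pipeline would use an NLP toolkit.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return 0.39 * len(words) / len(sentences) + 11.8 * syllables / len(words) - 15.59


easy = "The cat sat on the mat. It was a good day."
hard = "Pharmacological interventions necessitate comprehensive evaluation of contraindications."
print(flesch_kincaid_grade(easy) < flesch_kincaid_grade(hard))
```

Because the score depends only on sentence length and syllable counts, it cannot see noun phrase structure or term familiarity, which is the gap the composite metrics in the abstract are meant to address.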

      Published In

      IHI '12: Proceedings of the 2nd ACM SIGHIT International Health Informatics Symposium
      January 2012
      914 pages
      ISBN: 9781450307819
      DOI: 10.1145/2110363
      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. actual difficulty
      2. health informatics
      3. natural language processing
      4. perceived difficulty
      5. readability

      Qualifiers

      • Short-paper

      Conference

      IHI '12: ACM International Health Informatics Symposium
      January 28 - 30, 2012
      Miami, Florida, USA

      Cited By

      • (2024) Text Simplification System for Legal Contract Review. Advances in Information and Communication, pp. 105-123. DOI: 10.1007/978-3-031-53960-2_8. Online publication date: 21-Mar-2024
      • (2021) Automated Readability Assessment for Spanish e-Government Information. Journal of Information Systems Engineering and Management 6:2, em0137. DOI: 10.29333/jisem/9620. Online publication date: 2021
      • (2021) Psycholinguistic Markers of COVID-19 Conspiracy Tweets and Predictors of Tweet Dissemination. Health Communication 38:1, pp. 21-30. DOI: 10.1080/10410236.2021.1929691. Online publication date: 20-May-2021
      • (2020) Readability of Spanish e-government information. 2020 15th Iberian Conference on Information Systems and Technologies (CISTI), pp. 1-4. DOI: 10.23919/CISTI49556.2020.9141000. Online publication date: Jun-2020
      • (2019) An Informatics Framework to Assess Consumer Health Language Complexity Differences: A Proof-of-Concept Study (Preprint). Journal of Medical Internet Research. DOI: 10.2196/16795. Online publication date: 31-Oct-2019
      • (2018) A cloud-based framework for large-scale traditional Chinese medical record retrieval. Journal of Biomedical Informatics 77, pp. 21-33. DOI: 10.1016/j.jbi.2017.11.013. Online publication date: Jan-2018
      • (2017) NegAIT. Journal of Biomedical Informatics 69:C, pp. 55-62. DOI: 10.1016/j.jbi.2017.03.014. Online publication date: 1-May-2017
      • (2017) Measuring text difficulty using parse-tree frequency. Journal of the Association for Information Science and Technology 68:9, pp. 2088-2100. DOI: 10.1002/asi.23855. Online publication date: 1-Sep-2017
      • (2015) Diversity-aware retrieval of medical records. Computers in Industry 69:C, pp. 81-91. DOI: 10.1016/j.compind.2014.09.004. Online publication date: 1-May-2015
      • (2014) The effect of word familiarity on actual and perceived text difficulty. Journal of the American Medical Informatics Association 21:e1, pp. e169-e172. DOI: 10.1136/amiajnl-2013-002172. Online publication date: 1-Feb-2014
