Keynote

Human-centered Multimodal Machine Intelligence

Published: 22 October 2020. DOI: 10.1145/3382507.3417974

Abstract

Multimodal machine intelligence offers enormous possibilities for understanding the human condition and for creating technologies that support and enhance human experiences [1, 2]. What makes such approaches and systems exciting is the promise they hold for adaptation and personalization in the presence of the rich and vast heterogeneity, variety and diversity within and across people. Multimodal engineering approaches can help analyze human traits (e.g., age), states (e.g., emotion), and behavior dynamics (e.g., interaction synchrony) objectively and at scale. Machine intelligence could also help detect and analyze deviations from patterns deemed typical. These techniques in turn can assist, facilitate or enhance decision making by humans and by autonomous systems. Realizing this promise requires addressing two major, often intertwined, lines of challenges: creating inclusive technologies that work for everyone, while enabling tools that can illuminate the sources of variability or difference of interest.
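To make the "deviation from typical" idea concrete, here is a toy sketch (my illustration, not a method from the talk): a behavioral measure from an individual is z-scored against a normative reference sample, and large deviations are flagged for human review. The variable names, values and threshold are illustrative assumptions.

```python
# Toy sketch: flag behavioral measurements that deviate strongly from a
# normative ("typical") reference sample, using a simple z-score.
import numpy as np

def deviation_flags(normative, observed, threshold=2.0):
    """Return z-scores and flags for observations more than
    `threshold` standard deviations from the normative mean."""
    mu, sigma = normative.mean(), normative.std(ddof=1)
    z = (observed - mu) / sigma
    return z, np.abs(z) > threshold

# Hypothetical normative sample of a vocal-intonation variability measure.
normative = np.array([0.80, 1.10, 0.90, 1.00, 1.20, 0.95])
observed = np.array([1.05, 2.40])
z, flags = deviation_flags(normative, observed)
print(z.round(2), flags)   # [0.41 9.86] [False  True]
```

In practice such screening would rest on validated clinical measures and far richer models; the point is only that "typicality" is operationalized against a reference population.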
This talk will highlight some of these possibilities and opportunities through examples drawn from two specific domains. The first relates to advancing health informatics in behavioral and mental health [3, 4]. With over 10% of the world's population affected, and with clinical research and practice heavily dependent on (relatively scarce) human expertise in diagnosing, managing and treating these conditions, the engineering opportunities in offering access and tools to support care at scale are immense. For example, in determining whether a child is on the Autism spectrum, a clinician engages and observes the child in a series of interactive activities, targeting relevant cognitive, communicative and socio-emotional aspects, and codifies specific patterns of interest (e.g., typicality of vocal intonation, facial expressions, joint attention behavior). Machine-intelligence-driven processing of speech, language, visual and physiological data, combined with other forms of clinical data, enables novel and objective ways of supporting and scaling up these diagnostics. Likewise, multimodal systems can automate the analysis of a psychotherapy session, including computing treatment quality-assurance measures (e.g., rating a therapist's expressed empathy). These technology possibilities can extend beyond the traditional realm of clinics, directly to patients in their natural settings. For example, remote multimodal sensing of biobehavioral cues can enable new ways of screening and tracking behaviors (e.g., stress in the workplace) and progress in treatment (e.g., for depression), and offer just-in-time support.
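As one deliberately simplified illustration of combining modalities for such diagnostics, the sketch below trains one classifier per modality and averages their predicted probabilities, a standard late-fusion scheme. This is not the speaker's pipeline: the modality names, feature shapes and labels are synthetic stand-ins, and real clinical feature extraction is assumed to happen upstream.

```python
# Minimal late-fusion sketch: one classifier per modality, probabilities
# averaged into a single screening score. Synthetic data for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_late_fusion(modality_features, labels):
    """Fit one classifier per modality (e.g., speech, language, visual)."""
    return {name: LogisticRegression(max_iter=1000).fit(X, labels)
            for name, X in modality_features.items()}

def fused_score(models, modality_features):
    """Average positive-class probabilities across modalities."""
    probs = [model.predict_proba(modality_features[name])[:, 1]
             for name, model in models.items()]
    return np.mean(probs, axis=0)

rng = np.random.default_rng(0)
features = {"speech": rng.normal(size=(40, 8)),      # e.g., prosodic features
            "language": rng.normal(size=(40, 5)),    # e.g., lexical features
            "visual": rng.normal(size=(40, 6))}      # e.g., gaze/expression
labels = rng.integers(0, 2, size=40)                 # clinician-assigned labels
models = train_late_fusion(features, labels)
print(fused_score(models, features)[:5])             # fused scores in [0, 1]
```

Late fusion is just one design point; feature-level (early) fusion or joint multimodal models are equally plausible here.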
The second example is drawn from the world of media. Media are created by humans, for humans, to tell stories. They cover an amazing range of domains, from the arts and entertainment to news, education and commerce, and in staggering volume. Machine intelligence tools can help analyze media and measure their impact on individuals and society. This includes offering objective insights into diversity and inclusion in media representations, by robustly characterizing media portrayals from an intersectional perspective along relevant dimensions of inclusion: gender, race, age, ability and other attributes, and by creating tools to support change [5, 6]. Again, this underscores the twin technology requirements: to perform equally well in characterizing individuals regardless of the dimensions of variability, and to use those inclusive technologies to shine light on, and create tools to support, diversity and inclusion.
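One of the simplest portrayal measures of this kind is share of speaking time per group. The sketch below computes it from diarized dialogue segments; the segment tuples and group labels are hypothetical stand-ins for the outputs of upstream diarization and attribute classifiers, which are not shown.

```python
# Sketch: share of speaking time per (predicted) group in a piece of media,
# given diarized segments labeled by an upstream attribute classifier.
from collections import defaultdict

def speaking_time_share(segments):
    """segments: iterable of (start_sec, end_sec, predicted_group)."""
    totals = defaultdict(float)
    for start, end, group in segments:
        totals[group] += end - start
    grand_total = sum(totals.values())
    return {group: t / grand_total for group, t in totals.items()}

# Three dialogue segments from a synthetic film scene.
segments = [(0.0, 12.5, "female"), (12.5, 40.0, "male"), (40.0, 55.0, "female")]
print(speaking_time_share(segments))   # {'female': 0.5, 'male': 0.5}
```

Analogous measures (screen time, centrality in the character network, sentiment of language used about a group) follow the same count-and-normalize pattern over model outputs.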

References

[1]
Signal Analysis and Interpretation Laboratory. https://sail.usc.edu/
[2]
Speech Production and Articulation Knowledge Group. https://sail.usc.edu/span/
[3]
Daniel Bone, Chi-Chun Lee, Theodora Chaspari, James Gibson, and Shrikanth Narayanan. Signal Processing and Machine Learning for Mental Health Research and Clinical Applications. IEEE Signal Processing Magazine, 34(5):189--196, September 2017.
[4]
Shrikanth Narayanan and Panayiotis Georgiou. Behavioral Signal Processing: Deriving Human Behavioral Informatics from Speech and Language. Proceedings of the IEEE, 101(5):1203--1233, May 2013.
[5]
Tanaya Guha, Che-Wei Huang, Naveen Kumar, Yan Zhu, and Shrikanth Narayanan. Gender Representation in Cinematic Content: A Multimodal Approach. In Proceedings of the 17th ACM International Conference on Multimodal Interaction (ICMI), 2015.
[6]
Anil Ramakrishna, Victor Martínez, Nikolaos Malandrakis, Karan Singla, and Shrikanth Narayanan. Linguistic analysis of differences in portrayal of movie characters. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (ACL), 2017.

    Published In

    ICMI '20: Proceedings of the 2020 International Conference on Multimodal Interaction
    October 2020
    920 pages
    ISBN: 9781450375818
    DOI: 10.1145/3382507
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

    Publisher

    Association for Computing Machinery, New York, NY, United States

    Author Tags

    1. behavior
    2. computational psychology
    3. diversity and inclusion
    4. emotion
    5. human signals
    6. language
    7. media intelligence
    8. mental health
    9. speech

    Qualifiers

    • Keynote

    Funding Sources

    • Simons Foundation
    • Google
    • NSF
    • NIH

    Conference

    ICMI '20
    Sponsor:
    ICMI '20: INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION
    October 25-29, 2020
    Virtual Event, Netherlands

    Acceptance Rates

    Overall Acceptance Rate 453 of 1,080 submissions, 42%
