DOI: 10.1145/2462456.2465426
research-article

SocioPhone: everyday face-to-face interaction monitoring platform using multi-phone sensor fusion

Published: 25 June 2013

Abstract

In this paper, we propose SocioPhone, a novel initiative to build a mobile platform for face-to-face interaction monitoring. Face-to-face interaction, especially conversation, is a fundamental part of everyday life. Interaction-aware applications aimed at facilitating group conversations have been proposed, but have not yet proliferated. Useful contexts to capture and support face-to-face interactions need to be explored more deeply. More importantly, recognizing delicate conversational contexts with commodity mobile devices requires solving a number of technical challenges. As a first step to address such challenges, we identify useful meta-linguistic contexts of conversation, such as turn-taking, prosodic features, a dominant participant, and pace. These serve as cornerstones for building a variety of interaction-aware applications. SocioPhone abstracts such useful meta-linguistic contexts as a set of intuitive APIs. Its runtime efficiently monitors registered contexts during in-progress conversations and notifies applications on the fly. Importantly, we have noticed that online turn monitoring is the basic building block for extracting diverse meta-linguistic contexts, and have devised a novel volume-topography-based method. We show the usefulness of SocioPhone with several interesting applications: SocioTherapist, SocioDigest, and Tug-of-War. Also, we show that our turn-monitoring technique is highly accurate and energy-efficient under diverse real-life situations.
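
The abstract names a volume-topography-based method for turn monitoring but does not give its details. The following is only an illustrative sketch of the general idea, under our own assumptions: each phone reports a volume level per time frame, the vector of volumes across phones (the "topography") is characteristic of whoever is speaking, and a frame is labeled by the nearest per-speaker average vector. The nearest-centroid classifier, the function names, and the silence threshold are all hypothetical and are not taken from the paper.

```python
import math

def normalize(volumes):
    """Scale a per-phone volume vector to unit length so overall loudness cancels out."""
    norm = math.sqrt(sum(v * v for v in volumes))
    return [v / norm for v in volumes] if norm > 0 else list(volumes)

def train_topographies(labeled_frames):
    """Average the normalized volume vectors observed for each speaker (hypothetical training step)."""
    sums, counts = {}, {}
    for speaker, volumes in labeled_frames:
        vec = normalize(volumes)
        acc = sums.setdefault(speaker, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[speaker] = counts.get(speaker, 0) + 1
    return {s: [v / counts[s] for v in acc] for s, acc in sums.items()}

def classify_frame(volumes, topographies, silence_threshold=0.05):
    """Return the speaker whose average topography is closest, or None for silence.

    silence_threshold is an assumed value, not one reported in the paper.
    """
    if max(volumes) < silence_threshold:
        return None
    vec = normalize(volumes)
    return min(topographies, key=lambda s: math.dist(vec, topographies[s]))

def detect_turns(frames, topographies):
    """Collapse per-frame speaker labels into (speaker, first_frame, last_frame) turns."""
    turns = []
    for idx, volumes in enumerate(frames):
        speaker = classify_frame(volumes, topographies)
        if turns and turns[-1][0] == speaker:
            turns[-1] = (speaker, turns[-1][1], idx)
        else:
            turns.append((speaker, idx, idx))
    # Silent stretches are dropped from the reported turns.
    return [t for t in turns if t[0] is not None]
```

Normalizing each vector means a speaker who talks more quietly still maps to the same topography, which is one plausible reason a volume-only feature could separate speakers without any speaker-recognition model. Contexts such as a dominant participant could then be derived by aggregating the turns returned by `detect_turns`.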

      Published In

      MobiSys '13: Proceedings of the 11th annual international conference on Mobile systems, applications, and services
      June 2013
      568 pages
      ISBN: 9781450316729
      DOI: 10.1145/2462456

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. conversation
      2. face-to-face
      3. interaction
      4. mobile
      5. platform
      6. social
      7. volume topography

      Qualifiers

      • Research-article

      Conference

      MobiSys'13

      Acceptance Rates

      MobiSys '13 Paper Acceptance Rate 33 of 211 submissions, 16%;
      Overall Acceptance Rate 274 of 1,679 submissions, 16%

      Article Metrics

      • Downloads (Last 12 months): 98
      • Downloads (Last 6 weeks): 9
      Reflects downloads up to 31 Dec 2024

      Cited By

      • (2024) Malicious Attacks against Multi-Sensor Fusion in Autonomous Driving. Proceedings of the 30th Annual International Conference on Mobile Computing and Networking, 10.1145/3636534.3649372 (436-451). Online publication date: 29-May-2024
      • (2024) Open Sesame? Open Salami! Personalizing Vocabulary Assessment-Intervention for Children via Pervasive Profiling and Bespoke Storybook Generation. Proceedings of the CHI Conference on Human Factors in Computing Systems, 10.1145/3613904.3642580 (1-32). Online publication date: 11-May-2024
      • (2024) The Social Journal: Investigating Technology to Support and Reflect on Social Interactions. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 10.1145/3613904.3642411 (1-18). Online publication date: 11-May-2024
      • (2022) YONO: Modeling Multiple Heterogeneous Neural Networks on Microcontrollers. 2022 21st ACM/IEEE International Conference on Information Processing in Sensor Networks (IPSN), 10.1109/IPSN54338.2022.00030 (285-297). Online publication date: May-2022
      • (2022) Recognition of interactive human groups from mobile sensing data. Computer Communications, 10.1016/j.comcom.2022.04.028, 191 (208-216). Online publication date: Jul-2022
      • (2021) Synchronized Data Collection for Human Group Recognition. Sensors, 10.3390/s21217094, 21:21 (7094). Online publication date: 26-Oct-2021
      • (2021) Hivemind. Proceedings of the 19th Annual International Conference on Mobile Systems, Applications, and Services, 10.1145/3458864.3466626 (467-482). Online publication date: 24-Jun-2021
      • (2020) Speech Discrimination in Real-World Group Communication Using Audio-Motion Multimodal Sensing. Sensors, 10.3390/s20102948, 20:10 (2948). Online publication date: 22-May-2020
      • (2020) MAMAS: Supporting Parent-Child Mealtime Interactions Using Automated Tracking and Speech Recognition. Proceedings of the ACM on Human-Computer Interaction, 10.1145/3392876, 4:CSCW1 (1-32). Online publication date: 29-May-2020
      • (2020) Group Behavior Recognition. Human Behavior Analysis: Sensing and Understanding, 10.1007/978-981-15-2109-6_6 (139-218). Online publication date: 1-Mar-2020
