10.5555/2021975.2021992
Article

SpeakerSense: energy efficient unobtrusive speaker identification on mobile phones

Published: 12 June 2011

Abstract

Automatically identifying the person you are talking with using continuous audio sensing has the potential to enable many pervasive computing applications from memory assistance to annotating life logging data. However, a number of challenges, including energy efficiency and training data acquisition, must be addressed before unobtrusive audio sensing is practical on mobile devices. We built SpeakerSense, a speaker identification prototype that uses a heterogeneous multi-processor hardware architecture that splits computation between a low power processor and the phone's application processor to enable continuous background sensing with minimal power requirements. Using SpeakerSense, we benchmarked several system parameters (sampling rate, GMM complexity, smoothing window size, and amount of training data needed) to identify thresholds that balance computation cost with performance. We also investigated channel compensation methods that make it feasible to acquire training data from phone calls and an automatic segmentation method for training speaker models based on one-to-one conversations.
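
The abstract names GMM complexity and smoothing window size as tuned parameters, but this page gives no implementation detail. As a rough illustration only, the sketch below scores feature frames against per-speaker diagonal-covariance Gaussian mixture models and then smooths the per-frame decisions with a sliding majority vote. The speaker names, the one-dimensional features, the model parameters, and the majority-vote rule are all assumptions made for this example; they are not taken from the paper.

```python
import math
from collections import Counter


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x, gmm):
    """Log-likelihood of x under a GMM given as (weight, mean, var) triples."""
    terms = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
    peak = max(terms)  # log-sum-exp keeps the sum numerically stable
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def classify_frame(x, models):
    """Return the speaker whose GMM gives frame x the highest likelihood."""
    return max(models, key=lambda name: gmm_log_likelihood(x, models[name]))


def smooth_labels(labels, window):
    """Majority vote over the most recent `window` per-frame labels."""
    smoothed = []
    for i in range(len(labels)):
        recent = labels[max(0, i - window + 1): i + 1]
        smoothed.append(Counter(recent).most_common(1)[0][0])
    return smoothed


# Hypothetical speaker models: "alice" has two mixture components, "bob" one.
models = {
    "alice": [(0.5, [0.0], [1.0]), (0.5, [1.0], [1.0])],
    "bob": [(1.0, [5.0], [1.0])],
}

# Hypothetical 1-D feature frames; the third frame is an isolated outlier.
frames = [[0.1], [0.2], [4.0], [-0.3], [0.0], [5.1], [4.9], [5.2]]

raw_labels = [classify_frame(f, models) for f in frames]
smoothed_labels = smooth_labels(raw_labels, window=3)
```

In this toy run the isolated outlier frame is misattributed when frames are classified one at a time, and the three-frame vote suppresses it. A longer window suppresses more isolated errors but reacts more slowly to a real change of speaker, which is the kind of trade-off the benchmarked window size governs.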


    Published In

    Pervasive'11: Proceedings of the 9th international conference on Pervasive computing
    June 2011
    369 pages
    ISBN:9783642217258

    Publisher

    Springer-Verlag

    Berlin, Heidelberg

    Author Tags

    1. continuous audio sensing
    2. energy efficiency
    3. heterogeneous multi-processor hardware
    4. mobile phones
    5. speaker identification

    Cited By

    • (2021) A Survey on Deep Learning for Human Activity Recognition. ACM Computing Surveys 54:8 (1-34). doi:10.1145/3472290. Online publication date: 4-Oct-2021
    • (2019) ABACUS. Proceedings of the 4th ACM/IEEE Symposium on Edge Computing (395-400). doi:10.1145/3318216.3363376. Online publication date: 7-Nov-2019
    • (2019) SoundSemantics. Proceedings of the 18th International Conference on Information Processing in Sensor Networks (217-228). doi:10.1145/3302506.3310402. Online publication date: 16-Apr-2019
    • (2018) Vietnamese Speaker Authentication Using Deep Models. Proceedings of the 9th International Symposium on Information and Communication Technology (177-184). doi:10.1145/3287921.3287954. Online publication date: 6-Dec-2018
    • (2018) Vocal Resonance. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 2:1 (1-23). doi:10.1145/3191751. Online publication date: 26-Mar-2018
    • (2018) rConverse. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 2:1 (1-27). doi:10.1145/3191734. Online publication date: 26-Mar-2018
    • (2017) Low-resource Multi-task Audio Sensing for Mobile and Embedded Devices via Shared Deep Neural Network Representations. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 1:3 (1-19). doi:10.1145/3131895. Online publication date: 11-Sep-2017
    • (2017) Toward Accurate and Efficient Feature Selection for Speaker Recognition on Wearables. Proceedings of the 2017 Workshop on Wearable Systems and Applications (41-46). doi:10.1145/3089351.3089352. Online publication date: 19-Jun-2017
    • (2017) Accelerating Mobile Audio Sensing Algorithms through On-Chip GPU Offloading. Proceedings of the 15th Annual International Conference on Mobile Systems, Applications, and Services (306-318). doi:10.1145/3081333.3081358. Online publication date: 16-Jun-2017
    • (2017) Glimpse. Proceedings of the 15th Annual International Conference on Mobile Systems, Applications, and Services (292-305). doi:10.1145/3081333.3081347. Online publication date: 16-Jun-2017
