DOI: 10.1145/641007.641111

Portable meeting recorder

Published: 01 December 2002

Abstract

The design and implementation of a portable meeting recorder is presented. Composed of an omni-directional video camera with four-channel audio capture, the system saves a view of all the activity in a meeting and the directions from which people spoke. Subsequent analysis computes metadata that includes video activity analysis of the compressed data stream and audio processing that helps locate events that occurred during the meeting. Automatic calculation of the room in which the meeting occurred allows for efficient navigation of a collection of recorded meetings. A user interface is populated from the metadata description to allow for simple browsing and location of significant events.
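
The abstract does not say how speaker direction is derived from the four audio channels. As a rough illustration only, the sketch below estimates a direction from the arrival-time difference between two microphones using brute-force cross-correlation. The function names, the two-microphone simplification, the sample rate, and the microphone spacing are all assumptions made for this example and are not taken from the paper.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value at room temperature


def best_lag(a, b, max_lag):
    """Return the lag in samples at which b best matches a delayed copy of a.

    A positive result means the sound reached microphone b after microphone a.
    (Hypothetical helper for illustration, not the authors' method.)
    """
    best, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, x in enumerate(a):
            j = i + lag
            if 0 <= j < len(b):
                score += x * b[j]  # unnormalised cross-correlation at this lag
        if score > best_score:
            best, best_score = lag, score
    return best


def arrival_angle(lag, sample_rate, mic_spacing):
    """Convert a lag in samples to an angle in degrees away from broadside.

    sample_rate is in Hz and mic_spacing in metres; both are assumed values.
    """
    delay = lag / sample_rate
    s = SPEED_OF_SOUND * delay / mic_spacing
    s = max(-1.0, min(1.0, s))  # clamp so rounding or noise cannot break asin
    return math.degrees(math.asin(s))


if __name__ == "__main__":
    # Synthetic example: channel b is channel a delayed by 3 samples.
    a = [0, 0, 1, 2, 1, 0, 0, 0, 0, 0, 0, 0]
    b = [0, 0, 0, 0, 0, 1, 2, 1, 0, 0, 0, 0]
    lag = best_lag(a, b, max_lag=5)
    print(lag, arrival_angle(lag, sample_rate=16000, mic_spacing=0.1))
```

With four channels, the same pairwise estimate could in principle be repeated for several microphone pairs and combined, but how the actual system does this is not stated here.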

Published In

MULTIMEDIA '02: Proceedings of the tenth ACM international conference on Multimedia
December 2002
683 pages
ISBN: 158113620X
DOI: 10.1145/641007
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. MPEG-2 compressed domain analysis
  2. appliance
  3. audio processing
  4. meeting recorder
  5. omni-directional video

Qualifiers

  • Article

Conference

MM02: ACM Multimedia 2002
December 1 - 6, 2002
Juan-les-Pins, France

Acceptance Rates

MULTIMEDIA '02 Paper Acceptance Rate 46 of 330 submissions, 14%;
Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

Article Metrics

  • Downloads (last 12 months): 4
  • Downloads (last 6 weeks): 1
Reflects downloads up to 01 Mar 2025.

Cited By

  • (2023) Videoconference interpreting goes multimodal. Interpreting Technologies – Current and Future Trends, pp. 169-194. DOI: 10.1075/ivitra.37.07zha. Online publication date: 25-Sep-2023
  • (2020) A Robust Tracking-by-Detection Algorithm Using Adaptive Accumulated Frame Differencing and Corner Features. Journal of Imaging 6:4 (25). DOI: 10.3390/jimaging6040025. Online publication date: 21-Apr-2020
  • (2020) Body Movement Synchrony Captured by an Omnidirectional Camera predicts the Degree of Information Transfer during Dialogue: Toward Automatic Evaluation of Verbal Communication. 2020 11th IEEE International Conference on Cognitive Infocommunications (CogInfoCom), pp. 000215-000220. DOI: 10.1109/CogInfoCom50765.2020.9237916. Online publication date: 23-Sep-2020
  • (2018) Post-meeting Curation of Whiteboard Content Captured with Mobile Devices. Proceedings of the 2018 ACM International Conference on Interactive Surfaces and Spaces, pp. 43-54. DOI: 10.1145/3279778.3279782. Online publication date: 19-Nov-2018
  • (2018) Designing prosthetic memory. Proceedings of the 2018 International Conference on Advanced Visual Interfaces, pp. 1-9. DOI: 10.1145/3206505.3206545. Online publication date: 29-May-2018
  • (2016) SmartCamera. Multimedia Tools and Applications 75:13, pp. 7831-7854. DOI: 10.1007/s11042-015-2700-8. Online publication date: 1-Jul-2016
  • (2015) Tools and evaluation methods for discussion and presentation skills training. Smart Learning Environments 2:1. DOI: 10.1186/s40561-015-0011-1. Online publication date: 24-Feb-2015
  • (2015) Biometric-Based User Authentication and Activity Level Detection in a Collaborative Environment. Transparency in Social Media, pp. 165-180. DOI: 10.1007/978-3-319-18552-1_9. Online publication date: 2015
  • (2014) Interactive Multimodal Information Management. Multimodal Interactive Systems Management, pp. 1-17. DOI: 10.1201/b15535-2. Online publication date: 2-Apr-2014
  • (2014) A Smart Meeting Management System With Video Based Seat Detection. Proceedings of International Conference on Internet Multimedia Computing and Service, pp. 232-236. DOI: 10.1145/2632856.2632874. Online publication date: 10-Jul-2014
