Abstract
Evaluation of video summarization approaches requires more information on the user-perceived qualities of different types of summaries. Evaluation measures also need to be developed further in a user-led manner. This article reports on a user-centered evaluation of visual video summaries. Four types of summaries (fastforward, user-controlled fastforward, scene clips, and storyboard) were evaluated with a set of existing performance and satisfaction measures. A repertory grid elicitation was conducted with our participants to gather evaluation constructs related to both video summary content and controls. Results showed a lack of correlation between performance and satisfaction measures. User-supplied evaluation constructs were shown to span both the performance and satisfaction dimensions of the video summary evaluation space. Most constructs achieved moderate to good inter-rater agreement in a subsequent survey. Free descriptions of videos and their respective summaries showed that while users are able to interpret object- and event-related information from short summaries, thematic inference was lacking, leading to poorer descriptions than those given for the full videos.
Additional information
This paper is a substantially revised and extended version of a paper (Evaluation Constructs for Visual Video Summaries) which originally appeared in the Proceedings of the 14th European Conference on Digital Libraries (ECDL 2010).
Cite this article
Westman, S. Evaluation of visual video summaries: user-supplied constructs and descriptions. Int J Digit Libr 11, 125–140 (2010). https://doi.org/10.1007/s00799-011-0071-y
DOI: https://doi.org/10.1007/s00799-011-0071-y