Only overlay text: novel features for TV news broadcast video segmentation

Raghvendra Kannao ORCID: orcid.org/0000-0003-2083-2560¹,
Prithwijit Guha¹ &
Bidyut B. Chaudhuri²

Abstract

Segmentation of television news videos into programs and stories (after removing advertisements) is a necessary first step for news broadcast analysis. Existing methods have used manually defined presentation styles as an important feature for such segmentation. Manually defined presentation styles make algorithms channel specific and hamper scalability to a large number of channels. In this work, we advocate the usability of overlay text for automatic characterization of broadcast presentation styles. This automatic characterization minimizes the manual intervention required in developing scalable solutions for television news broadcast segmentation. To this end, we introduce three novel features derived solely from the position and content of overlay text bands: Bag of Bands (BoB), BoB Templates (BoBT) and Text-based Semantic Similarity (TSS). The BoB features characterize the on-screen distribution of text bands and are used with classifiers for advertisement detection. The BoBT features characterize the co-occurrence of text bands, thereby modeling the presentation styles of video shots. Sequences of BoBT features are modeled using Conditional Random Fields (CRFs) for identifying program boundaries. Sequences of features derived from the semantic similarity (TSS) between consecutive shots, together with BoBT features, are used with CRFs for story segmentation. The performance of the proposed features is validated on 360 hours of broadcast data recorded from three Indian English news channels. Benchmarking against baseline methods shows better performance for our proposal.
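The abstract does not specify how the features are computed, so the following is only a rough illustrative sketch and not the authors' method. It assumes that a position-based feature can be approximated by a normalized histogram of the vertical centers of text bands, and that text similarity between consecutive shots can be approximated by cosine similarity over word counts. The function names, the (top, bottom) band representation and the bin count are all hypothetical.

```python
from collections import Counter
from math import sqrt


def band_histogram(bands, frame_height, n_bins=8):
    """Normalized histogram of text-band vertical centers.

    Assumption: each band is a (top, bottom) pixel pair; the frame is
    split into n_bins horizontal strips. This is a stand-in for a
    position-based feature, not the paper's BoB definition.
    """
    hist = [0.0] * n_bins
    for top, bottom in bands:
        center = (top + bottom) / 2.0
        # Clamp so a band touching the bottom edge falls in the last bin.
        idx = min(int(center / frame_height * n_bins), n_bins - 1)
        hist[idx] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def cosine_similarity(text_a, text_b):
    """Cosine similarity between word-count vectors of two overlay texts.

    Assumption: whitespace tokenization and raw counts; a stand-in for a
    text-based similarity between consecutive shots, not the paper's TSS.
    """
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0
```

In a sketch like this, a low similarity between the overlay text of two consecutive shots would hint at a possible story change, while the histogram summarizes where text appears on screen. Per-shot values of this kind could then be fed to a sequence model such as a CRF, as the abstract describes.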

Notes

A manual analysis of our dataset reveals text IUs to be about 87% of total IUs
Supplementary material can be accessed using http://tiny.cc/boTB

Author information

Authors and Affiliations

Department of Electronics and Electrical Engineering, Indian Institute of Technology Guwahati, North Guwahati, Assam, 781039, India
Raghvendra Kannao & Prithwijit Guha
Indian Statistical Institute Kolkata, Kolkata, West Bengal, 700108, India
Bidyut B. Chaudhuri

Corresponding author

Correspondence to Raghvendra Kannao.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Kannao, R., Guha, P. & Chaudhuri, B. Only overlay text: novel features for TV news broadcast video segmentation. Multimed Tools Appl 81, 30493–30517 (2022). https://doi.org/10.1007/s11042-022-12917-w

Received: 05 September 2020
Revised: 20 February 2021
Accepted: 09 March 2022
Published: 06 April 2022
Issue Date: September 2022
DOI: https://doi.org/10.1007/s11042-022-12917-w
