Abstract
Media content distribution systems make extensive use of computational resources, such as disk and network bandwidth. The use of these resources is proportional to the relative popularity of the objects and their level of replication over time. Therefore, understanding request popularity over time can inform system design decisions. In addition, advertisers can target popular objects to maximize their impact.
Workload characterization is especially challenging with user-generated content, such as in YouTube, where popularity is hard to predict a priori and content is uploaded at a very fast rate. In this paper, we consider category as a distinguishing feature of a video and perform an extensive analysis of a snapshot of videos uploaded over two 24-h periods. Our results show significant differences between categories in the first 149 days of the videos’ lifetimes. The lifespan of videos, relative popularity and time to reach peak popularity clearly differentiate between news/sports and music/film. Predicting popularity is a challenging task that requires sophisticated techniques (e.g. time-series clustering). From our analysis, we develop a workload generator that can be used to evaluate caching, distribution and advertising policies. This workload generator matches the empirical data on a number of statistical measurements.
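The per-video measures named in the abstract (time to reach peak popularity, lifespan and relative popularity) can be illustrated with a minimal Python sketch over a series of daily added views. The function names, the tie-breaking rule, the `min_views` threshold and the sample numbers are all assumptions made for illustration; they are not the definitions or data used in the paper.

```python
from typing import Sequence


def time_to_peak(added_views: Sequence[int]) -> int:
    """Return the 0-based day with the most added views (earliest day on ties)."""
    if not added_views:
        raise ValueError("added_views must not be empty")
    return max(range(len(added_views)), key=lambda day: added_views[day])


def first_day_share(added_views: Sequence[int]) -> float:
    """Fraction of all views in the window that arrived on the first day."""
    total = sum(added_views)
    return added_views[0] / total if total else 0.0


def active_lifespan(added_views: Sequence[int], min_views: int = 1) -> int:
    """Days up to and including the last day with at least min_views added views.

    The threshold is a hypothetical choice, not the paper's definition of lifespan.
    """
    for day in range(len(added_views) - 1, -1, -1):
        if added_views[day] >= min_views:
            return day + 1
    return 0


# Made-up daily added-view series, one value per day since upload.
news_like = [500, 120, 30, 5, 0, 0]       # early peak, short tail
music_like = [10, 40, 90, 150, 120, 100]  # later peak, still active

for name, series in (("news-like", news_like), ("music-like", music_like)):
    print(name, time_to_peak(series), active_lifespan(series),
          round(first_day_share(series), 2))
```

Computing such measures per video and grouping them by category is one plausible way to compare categories, though the paper's actual analysis may differ.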
Notes
1. As defined by the uploader.
2. https://developers.google.com/youtube/2.0/reference#YouTube_Category_List. Last accessed: 09-05-13.
3. We used another crawler to collect categories for the videos which remained.
4. Added views is the number of views on a particular day.
5. Sports is 0.99 for the first day’s views and the rest of the measurement period.
Acknowledgements
The authors would like to acknowledge the support of the University of Saskatchewan’s Dean’s Scholarship Program.
Copyright information
© 2014 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chowdhury, S.A., Makaroff, D. (2014). Category-Based YouTube Request Pattern Characterization. In: Krempels, KH., Stocker, A. (eds) Web Information Systems and Technologies. WEBIST 2013. Lecture Notes in Business Information Processing, vol 189. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-44300-2_10
DOI: https://doi.org/10.1007/978-3-662-44300-2_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-44299-9
Online ISBN: 978-3-662-44300-2
eBook Packages: Computer Science (R0)