Identifying website communities in mobile internet based on affinity measurement

Published: 01 March 2014

Abstract

With the rapid development of mobile devices and wireless technologies, mobile internet websites play an essential role in delivering networked services in our daily life. Identifying website communities in the mobile internet is therefore of theoretical and practical significance for optimizing network resources and improving user experience. Existing solutions, however, are limited to retrieving website communities based on hyperlink structure and content similarity, and the relationship between user behavior and community structure is far from understood. In this paper, we develop a three-step algorithm that extracts communities using an affinity measurement derived from user access information. Through experimental evaluation on massive, detailed HTTP traffic records captured from a cellular core network by high-performance monitoring devices, we show that our affinity-measurement-based method is effective in identifying hidden website communities in the mobile internet that have evaded previous link-based and content-based approaches.
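
The abstract does not define the affinity measurement or spell out the three steps, so the following is only an illustrative sketch of the general idea of grouping websites by shared user access, not the authors' algorithm. It assumes, purely for illustration, that affinity between two sites is the Jaccard overlap of the sets of users who accessed them, and that communities are the connected components of the graph whose edges have affinity at or above a threshold. The function names, the threshold value, and the sample records are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def site_affinity(records):
    """Map each unordered site pair to the Jaccard overlap of its user sets.

    ASSUMPTION: Jaccard overlap stands in for the paper's affinity measure,
    which the abstract does not define.
    """
    users_by_site = defaultdict(set)
    for user, site in records:
        users_by_site[site].add(user)
    affinity = {}
    for a, b in combinations(sorted(users_by_site), 2):
        shared = users_by_site[a] & users_by_site[b]
        if shared:
            union = users_by_site[a] | users_by_site[b]
            affinity[(a, b)] = len(shared) / len(union)
    return affinity


def communities(records, threshold=0.5):
    """Group sites linked by affinity >= threshold (connected components).

    ASSUMPTION: thresholding plus connected components is a stand-in for
    the paper's community extraction step.
    """
    sites = sorted({site for _, site in records})
    parent = {s: s for s in sites}

    def find(s):
        # Union-find lookup with path halving.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for (a, b), score in site_affinity(records).items():
        if score >= threshold:
            parent[find(a)] = find(b)

    groups = defaultdict(list)
    for s in sites:
        groups[find(s)].append(s)
    return sorted(groups.values())


# Hypothetical (user, site) access records, e.g. distilled from HTTP logs.
records = [
    ("u1", "news.example"), ("u1", "sports.example"),
    ("u2", "news.example"), ("u2", "sports.example"),
    ("u3", "shop.example"), ("u3", "pay.example"),
    ("u4", "shop.example"), ("u4", "pay.example"), ("u4", "news.example"),
]
print(communities(records))
```

In this toy input the two groups share no hyperlinks or content by construction; they emerge only from which users visit which sites, which is the kind of structure the abstract says link-based and content-based methods miss.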

Published In

Computer Communications, Volume 41
March 2014
94 pages

Publisher

Elsevier Science Publishers B. V.

Netherlands

Author Tags

  1. Affinity measurement
  2. Degree distribution
  3. Graph theory
  4. Website community

Qualifiers

  • Article

Cited By

  • (2018) Exploration of Web Page Structural Patterns Based on Request Dependency Graph Decomposition. International Journal of Digital Crime and Forensics 8:4 (1-13). DOI: 10.4018/IJDCF.2016100101. Online publication date: 13-Dec-2018
  • (2018) Revealing connectivity structural patterns among web objects based on co-clustering of bipartite request dependency graph. Wireless Networks 24:2 (439-451). DOI: 10.1007/s11276-016-1345-5. Online publication date: 1-Feb-2018
  • (2016) Analysis of topology dynamics for unstructured P2P networks. Computer Communications 80:C (72-81). DOI: 10.1016/j.comcom.2016.01.009. Online publication date: 15-Apr-2016
  • (2016) Context-aware Android applications through transportation mode detection techniques. Wireless Communications & Mobile Computing 16:16 (2523-2541). DOI: 10.1002/wcm.2702. Online publication date: 1-Nov-2016