Abstract
A Web community, a significant pattern of the Web, is formed by a group of pages focusing on a common topic. Web communities can be located through complete bipartite graphs (CBGs, also known as community cores). Recent work has tried to determine the community structure of the Web by extracting CBGs; however, CBGs alone are far from real communities. Focusing on the problem of automatically determining the ideal sizes of Web communities, we take community cores as the initial condition from which to retrieve complete community structures. With all CBGs available, a two-step heuristic algorithm is proposed to identify Web communities. First, sketches of communities are drawn by gradually merging overlapping community cores. Then, the communities are completed by extending them to include highly referred members. Experiments on large, real data collections demonstrate that the proposed algorithm effectively identifies communities that satisfy two properties: (1) the relationships among members within a community are close; (2) the connections across the boundaries between communities are sparse.
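The two steps described above can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the algorithm from the paper: each core is reduced to a plain set of page identifiers, and the function names `merge_cores` and `extend_community`, the overlap threshold `min_shared`, and the reference threshold `min_refs` are all hypothetical choices made for this example.

```python
from typing import Dict, List, Set


def merge_cores(cores: List[Set[str]], min_shared: int = 1) -> List[Set[str]]:
    """Step 1 (sketch): repeatedly merge page sets sharing >= min_shared pages.

    Assumption: a core is given as the set of all its pages; min_shared is a
    hypothetical overlap threshold, not a value taken from the paper.
    """
    communities = [set(core) for core in cores]
    merged = True
    while merged:
        merged = False
        for i in range(len(communities)):
            for j in range(i + 1, len(communities)):
                if len(communities[i] & communities[j]) >= min_shared:
                    communities[i] |= communities[j]
                    del communities[j]
                    merged = True
                    break
            if merged:
                break  # restart the scan after every merge
    return communities


def extend_community(
    community: Set[str],
    out_links: Dict[str, Set[str]],
    min_refs: int = 2,
) -> Set[str]:
    """Step 2 (sketch): add outside pages linked to by >= min_refs members.

    Assumption: "highly referred" is approximated by a fixed count of
    in-links from current members; min_refs is a hypothetical threshold.
    """
    extended = set(community)
    while True:
        counts: Dict[str, int] = {}
        for member in extended:
            for target in out_links.get(member, set()):
                if target not in extended:
                    counts[target] = counts.get(target, 0) + 1
        new_members = {page for page, n in counts.items() if n >= min_refs}
        if not new_members:
            return extended
        extended |= new_members


# Toy data: two overlapping cores and one isolated core.
cores = [{"a", "b", "x"}, {"b", "c", "y"}, {"p", "q"}]
out_links = {"a": {"z"}, "c": {"z"}, "p": {"w"}}

sketches = merge_cores(cores)
completed = [extend_community(s, out_links) for s in sketches]
```

In this toy input, the first two cores share page `b` and are merged, and page `z` joins the merged community because two of its members link to it. Page `w` is referred to by only one member of the second community and stays outside.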
This work was partially supported by NSFC under grant Nos. 60873180 and 61070016, by SRF for ROCS, State Education Ministry, and by the Fundamental Research Funds (DUT10JR02, #1600-893313) for the Central Universities, China.
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Zhang, X., Wang, L., Li, Y., Liang, W. (2011). Detection of Web Communities from Community Cores. In: Chiu, D.K.W., et al. Web Information Systems Engineering – WISE 2010 Workshops. WISE 2010. Lecture Notes in Computer Science, vol 6724. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24396-7_28
DOI: https://doi.org/10.1007/978-3-642-24396-7_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24395-0
Online ISBN: 978-3-642-24396-7
eBook Packages: Computer Science, Computer Science (R0)