Abstract
Social networking services support millions of users who interact with one another on a regular basis and generate substantial amounts of data. Due to the inherently distributed structure of such networks and the possible remoteness of the users, the data involved must be partitioned into shards and distributed over a number of servers. One of the most important functionalities of a social networking platform is to process queries related, not only to a given users data but also to the users acquaintances. This suggests that a competent sharding algorithm for a distributed social database must make use of the social network’s topology. We describe algorithms that utilize the structure of social networks to prepare shards that result in better query performance, lower network utilization and better load balancing.
Similar content being viewed by others
References
Banerjee A, Basu S (2008) A social query model for decentralized search. In: Proceedings of the 2nd workshop on social network mining and analysiss, vol 124. ACM, New York
Dataset (2013) Twitter Dataset. https://github.com/gephi/gephi/wiki/Datasets. Accessed 13 Jan 2015
Duong Q, Goel S, Hofman J, Vassilvitskii S (2013) Sharding social networks. In: Proceedings of the sixth ACM international conference on Web search and data mining. ACM, pp 223–232
Hagberg AA, Schult DA, Swart PJ (2008) Exploring network structure, dynamics, and function using NetworkX. In: Proceedings of the 7th Python in science conference (SciPy2008), Pasadena, CA, USA, pp 11–15
Horowitz D, Kamvar SD (2010) The anatomy of a large-scale social search engine. In: Proceedings of the 19th international conference on World wide web. ACM, pp 431–440
Jain R, Durresi A, Babic G (1999) Throughput fairness index: an explanation. Technical report, Department of CIS, The Ohio State University
Karagiannis T, Gkantsidis C, Narayanan D, Rowstron A (2010) Hermes: clustering users in large-scale e-mail services. In: Proceedings of the 1st ACM symposium on cloud computing. ACM, pp 89–100
Karp RM (1972) Reducibility among combinatorial problems. In: Miller RE, Thatcher JW, Bohlinger JD (eds) Complexity of computer computations. Springer, New York, pp 85–103
Karypis G, Kumar V (1995) Metis—unstructured graph partitioning and sparse matrix ordering system, version 2.0. Technical report
Karypis G, Kumar V (1998) Multilevelk-way partitioning scheme for irregular graphs. J Parallel Distrib Comput 48(1):96–129
Leskovec J, Huttenlocher D, Kleinberg J (2010) Signed networks in social media. In: Proceedings of the SIGCHI conference on human factors in computing system. ACM, pp 1361–1370
Pujol JM, Erramilli V, Siganos G, Yang X, Laoutaris N, Chhabra P, Rodriguez P (2011) The little engine(s) that could: scaling online social networks. ACM SIGCOMM Comput Commun Rev 41(4):375–386
Tarjan R (1972) Depth-first search and linear graph algorithms. SIAM J Comput 1(2):146–160
Tomita E, Tanaka A, Takahashi H (2006) The worst-case time complexity for generating all maximal cliques and computational experiments. Theor Comput Sci 363(1):28–42
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Bhat, P.T., Thankachan, R.V. & Chandrasekaran, K. Sharding distributed social databases using social network analysis. Soc. Netw. Anal. Min. 5, 31 (2015). https://doi.org/10.1007/s13278-015-0274-0
Received:
Revised:
Accepted:
Published:
DOI: https://doi.org/10.1007/s13278-015-0274-0