[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
research-article

U2-Tree: A Universal Two-Layer Distributed Indexing Scheme for Cloud Storage System

Published: 01 February 2019 Publication History

Abstract

The indices in cloud storage systems manage the stored data and support diverse queries efficiently. Secondary index, the index built on the attributes other than the primary key, facilitates a variety of queries for different purposes. An efficient design of secondary indices is called two-layer indexing scheme. It divides indices in the system into the global index layer and the local index layer. However, previous works on two-layer indexing are mainly on a P2P overlay network. In this paper, we propose U2-Tree, a universal two-layer distributed indexing scheme built on data center networks with tree-like topologies. To construct the U2-Tree, we first build local index according to data features and, then, assign potential indexing range of the global index for each host based on the distribution rule of local data. After that, we use several false positives control techniques, including gap elimination and Bloom filter, to publish meta-data about local index to global index host. In the final step, the global index collects published information and uses tree data structures to organize them. In our design, we take advantage of the topological properties of tree-like topologies, introduce and compare detailed optimization techniques in the construction of two-layer indexing scheme. Furthermore, we discuss the index updating, index tuning, and the fault tolerance of U2-Tree. Finally, we validate the effectiveness and efficiency of U2-Tree by giving a series of theoretical analyses and conducting numerical experiments on Amazon EC2 platform.

References

[1]
M. Al-Fares, A. Loukissas, and A. Vahdat, "A scalable, commodity data center network architecture," ACM SIGCOMM Comput. Commun. Rev., vol. 38, no. 4, pp. 63-74, 2008.
[2]
A. Greenberg et al., "VL2: A scalable and flexible data center network," ACM SIGCOMM Comput. Commun. Rev., vol. 39, no. 4, pp. 51-62, 2009.
[3]
M. Walraed-Sullivan, A. Vahdat, and K. Marzullo, "Aspen trees: Balancing data center fault tolerance, scalability and cost," in Proc. 9th ACM Conf. Emerg. Netw. Exp. Technol. (CoNEXT), 2013, pp. 85-96.
[4]
V. Liu, D. Halperin, A. Krishnamurthy, and T. E. Anderson, "F10: A fault-tolerant engineered network," in Proc. NSDI, 2013, pp. 399-412.
[5]
Y. Sun, J. Chen, Q. Lu, and W. Fang, "Diamond: An improved fat-tree architecture for large-scale data centers," J. Commun., vol. 9, no. 1, pp. 91-98, 2014.
[6]
G. Chen, Y. Zhao, D. Pei, and D. Li, "Rewiring 2 links is enough: Accelerating failure recovery in production data center networks," in Proc. IEEE 35th Int. Conf. Distrib. Comput. Syst. (ICDCS), Jun./Jul. 2015, pp. 569-578.
[7]
R. Niranjan Mysore et al., "Portland: A scalable fault-tolerant layer 2 data center network fabric," ACM SIGCOMM Comput. Commun. Rev., vol. 39, no. 4, pp. 39-50, 2009.
[8]
N. Farrington and A. Andreyev, "Facebook's data center network architecture," in Proc. Opt. Interconnects Conf. (OI), May 2013, pp. 49-50.
[9]
A. Singh et al., "Jupiter rising: A decade of clos topologies and centralized control in Google's datacenter network," in Proc. ACM SIGCOMM Comput. Commun. Rev., vol. 45, no. 4, pp. 183-197, 2015.
[10]
D. Abts, M. R. Marty, P. M. Wells, P. Klausler, and H. Liu, "Energy proportional datacenter networks," ACM SIGARCH Comput. Archit. News, vol. 38, no. 3, pp. 338-347, 2010.
[11]
A. Singla, C.-Y. Hong, L. Popa, and P. B. Godfrey, "Jellyfish: Networking data centers, randomly," in Proc. NSDI, vol. 12, 2012, pp. 1-6.
[12]
C. Guo et al., "BCube: A high performance, server-centric network architecture for modular data centers," ACM SIGCOMM Comput. Commun. Rev., vol. 39, no. 4, pp. 63-74, 2009.
[13]
C. Guo et al., "DCell: A scalable and fault-tolerant network structure for data centers," ACM SIGCOMM Comput. Commun. Rev., vol. 38, no. 4, pp. 75-86, 2008.
[14]
D. Guo et al., "BCN: Expansible network structures for data centers using hierarchical compound graphs," in Proc. IEEE INFOCOM, Apr. 2011, pp. 61-65.
[15]
D. Guo, "Expandable and cost-effective network structures for data centers using dual-port servers," IEEE Trans. Comput., vol. 62, no. 7, pp. 1303-1317, Jul. 2013.
[16]
Z. Li and Y. Yang, "ABCCC: An advanced cube based network for data centers," in Proc. IEEE 35th Int. Conf. Distrib. Comput. Syst. (ICDCS), Jun./Jul. 2015, pp. 547-556.
[17]
D. Li and J. Wu, "On data center network architectures for interconnecting dual-port servers," IEEE Trans. Comput., vol. 64, no. 11, pp. 3210-3222, Nov. 2015.
[18]
F. Li, W. Liang, X. Gao, B. Yao, and G. Chen, "Efficient R-tree based indexing for cloud storage system with dual-port servers," in Database and Expert Systems Applications. Berlin, Germany: Springer, 2014, pp. 375-391.
[19]
X. Gao et al., "FT-INDEX: A distributed indexing scheme for switch-centric cloud storage system," in Proc. IEEE Int. Conf. Commun. (ICC), Jun. 2015, pp. 301-306.
[20]
T. Chen, X. Gao, and G. Chen, "The features, hardware, and architectures of data center networks: A survey," J. Parallel Distrib. Comput., vol. 96, pp. 45-74, Oct. 2016.
[21]
J. Wang, S. Wu, H. Gao, J. Li, and B. C. Ooi, "Indexing multi-dimensional data in a cloud system," in Proc. ACM SIGMOD Int. Conf. Manage. Data, Jun. 2010, pp. 591-602.
[22]
S. Wu, D. Jiang, B. C. Ooi, and K.-L. Wu, "Efficient B-tree based indexing for cloud data processing," in Proc. VLDB Endowment, 2010, vol. 3, nos. 1-2, pp. 1207-1218.
[23]
G. Chen, H. T. Vo, S. Wu, B. C. Ooi, and M. T. Özsu, "A framework for supporting DBMS-like indexes in the cloud," in Proc. VLDB Endowment, 2011, vol. 4, no. 11, pp. 702-713.
[24]
Y. Hong et al., "Efficient R-tree based indexing scheme for server-centric cloud storage system," IEEE Trans. Knowl. Data Eng., vol. 28, no. 6, pp. 1503-1517, Jun. 2016.
[25]
L. Gao, Y. Zhang, X. Gao, and G. Chen, "Indexing multi-dimensional data in modular data centers," in Database and Expert Systems Applications. Berlin, Germany: Springer, 2015, pp. 304-319.
[26]
Y. Zhang, J. Cao, X. Gao, and G. Chen, "FR-Index: A multi-dimensional indexing framework for switch-centric data centers," in Database and Expert Systems Applications. Berlin, Germany: Springer, 2016.
[27]
Y. Liu, X. Gao, and G. Chen, "Design and optimization for distributed indexing scheme in switch-centric cloud storage system," in Proc. IEEE Symp. Comput. Commun. (ISCC), Jul. 2015, pp. 582-587.
[28]
R. Zhang, J. Qi, M. Stradling, and J. Huang, "Towards a painless index for spatial objects," ACM Trans. Database Syst., vol. 39, no. 3, p. 19, 2014.
[29]
B. H. Bloom, "Space/time trade-offs in hash coding with allowable errors," Commun. ACM, vol. 13, no. 7, pp. 422-426, 1970.
[30]
H. Edelsbrunner, Dynamic Data Structures for Orthogonal Intersection Queries (Forschungsberichte). Inst. f. Informationsverarbeitung, TU Graz, 1980. [Online]. Available: https://books.google.co.jp/books?id=iSGHPgAACAAJ
[31]
E. M. McCreight, "Efficient algorithms for enumerating intersecting intervals and rectangles," Xerox Paolo Alto Res. Center, Palo Alto, CA, USA, Tech. Rep. CSL-80-9, 1980.
[32]
J. L. Bentley, "Solutions to Klee's rectangle problems," Carnegie Mellon Univ., Pittsburgh, PA, USA, Tech. Rep., 1977.
[33]
E. M. McCreight, "Priority search trees," SIAM J. Comput., vol. 14, no. 2, pp. 257-276, 1985.
[34]
S. Wu and K.-L. Wu, "An indexing framework for efficient retrieval on the cloud," IEEE Data Eng. Bull., vol. 32, no. 1, pp. 75-82, Jan. 2009.
[35]
M. Mitzenmacher, "Compressed Bloom filters," IEEE/ACM Trans. Netw., vol. 10, no. 5, pp. 604-612, Oct. 2002.
[36]
G. K. Zipf, Selected Studies of the Principle of Relative Frequency in Language. Cambridge, MA, USA: Harvard Univ. Press, 1932.

Cited By

View all
  • (2022)MetisRL: A Reinforcement Learning Approach for Dynamic Routing in Data Center NetworksDatabase Systems for Advanced Applications10.1007/978-3-031-00126-0_44(615-622)Online publication date: 11-Apr-2022
  • (2019)An efficient and scalable multi-dimensional indexing scheme for modular data centersData & Knowledge Engineering10.1016/j.datak.2019.101729123:COnline publication date: 1-Sep-2019

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image IEEE/ACM Transactions on Networking
IEEE/ACM Transactions on Networking  Volume 27, Issue 1
February 2019
460 pages

Publisher

IEEE Press

Publication History

Published: 01 February 2019
Published in TON Volume 27, Issue 1

Qualifiers

  • Research-article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)6
  • Downloads (Last 6 weeks)0
Reflects downloads up to 04 Jan 2025

Other Metrics

Citations

Cited By

View all
  • (2022)MetisRL: A Reinforcement Learning Approach for Dynamic Routing in Data Center NetworksDatabase Systems for Advanced Applications10.1007/978-3-031-00126-0_44(615-622)Online publication date: 11-Apr-2022
  • (2019)An efficient and scalable multi-dimensional indexing scheme for modular data centersData & Knowledge Engineering10.1016/j.datak.2019.101729123:COnline publication date: 1-Sep-2019

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media