research-article

A Framework for supporting DBMS-like indexes in the cloud

Published: 01 August 2011

Abstract

To support "Database as a Service" (DaaS) in the cloud, the database system is expected to provide functionality similar to that of a centralized DBMS, such as efficient processing of ad hoc queries. The system must therefore support DBMS-like indexes, possibly several for each table, to quickly locate data distributed over the network. In such a distributed environment, the indexes themselves have to be distributed over the network to achieve scalability and reliability, with each cluster node maintaining a subset of the index data. As in a conventional DBMS, indexes incur maintenance overhead, and the problem is more complex in the distributed environment because the data are typically partitioned and distributed based on a subset of attributes. Further, distributing the indexes is not straightforward, which raises the question of scalability in terms of data volume, network size, and number of indexes.
In this paper, we examine the problem of providing DBMS-like indexing mechanisms in cloud DaaS and propose an extensible yet simple and efficient indexing framework that enables users to define their own indexes without knowing the structure of the underlying network. The framework is also designed to make hopping between cluster nodes during index traversal efficient and to reduce the maintenance cost of indexes. We implement three common indexes, namely distributed hash indexes, distributed B+-tree-like indexes, and distributed multi-dimensional indexes, to demonstrate the usability and effectiveness of the framework. We conduct experiments on Amazon EC2 and an in-house cluster to verify the efficiency and scalability of the framework.
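
The idea that each cluster node maintains a subset of the index data can be illustrated with a minimal in-process sketch of a hash-partitioned index. This is not the framework proposed in the paper: the names IndexNode and DistributedHashIndex, the modulo-based placement, and the sample keys are all assumptions made for illustration, and a real deployment would route requests over the network rather than call local objects.

```python
import hashlib
from collections import defaultdict


class IndexNode:
    """Hypothetical cluster node holding a subset of the index entries."""

    def __init__(self, name):
        self.name = name
        self.entries = defaultdict(set)  # index key -> set of record ids

    def put(self, key, record_id):
        self.entries[key].add(record_id)

    def get(self, key):
        # .get avoids creating an empty entry for keys that are absent
        return set(self.entries.get(key, ()))


class DistributedHashIndex:
    """Illustrative hash index: each key is owned by exactly one node."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def _owner(self, key):
        # Assumed placement rule: stable hash of the key modulo node count.
        digest = hashlib.sha1(repr(key).encode("utf-8")).digest()
        return self.nodes[int.from_bytes(digest[:8], "big") % len(self.nodes)]

    def insert(self, key, record_id):
        self._owner(key).put(key, record_id)

    def lookup(self, key):
        # An exact-match lookup contacts only the node that owns the key.
        return self._owner(key).get(key)


nodes = [IndexNode(f"node-{i}") for i in range(4)]
index = DistributedHashIndex(nodes)
index.insert("alice@example.com", 17)
index.insert("alice@example.com", 42)
index.insert("bob@example.com", 23)
print(sorted(index.lookup("alice@example.com")))  # prints [17, 42]
```

Simple modulo placement keeps the sketch short but would remap most keys whenever the number of nodes changes; it also supports only exact-match lookups, which is one reason range and multi-dimensional queries call for other index structures such as the B+-tree-like and multi-dimensional indexes the abstract mentions.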

Published In

Proceedings of the VLDB Endowment, Volume 4, Issue 11
August 2011
520 pages

Publisher

VLDB Endowment

