[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
article

LH*RS---a highly-available scalable distributed data structure

Published: 01 September 2005 Publication History

Abstract

LH*RS is a high-availability scalable distributed data structure (SDDS). An LH*RS file is hash partitioned over the distributed RAM of a multicomputer, for example, a network of PCs, and supports the unavailability of any k ≥ 1 of its server nodes. The value of k transparently grows with the file to offset the reliability decline. Only the number of the storage nodes potentially limits the file growth. The high-availability management uses a novel parity calculus that we have developed, based on Reed-Salomon erasure correcting coding. The resulting parity storage overhead is about the lowest possible. The parity encoding and decoding are faster than for any other candidate coding we are aware of. We present our scheme and its performance analysis, including experiments with a prototype implementation on Wintel PCs. The capabilities of LH*RS offer new perspectives to data intensive applications, including the emerging ones of grids and of P2P computing.

Supplementary Material

Litwin Appendix (p769-litwin-apndx.pdf)
Online appendix to designing mediation for context-aware applications. The appendix supports the information on page 769.

References

[1]
Alvarez, G., Burkhard, W., and Cristian, F. 1997. Tolerating multiple failures in RAID Architecture with Optimal Storage and Uniform Declustering. In International Symposium on Computer Architecture, ISCA-97, 62--72.]]
[2]
Anderson, D. and Kubiatowicz, J. 2002. The Worldwide Computer. In Scientific American 286, 3, March.]]
[3]
The Boxwood Project. http://research.microsoft.com/research/sv/Boxwood/.]]
[4]
Bartalos, G. 1999. Internet: D-day at eBay. Yahoo INDIVIDUAL INVESTOR ONLINE, (July 19).]]
[5]
Bertino, E., Ooi, B. C., Sacks-Davis, R., Tan, K. L., Zobel, J., Shidlovsky, B., and Catania, B. 1999. Indexing Techniques for Advanced Database Systems. Kluver.]]
[6]
Bennour, F., Di&enegrave;, A., Ndiaye, Y., and Litwin, W. 2000. Scalable and distributed linear hashing LH*LH under Windows NT. In SCI-2000 (Systemics, Cybernetics, and Informatics), Orlando, Florida.]]
[7]
Bennour, F. 2002. Performance of the SDDS LH*LH under SDDS-2000. In Distributed Data and Structures 4 (Proceedings of WDAS 2002), Carleton Scientific, 1--12.]]
[8]
Burkhard, W. A. and Menon, J. 1993. Disk array storage system reliability. In Proceedings of the 22nd International Symposium on Fault Tolerant Computing, Toulouse, 432--441.]]
[9]
Ben-Gan, I., and Moreau, T. Advanced Transact-SQL For SQL Server 2000. 2003. Apress, ISBN 1-8931115-82-8.]]
[10]
Breitbart, Y., Vingralek, R., and Weikum, G. 1996. Load control in scalable distributed file structures. Distributed and Parallel Databases 4, 4, 319--354.]]
[11]
Breitbart, Y. and Vingralek, R. 1998. Addressing and balancing issues in distributed B+ trees. In 1st Workshop on Distributed Data and Structures (WDAS '98), Carleton-Scientific.]]
[12]
Com. ACM. 1997. Special Issue on High-Performance Computing (Oct).]]
[13]
CERIA Home page: http://ceria.dauphine.fr/]]
[14]
http://www.contingencyplanningresearch.com/cod.htm. 1996. Cost of a downtime Study.]]
[15]
Dingledine, R., Freedman, M., and Molnar, D. 2000. The free haven project: Distributed anonymous storage service. Workshop on Design Issues in Anonymity and Unobservability (July).]]
[16]
Donoghue, A. Boldly 2003. Googling into the future. http://insight.zdnet.co.uk/internet/ecommerce/0,39020454,39116781,00.htm]]
[17]
Economist 2003. Moving up the stack. www.economist.com.May.]]
[18]
Gribble, S., Brewer, E., A., Hellerstein, J., and Culler, D. 2000. Scalable, Distributed Data Structures for Internet Service Construction. 4th Symposium on Operating Systems Design and Implementation (OSDI'00).]]
[19]
Gray, J., Szalay, A. S., Ihakar, A., Kunszt, P. S., Stoughton, C., Slutz, D. R., and van den Berg, J. 2002. Data Mining of SDDS SkyServer Database. International Workshop on Distributed Data Structures, (WDAS'02), Carleton Scientific.]]
[20]
Haskin, R. and Schmuck, F. 1996. The Tiger Shark File System. COMPCON-96, 1996.]]
[21]
Hellerstein, L., Gibson, G., Karp, R., Katz, R., and Patterson, R. 1994. Coding techniques for handling failures in large disk arrays. Algorithmica, vol. 12, p. 182--208.]]
[22]
Knuth, D. 1998. The art of computer programming. Vol. 3 Sorting and searching. 2nd Ed. Addison-Wesley, 780.]]
[23]
Kubiatowicz, J. 2003. Extracting guarantees from chaos. In Communications of the ACM, 46, 2, Feb.]]
[24]
Karlson, J., Litwin, W., and Risch, T. 1996. LH*LH: A scalable high performance data structure for switched multicomputers. In Apers, P., Gardarin, G., Bouzeghoub, M., (eds.) Extending Database Technology, EDBT96, Lecture Notes in Computer Science, vol. 1057. Springer Verlag.]]
[25]
Lindberg, R. 1997. A Java Implementation of a Highly Available Scalable and Distributed Data Structure LH*g. Master Th. LiTH-IDA-Ex-97/65. U. Linkoping, 1997/62.]]
[26]
Litwin, W. 1994. Linear hashing: A new tool for file and table addressing. Reprinted from VLDB80 in Readings in Databases, edited by M. Stonebraker, 2nd Edition, Morgan Kaufmann Publishers.]]
[27]
Litwin, W. 1980. Linear hashing: A new algorithm for files and tables addressing. International Conference on Databases. Aberdeen, Heyden, p. 260--275.]]
[28]
Litwin, W., Neimat, M.-A. Levy, G., Ndiaye, S., and Seck, T. 1997. LH*S: A high-availability and high-security Scalable Distributed Data Structure. IEEE-Res. Issues in Data Eng. (RIDE-97).]]
[29]
Litwin, W., Menon J., and Risch, T. 1998. LH* with Scalable Availability. IBM Almaden Res. Rep. RJ 10121 (91937), (May).]]
[30]
Litwin, W., Menon, J., Risch, T., and Schwarz, T. 1999. Design Issues For Scalable Availability LH* Schemes with Record Grouping. DIMACS Workshop on Distributed Data and Structures, Princeton U. Carleton Scientific.]]
[31]
Litwin, W., Moussa, R., and Schwarz, T., 2004a. LH*RS: A Highly Available Distributed Data Storage System. Research Prototype Demonstration. VLDB Toronto.]]
[32]
Litwin, W., Moussa, R., and Schwarz, T. 2004b. LH*RS: A Highly Available Distributed Data Storage System. CERIA Tech. Rep. (Dec).]]
[33]
Litwin, W., Neimat, M.-A., and Schneider, D. 1993. Linear Hashing for Distributed Files. ACM-SIGMOD International Conference on Management of Data.]]
[34]
Litwin, W., Neimat, M.-A., and Schneider, D. 1996. A Scalable Distributed Data Structure. ACM Trans. Datab. Syst., Dec.]]
[35]
Litwin, W. and Neimat, M.-A. 1996. High-Availability LH* Schemes with Mirroring, International Conference on Cooperating Information Systems, (COOPIS) IEEE Press.]]
[36]
Litwin, W. and Risch, T. 1997. LH*g: A High-availability Scalable Distributed Data Structure through Record Grouping. Res. Rep. CERIA, U. Dauphine and U. Linkoping (May).]]
[37]
Litwin, W. and Risch, T. 2001. LH*g: A high-availability scalable distributed data structure by record grouping. IEEE Trans. Knowl. Data Eng. 14, 4, 923--927.]]
[38]
Litwin, W. and Schwarz T. 2000. LH*RS: A high-availability scalable distributed data structure using Reed Solomon codes. ACM-SIGMOD International conference on Management of Data.]]
[39]
Litwin, W. and Risch, T. 2002. LH*g : A High-availability scalable distributed data structure by record grouping. IEEE Trans. Knowl. Data Eng. 14, 4, July/Aug.]]
[40]
Ljungstr&omuml;, M. 2000. Implementing LH*RS: A scalable distributed highly-available data structure, Master Thesis, Feb., CS Dep. U. Linkoping, Sweden.]]
[41]
Luby, M., Mitzenmacher, M., Shokrollahi, M., Spielman, D., and Stemann, V. 1997. Practical Loss-Resilient Codes, STOC 97, Proceedings of the twenty-ninth annual ACM Symposium on Theory of Computing, El Paso, TX, 150--159.]]
[42]
MaCwilliams, F. J. and Sloane, N. J. A. 1997. The Theory of Error Correcting Codes. Elsevier/North Holland, Amsterdam.]]
[43]
Moussa, R. 2003. In Distributed Data and Structures 4, Carleton Scientific (Records of WDAS 2002, Paris).]]
[44]
Moussa, R. 2004. Experimental Performance Analysis of LH*RS. CERIA Res. Rep. {CERIA}.]]
[45]
Moussa, R. and Litwin, W. 2002. Experimental performance analysis of LH*RS parity management. Distributed Data and Structures 4, Records of the 4th International Meeting (WDAS 2002), Paris, France.]]
[46]
Pâris, J. F. 1993. The management of replicated data. In Proceedings of the Workshop on Hardware and Software Architectures for Fault Tolerance. Mt. St. Michel, Fr. June.]]
[47]
Ramakrishnan, R. 1999. Database Management Systems. McGraw Hill.]]
[48]
RFC 793---Transmission Control Protocol http://www.faqs.org/rfcs/rfc793.html]]
[49]
Sabaratnam M., Torbjornsen, and Hvasshovd, S.-O. 1999. Evaluating the effectiveness of fault tolerance in replicated database management systems. 29th. Annual Interantional Symposium on Fault Tolerant Computing.]]
[50]
Schwarz, T. 2003. Generalized Reed Solomon Codes for Erasure Correction in SDDS. Workshop on Distributed Data and Structure 4, WDAS-4, Paris. Carleton Scientific.]]
[51]
SDDS-Bibliography. http://192.134.119.81/SDDS-bibliograhie.html, http://ceria.dauphine.fr/witold.html]]
[52]
Vingralek, R., Breitbart, Y., Weikum, G. and Snowball. 1998. Scalable storage on networks of workstations. Distributed and Parallel Databases 6, 2, 117--156.]]
[53]
Weatherspoon, H. and Kubiatowicz, J. 2002. Erasure coding vs. replication: A quantitative comparison. 1st International Workshop on Peer-to-Peer systems, IPTPS-2002. March.]]
[54]
Xin, Q., Miller, E., Schwarz, T., Brandt, S., Long, D., Litwin, W. 2003. Reliability mechanisms for very large storage systems. 20th IEEE mass storage systems and technologies (MSST 2003), San Diego, CA. 146--156.]]
[55]
Xin, Q., Miller, E., and Schwarz, T. 2004. Evaluation of distributed recovery in large-scale storage systems. In 13th IEEE International Symposium on High Performance Distributed Computing (HPDC'04), Honolulu, HI.]]

Cited By

View all
  • (2024)ELECTProceedings of the 22nd USENIX Conference on File and Storage Technologies10.5555/3650697.3650715(293-310)Online publication date: 27-Feb-2024
  • (2024)A Highly Available Database System Based on Load Balancing Functionality2024 IEEE 2nd International Conference on Image Processing and Computer Applications (ICIPCA)10.1109/ICIPCA61593.2024.10709101(841-844)Online publication date: 28-Jun-2024
  • (2023)Benchmarking Hashing Algorithms for Load Balancing in a Distributed Database EnvironmentAdvances in Model and Data Engineering in the Digitalization Era10.1007/978-3-031-23119-3_8(105-118)Online publication date: 10-Jan-2023
  • Show More Cited By

Recommendations

Reviews

Jesus Villadangos

This paper continues the authors' research on scalable distributed data structures. In this work, the authors focus their attention on data availability, and extend their previous proposal, LH*, to provide high availability. An LH* server can be unavailable, in which case access to data becomes impossible; this may not be acceptable for an application. The paper considers two issues: algorithm performance and high availability. On one side, the proposal applies Reed-Solomon codes with a Galois field (GF) (216) for parity calculus. It refines the parity matrix, and introduces the concept of a generic parity matrix. The authors also implement the GF multiplication using logarithms and antilogarithms. In addition, they pay special attention to the efficiency of the communication architecture. Such considerations support improving the performance of the algorithm. On the other side, the paper proposes the use of parity records, which are invisible to applications, and provide high availability for the application. The architecture considers a set of LH*RS servers that are contacted by the application only through the LH*RS clients. Such clients are localized in the application node. Clients are responsible for data search or storage over the network. Applications ask the clients for data. There is another important component in the architecture: the coordinator. The paper describes the operation of such a coordinator to provide high-availability updating, insertion, and deletion of data in the system, and bucket split and merging. The paper also describes how the coordinator performs the above operations to derive recovery responsibilities for the LH*RS clients and servers; this avoids the coordinator being a hot spot in the system. According to paper, the coordinator is k-replicated. However, the resolution of coordinator failure is not clearly explained. The paper provides some interesting evaluations of the proposal, which support the conclusion that the authors have proposed a new highly available scalable distributed data structure that can be applied in current systems (including peer-to-peer networks, replication systems, and databases). Online Computing Reviews Service

Access critical reviews of Computing literature here

Become a reviewer for Computing Reviews.

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image ACM Transactions on Database Systems
ACM Transactions on Database Systems  Volume 30, Issue 3
September 2005
226 pages
ISSN:0362-5915
EISSN:1557-4644
DOI:10.1145/1093382
Issue’s Table of Contents
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 September 2005
Published in TODS Volume 30, Issue 3

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. P2P
  2. Scalable distributed data structure
  3. grid computing
  4. high-availability
  5. linear hashing
  6. physical database design

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)9
  • Downloads (Last 6 weeks)3
Reflects downloads up to 17 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2024)ELECTProceedings of the 22nd USENIX Conference on File and Storage Technologies10.5555/3650697.3650715(293-310)Online publication date: 27-Feb-2024
  • (2024)A Highly Available Database System Based on Load Balancing Functionality2024 IEEE 2nd International Conference on Image Processing and Computer Applications (ICIPCA)10.1109/ICIPCA61593.2024.10709101(841-844)Online publication date: 28-Jun-2024
  • (2023)Benchmarking Hashing Algorithms for Load Balancing in a Distributed Database EnvironmentAdvances in Model and Data Engineering in the Digitalization Era10.1007/978-3-031-23119-3_8(105-118)Online publication date: 10-Jan-2023
  • (2021)LogECMemProceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis10.1145/3458817.3480852(1-15)Online publication date: 14-Nov-2021
  • (2019)Coupling Decentralized Key-Value Stores with Erasure CodingProceedings of the ACM Symposium on Cloud Computing10.1145/3357223.3362713(377-389)Online publication date: 20-Nov-2019
  • (2018)Building efficient and available distributed transaction with Paxos-based coding consensusIEEE INFOCOM 2018 - IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS)10.1109/INFCOMW.2018.8406832(373-378)Online publication date: Apr-2018
  • (2018)Enhanced Self-Coding for Available Memcached2018 IEEE Intl Conf on Parallel & Distributed Processing with Applications, Ubiquitous Computing & Communications, Big Data & Cloud Computing, Social Computing & Networking, Sustainable Computing & Communications (ISPA/IUCC/BDCloud/SocialCom/SustainCom)10.1109/BDCloud.2018.00106(700-707)Online publication date: Dec-2018
  • (2017)Erasure coding for small objects in in-memory KV storageProceedings of the 10th ACM International Systems and Storage Conference10.1145/3078468.3078470(1-12)Online publication date: 22-May-2017
  • (2016)Fast LH$$*$$źInternational Journal of Parallel Programming10.1007/s10766-015-0371-844:4(709-734)Online publication date: 1-Aug-2016
  • (2015)An Autonomous Data Structure for Brute Force Calculations in the CloudProceedings of the 2015 IEEE 7th International Conference on Cloud Computing Technology and Science (CloudCom)10.1109/CloudCom.2015.17(347-354)Online publication date: 30-Nov-2015
  • Show More Cited By

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media