RemusDB: transparent high availability for database systems

Published: 01 February 2013

Abstract

In this paper, we present a technique for building a high-availability (HA) database management system (DBMS). The proposed technique can be applied to any DBMS with little or no customization and with reasonable performance overhead. Our approach is based on Remus, a commodity HA solution implemented in the virtualization layer that uses asynchronous virtual machine state replication to provide transparent HA and failover capabilities. We show that while Remus and similar systems can protect a DBMS, database workloads incur a performance overhead of up to 32% compared to an unprotected DBMS. We identify the sources of this overhead and develop optimizations that mitigate them. We present an experimental evaluation using two popular database systems and industry-standard benchmarks, showing that for certain workloads our optimized approach provides fast failover (≤ 3 s of downtime) with low performance overhead compared to an unprotected DBMS. Our approach provides a practical means for existing, deployed database systems to be made more reliable with a minimum of risk, cost, and effort. Furthermore, this paper invites new discussion about whether the complexity of HA is best implemented within the DBMS or as a service provided by the infrastructure below it.
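
The abstract does not spell out how asynchronous virtual machine state replication works. As a rough illustration only, the toy model below assumes a checkpoint-and-buffer scheme: the primary records the state it has changed since the last checkpoint, ships it to a standby, and releases client-visible output only once the standby has acknowledged the checkpoint. All class and method names (`Primary`, `Backup`, `write`, `checkpoint`) are hypothetical. The checkpoint call is synchronous here for brevity, whereas a real system would let the primary keep running while the checkpoint is in flight.

```python
from dataclasses import dataclass, field


@dataclass
class Backup:
    """Hypothetical standby that holds the last acknowledged checkpoint."""
    state: dict = field(default_factory=dict)

    def apply_checkpoint(self, dirty: dict) -> bool:
        # Merge the changed entries and acknowledge receipt.
        self.state.update(dirty)
        return True


@dataclass
class Primary:
    """Hypothetical active host; a dict stands in for VM memory and disk."""
    backup: Backup
    state: dict = field(default_factory=dict)
    dirty: dict = field(default_factory=dict)
    buffered_output: list = field(default_factory=list)
    released_output: list = field(default_factory=list)

    def write(self, key, value, reply):
        # Apply the change locally, track it as dirty, and hold the
        # client-visible reply until the next checkpoint is acknowledged.
        self.state[key] = value
        self.dirty[key] = value
        self.buffered_output.append(reply)

    def checkpoint(self):
        # Ship only the state changed since the previous checkpoint.
        if self.backup.apply_checkpoint(self.dirty):
            # The standby now covers everything the buffered replies imply,
            # so it is safe to let clients see them.
            self.released_output.extend(self.buffered_output)
            self.buffered_output.clear()
            self.dirty.clear()


backup = Backup()
primary = Primary(backup)
primary.write("a", 1, "ok a")
primary.checkpoint()
primary.write("b", 2, "ok b")  # not yet checkpointed, reply still held back
```

If the primary failed at this point, the standby would resume from `{"a": 1}`, and no client would have seen the reply for the unreplicated write to `"b"`. Under this assumed scheme, buffering output is what makes failover transparent to clients, and the extra latency and checkpoint traffic are one plausible source of the overhead the abstract reports.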



      Published In

      The VLDB Journal — The International Journal on Very Large Data Bases  Volume 22, Issue 1
      February 2013
      121 pages

      Publisher

      Springer-Verlag

      Berlin, Heidelberg


      Author Tags

      1. Checkpointing
      2. Fault tolerance
      3. High availability
      4. Performance modeling
      5. Virtualization

      Qualifiers

      • Article

