Surviving congestion in geo-distributed storage systems

Authors:

Marcos K. Aguilera

USENIX ATC'12: Proceedings of the 2012 USENIX conference on Annual Technical Conference

Page 40

Published: 13 June 2012

Abstract

We present Vivace, a key-value storage system for web applications that span many geographically distributed sites. Vivace provides strong consistency and replicates data across sites for access locality and disaster tolerance. Vivace is designed to cope well with network congestion across sites, which occurs because the bandwidth across sites is smaller than within sites. To deal with congestion, Vivace relies on two novel algorithms that prioritize a small amount of critical data to avoid delays due to congestion. We evaluate Vivace to show its feasibility and effectiveness.

Published In

USENIX ATC'12: Proceedings of the 2012 USENIX conference on Annual Technical Conference, June 2012, 41 pages

Publisher: USENIX Association, United States

Cited By

Zhang B, Jin X, Ratnasamy S, Wawrzynek J, Lee E, Gorinsky S, Tapolcai J (2018). AWStream. Proceedings of the 2018 Conference of the ACM Special Interest Group on Data Communication, pp. 236-252. Online publication date: 7-Aug-2018. https://dl.acm.org/doi/10.1145/3230543.3230554

Hu Y, Li X, Zhang M, Lee P, Zhang X, Zhou P, Feng D (2017). Optimal Repair Layering for Erasure-Coded Data Centers. ACM Transactions on Storage, 13:4, pp. 1-24. Online publication date: 14-Nov-2017. https://dl.acm.org/doi/10.1145/3149349

Netto H, Lung L, Ribeiro T, Correia M, Luiz A (2015). Anticipating Requests to Improve Performance and Reduce Costs in Cloud Storage. ACM SIGMETRICS Performance Evaluation Review, 43:3, pp. 21-24. Online publication date: 19-Nov-2015. https://dl.acm.org/doi/10.1145/2847220.2847226

Dobre D, Karame G, Li W, Majuntke M, Suri N, Vukolić M, Sadeghi A, Gligor V, Yung M (2013). PoWerStore. Proceedings of the 2013 ACM SIGSAC conference on Computer & communications security, pp. 285-298. Online publication date: 4-Nov-2013. https://dl.acm.org/doi/10.1145/2508859.2516750
