[go: up one dir, main page]
More Web Proxy on the site http://driver.im/ skip to main content
10.5555/2342821.2342823guideproceedingsArticle/Chapter ViewAbstractPublication PagesConference Proceedingsacm-pubtype
Article

Erasure coding in windows azure storage

Published: 13 June 2012 Publication History

Abstract

Windows Azure Storage (WAS) is a cloud storage system that provides customers the ability to store seemingly limitless amounts of data for any duration of time. WAS customers have access to their data from anywhere, at any time, and only pay for what they use and store. To provide durability for that data and to keep the cost of storage low, WAS uses erasure coding.
In this paper we introduce a new set of codes for erasure coding called Local Reconstruction Codes (LRC). LRC reduces the number of erasure coding fragments that need to be read when reconstructing data fragments that are offline, while still keeping the storage overhead low. The important benefits of LRC are that it reduces the bandwidth and I/Os required for repair reads over prior codes, while still allowing a significant reduction in storage overhead. We describe how LRC is used in WAS to provide low overhead durable storage with consistently low read latencies.

References

[1]
B. Calder et al., "Windows Azure Storage: A Highly Available Cloud Storage Service with Strong Consistency," ACM SOSP, 2011.
[2]
D. T. Meyer et al., "Fast and Cautious Evolution of Cloud Storage," HotStorage, 2010.
[3]
J. Kubiatowicz et al., "OceanStore: An Architecture for Global-Scale Persistent Storage," ACM ASPLOS, Nov. 2000.
[4]
A. Haeberlen, A. Mislove, and P. Druschel, "Glacier: Highly durable, decentralized storage despite massive correlated failures," USENIX NSDI, 2005.
[5]
M. Abd-El-Malek et al., "Ursa Minor: Versatile Cluster-based Storage," USENIX FAST, 2005.
[6]
C. Ungureanu et al., "HydraFS: A High-Throughput File System for the HYDRAstor Content-Addressable Storage System," USENIX FAST, 2010.
[7]
H. Weatherspoon, and J. Kubiatowicz, "Erasure coding vs. replication: A quantitative comparison," In Proc. IPTPS, 2001.
[8]
D. Borthakur et al., "HDFS RAID," Hadoop User Group Meeting, Nov. 2010.
[9]
A. Fikes, "Storage Architecture and Challenges," Google Faculty Summit, 2010.
[10]
D. Ford et al., "Availability in Globally Distributed Storage Systems,", USENIX OSDI, 2010.
[11]
L. Tian et al., "PRO: A Popularity-based Multi-threaded Reconstruction Optimization for RAID-Structured Storage Systems," USENIX FAST, 2007
[12]
F. J. MacWilliams and N. J. A. Sloane, "The Theory of Error Correcting Codes," Amsterdam: North-Holland, 1977.
[13]
I. S. Reed and G. Solomon, "Polynomial Codes over Certain Finite Fields", J. SIAM, 8(10), 300-304, 1960.
[14]
C. Huang, M. Chen, and J. Li, "Pyramid Codes: Flexible Schemes to Trade Space for Access Efficiency in Reliable Data Storage Systems," Proc. of IEEE NCA, Cambridge, MA, Jul. 2007.
[15]
R. G. Gallager, "Low-Density Parity-Check Codes," MIT Press, Cambridge, MA, 1963.
[16]
M. G. Luby et al., "Efficient Erasure Correcting Codes," IEEE Transactions on Information Theory, 2011.
[17]
J. S. Plank, R. L. Collins, A. L. Buchsbaum, and M. G. Thomason, "Small Parity-Check Erasure Codes - Exploration and Observations," Proc. DSN, 2005.
[18]
J. L. Hafner, "Weaver codes: Highly fault tolerant erasure codes for storage systems," USENIX FAST, 2005.
[19]
J. L. Hafner, "HoVer Erasure Codes for Disk Arrays," Proc. of DSN, 2006.
[20]
K. M. Greenan, X. Li, and J. J. Wylie, "Flat XOR-based erasure codes in storage systems: Constructions, efficient recovery, and tradeoffs," IEEE Mass Storage Systems and Technologies, 2010.
[21]
P. Gopalan, C. Huang, H. Simitci, and S. Yekhanin, "On the Locality of Codeword Symbols," Allerton, 2011.
[22]
J. Blomer et al., "An XOR-Based Erasure-Resilient Coding Scheme," Technical Report No. TR-95-048, ICSI, Berkeley, California, Aug. 1995.
[23]
J. Luo, L. Xu, and J. S. Plank, "An Efficient XOR-Scheduling Algorithm for Erasure Codes Encoding," Proc. DSN, Lisbon, Portugal, June, 2009.
[24]
A. G. Dimakis et al., "Network Coding for Distributed Storage Systems," IEEE Transactions on Information Theory, Vol. 56, Issue 9, Sept. 2010.
[25]
L. Xiang et al., "Optimal recovery of single disk failure in RDP code storage systems," ACM SIGMETRICS, 2010.
[26]
O. Khan et al., "Rethinking Erasure Codes for Cloud File Systems: Minimizing I/O for Recovery and Degraded Reads," USENIX FAST, San Jose, Feb. 2012.
[27]
Y. Hu et al., "NCCloud: Applying Network Coding for the Storage Repair in a Cloud-of-Clouds," USENIX FAST, San Jose, 2012.
[28]
J. H. Howard et al., "Scale and Performance in a Distributed File System," ACM ToCS, Feb. 1988.
[29]
M. Satyanarayanan et al., "CODA: A Highly Available File System for a Distributed Workstation Environment," IEEE Transactions on Computers, Apr. 1990.
[30]
B. Liskov et al., "Replication in the Harp File System," ACM SOSP, Pacific Grove, CA, Oct. 1991.
[31]
F. Dabek et al., "Wide-Area Cooperative Storage with CFS," ACM SOSP, 2001.
[32]
B. G. Chun et al., "Efficient Replica Maintenance for Distributed Storage Systems," USENIX NSDI, 2006.
[33]
Q. Xin et al., "Reliability mechanisms for very large storage systems," Proc. of IEEE Conference on Mass Storage Systems and Technologies, 2003.
[34]
S. Nath et al., "Subtleties in Tolerating Correlated Failures in Wide-Area Storage Systems," USENIX NSDI, 2006.
[35]
A. Krioukov et al., "Parity Lost and Parity Regained," USENIX FAST, Feb. 2008.
[36]
L. N. Bairavasundaram et al., "An Analysis of Data Corruption in the Storage Stack," USENIX FAST, Feb. 2008.
[37]
L. Lamport, "The Part-Time Parliament," ACM Transactions on Computer Systems, vol. 16, no. 2, pp. 133-169, May 1998.
[38]
J. K. Resch, and J. S. Plank, "AONT-RS: Blending Security and Performance in Dispersed Storage Systems," USENIX FAST, Feb. 2011.

Cited By

View all
  • (2024)ELECTProceedings of the 22nd USENIX Conference on File and Storage Technologies10.5555/3650697.3650715(293-310)Online publication date: 27-Feb-2024
  • (2024)Morph: Efficient File-Lifetime Redundancy Management for Cluster File SystemsProceedings of the ACM SIGOPS 30th Symposium on Operating Systems Principles10.1145/3694715.3695981(330-346)Online publication date: 4-Nov-2024
  • (2024)Revisiting Erasure Codes: A Configuration PerspectiveProceedings of the 16th ACM Workshop on Hot Topics in Storage and File Systems10.1145/3655038.3665951(93-100)Online publication date: 8-Jul-2024
  • Show More Cited By

Recommendations

Comments

Please enable JavaScript to view thecomments powered by Disqus.

Information & Contributors

Information

Published In

cover image Guide Proceedings
USENIX ATC'12: Proceedings of the 2012 USENIX conference on Annual Technical Conference
June 2012
41 pages

Publisher

USENIX Association

United States

Publication History

Published: 13 June 2012

Qualifiers

  • Article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)0
  • Downloads (Last 6 weeks)0
Reflects downloads up to 10 Dec 2024

Other Metrics

Citations

Cited By

View all
  • (2024)ELECTProceedings of the 22nd USENIX Conference on File and Storage Technologies10.5555/3650697.3650715(293-310)Online publication date: 27-Feb-2024
  • (2024)Morph: Efficient File-Lifetime Redundancy Management for Cluster File SystemsProceedings of the ACM SIGOPS 30th Symposium on Operating Systems Principles10.1145/3694715.3695981(330-346)Online publication date: 4-Nov-2024
  • (2024)Revisiting Erasure Codes: A Configuration PerspectiveProceedings of the 16th ACM Workshop on Hot Topics in Storage and File Systems10.1145/3655038.3665951(93-100)Online publication date: 8-Jul-2024
  • (2024)CoRD: Combining Raid and Delta for Fast Partial Updates in Erasure-Coded Storage ClustersProceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis10.1109/SC41406.2024.00113(1-14)Online publication date: 17-Nov-2024
  • (2023)ParaRCProceedings of the 21st USENIX Conference on File and Storage Technologies10.5555/3585938.3585940(17-31)Online publication date: 21-Feb-2023
  • (2023)Practical design considerations for wide locally recoverable codes (LRCs)Proceedings of the 21st USENIX Conference on File and Storage Technologies10.5555/3585938.3585939(1-16)Online publication date: 21-Feb-2023
  • (2023)gPPM: A Generalized Matrix Operation and Parallel Algorithm to Accelerate the Encoding/Decoding Process of Erasure CodesACM Transactions on Architecture and Code Optimization10.1145/362500520:4(1-25)Online publication date: 21-Sep-2023
  • (2023)Design Considerations and Analysis of Multi-Level Erasure Coding in Large-Scale Data CentersProceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis10.1145/3581784.3607072(1-13)Online publication date: 12-Nov-2023
  • (2022)Lazy repair with temporary redundancy(LRTR)Proceedings of the 19th ACM International Conference on Computing Frontiers10.1145/3528416.3530240(85-93)Online publication date: 17-May-2022
  • (2022)RAIL: Predictable, Low Tail Latency for NVMe FlashACM Transactions on Storage10.1145/346540618:1(1-21)Online publication date: 29-Jan-2022
  • Show More Cited By

View Options

View options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media