DOI: 10.1145/3416921.3416925

A Fast Adaptive Replica Recovery Algorithm Based on Access Frequency and Environment Awareness

Published: 24 September 2020

Abstract

Cloud storage systems adopt distributed architectures to store massive amounts of data, so improving the reliability of the storage center has become a focus of research. HDFS, the distributed file system of Hadoop, recovers failed replicas sequentially when a node goes down. This approach accounts for neither the priority of the replicas nor the load differences between nodes, which leads to request blocking and system load imbalance. To address this problem, we propose a replica recovery method based on access frequency and network environment awareness. It uses a priority-based recovery algorithm and a multi-objective decision algorithm to guarantee response speed and balance the cluster load. We set up a simulation environment for verification and compared the throughput and response time of our method with those of several other algorithms. The simulation results show that our approach effectively resolves the load imbalance problem while ensuring faster response times.
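The abstract does not give the algorithm itself, so the following is only a minimal sketch of the two ideas it names: recovering lost replicas in priority order by access frequency, and choosing a target node by weighing several objectives. Every name here (`Node`, `recovery_order`, `pick_target`, `plan_recovery`), the two node metrics, the weights, and the per-replica load increment are assumptions made for illustration and are not taken from the paper.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    load: float        # assumed metric: fraction of capacity in use, 0.0 to 1.0
    bandwidth: float   # assumed metric: free bandwidth normalised to 0.0 to 1.0


def recovery_order(access_counts):
    """Return block ids, most frequently accessed first."""
    # heapq is a min-heap, so negate the count to pop the hottest block first.
    heap = [(-count, block_id) for block_id, count in access_counts.items()]
    heapq.heapify(heap)
    order = []
    while heap:
        _, block_id = heapq.heappop(heap)
        order.append(block_id)
    return order


def pick_target(nodes, w_load=0.5, w_bandwidth=0.5):
    """Pick the node with the highest weighted score (weights are assumptions)."""
    def score(node):
        # Lower load and higher free bandwidth both raise the score.
        return w_load * (1.0 - node.load) + w_bandwidth * node.bandwidth
    return max(nodes, key=score)


def plan_recovery(access_counts, nodes, load_per_replica=0.1):
    """Assign each lost replica to a node, hottest block first."""
    plan = []
    for block_id in recovery_order(access_counts):
        target = pick_target(nodes)
        plan.append((block_id, target.name))
        # Account for the new replica so later picks see the updated load.
        target.load = min(1.0, target.load + load_per_replica)
    return plan
```

A sequential scheme would process blocks in whatever order they are discovered and ignore node state. In this sketch the hottest blocks are restored first, and updating the load after each assignment keeps a single lightly loaded node from receiving every replica.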


Cited By

  • (2023) HV-SNSP: A Low-Overhead Data Recovery Method Based on Cross-Checking. IEEE Access 11, 5737-5745. DOI: 10.1109/ACCESS.2023.3235787. Online publication date: 2023
  • (2022) H-V: An Improved Coding Layout Based on Erasure Coded Storage System. Database Systems for Advanced Applications. DASFAA 2022 International Workshops, 203-213. DOI: 10.1007/978-3-031-11217-1_15. Online publication date: 16-Jul-2022


Information & Contributors

Information

Published In

ACM Other conferences
ICCBDC '20: Proceedings of the 2020 4th International Conference on Cloud and Big Data Computing
August 2020
130 pages
ISBN:9781450375382
DOI:10.1145/3416921
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

In-Cooperation

  • Oxford Brookes University
  • Staffordshire University
  • University of Liverpool

Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. HDFS
  2. replica recovery
  3. storage reliability

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICCBDC '20


