In this study, we examine various metadata server layouts for distributed data storage systems in terms of high availability, scalability, and performance. A central metadata server is a single point of failure, which leads to low availability. Ensuring high availability requires replicating the metadata servers. A synchronously replicated metadata server layout introduces synchronization overhead, which degrades the performance of data operations. We propose an asynchronously replicated multi-master metadata server layout that ensures high availability and scalability and provides better performance. We discuss the implications of asynchronously replicated multi-master metadata servers for metadata consistency and conflict resolution. Furthermore, we introduce an asynchronous multi-master replication tool that is deployed in the state-wide distributed data storage system PetaShare, and we compare the performance of all three metadata server layouts: a central metadata server, synchronously replicated multi-master metadata servers, and asynchronously replicated multi-master metadata servers.