Optimization of file migration in distributed systems

January 1988

Author:
Oivind Kure

Publisher:

University of California, Berkeley

Order Number: AAI8902170

Pages:

278

Abstract

This dissertation presents and evaluates several new algorithms that improve the performance of distributed file systems by migrating or copying files between system nodes as necessary. The results are based on analysis of file reference patterns from three commercial installations, modeling, and trace-driven simulation.

The first part of the dissertation is an exploratory analysis of how shared user files are referenced. We find that although few files are shared, they are opened frequently, and account for a large fraction of the I/O traffic to user files. The reference pattern to shared files is not easily characterized, and varies widely among files. A batch Poisson process with geometric batch size is determined to be the most appropriate model.
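As an illustration of the model named above, the following is a minimal sketch of a batch Poisson process with geometric batch size. It assumes that batches arrive as a Poisson process and that each batch holds a geometrically distributed number of references (at least one). The function names and the parameters `rate`, `p`, and `horizon` are hypothetical; the abstract does not give the dissertation's own parameterization.

```python
import random


def simulate_batch_poisson(rate, p, horizon, seed=None):
    """Simulate batch arrivals up to time `horizon`.

    rate    -- assumed batch arrival rate (batches per unit time)
    p       -- assumed geometric parameter; mean batch size is 1 / p
    Returns a list of (arrival_time, batch_size) pairs.
    """
    rng = random.Random(seed)
    events = []
    # Exponential gaps between batches give Poisson batch arrivals.
    t = rng.expovariate(rate)
    while t < horizon:
        # Geometric batch size on {1, 2, ...}: keep adding references
        # until a "stop" draw succeeds with probability p.
        size = 1
        while rng.random() > p:
            size += 1
        events.append((t, size))
        t += rng.expovariate(rate)
    return events


def expected_references(rate, p, horizon):
    """Mean number of references: (rate * horizon) batches of mean size 1 / p."""
    return rate * horizon / p
```

With `rate=2.0`, `p=0.5`, and `horizon=1000.0`, the simulated total number of references should land near `expected_references(2.0, 0.5, 1000.0)`, which is 4000.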

Based on the exploratory analysis, we developed several algorithms for file migration and replication. The algorithms evaluated include those based on our file reference pattern analysis, as well as simple strategies such as static placement and movement on reference, and optimal look-ahead migration and placement. We found that only a few files should be migrated or replicated, but replication or migration can substantially reduce the network traffic (up to 63% for replication and 36% for migration, relative to static placement).
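To make the two simple baseline strategies concrete, here is a toy sketch that counts network traffic for static placement and for movement on reference over a single file's reference trace. The cost model is an assumption made for illustration (one unit per remote reference, a fixed cost per move), and the function names are hypothetical; it is not the dissertation's simulator.

```python
def static_placement_cost(trace, home, remote_cost=1.0):
    """File never moves: every reference from a node other than `home` is remote."""
    return sum(remote_cost for node in trace if node != home)


def move_on_reference_cost(trace, home, move_cost=1.0):
    """File moves to the referencing node whenever that node does not hold it."""
    location = home
    total = 0.0
    for node in trace:
        if node != location:
            total += move_cost  # pay for one migration, then access locally
            location = node
    return total


# Hypothetical trace: the sequence of nodes that open the file.
trace = ["A", "A", "B", "B", "B", "A", "B", "B"]
static = static_placement_cost(trace, home="A")                 # 5 remote references
moving = move_on_reference_cost(trace, home="A", move_cost=2.0)  # 3 moves at cost 2.0
```

In this toy model, movement on reference wins only when references cluster at one node for long runs relative to the cost of a move, which is one way to read the role of "changes in locality" in the policies evaluated.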

A policy based on a batch Poisson process with geometric batch size has the best performance when replication is not allowed. It uses as decision variables the fraction of a file accessed per open, the number of references from a user, and the number of changes in locality.

By replicating files, the network traffic can be reduced further compared to migration alone (up to 42%). Whether the additional copies should be invalidated or updated when the file is updated depends on the installation and the rules for placing users at nodes. The algorithms with the best performance use the average reference rate, the number of consecutive opens in update mode, and the time since the node started using the file as the decision variables. By comparing our realizable algorithms with optimal unrealizable algorithms, we show that it is unlikely that other migration or replication algorithms can achieve substantially better performance.
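The trade-off between updating and invalidating replicas can be sketched with a toy cost model. Everything here is an assumption for illustration: the trace format of `(node, op)` pairs, the unit costs, and the function names are hypothetical, and the abstract does not describe the dissertation's actual cost accounting.

```python
def write_update_cost(trace, nodes, update_cost=1.0):
    """Update policy: each write pushes new contents to every other replica."""
    others = len(nodes) - 1
    return sum(update_cost * others for _, op in trace if op == "w")


def write_invalidate_cost(trace, nodes, invalidate_cost=0.25, fetch_cost=1.0):
    """Invalidate policy: a write discards other copies; stale nodes refetch."""
    valid = set(nodes)  # assume every node starts with a valid copy
    total = 0.0
    for node, op in trace:
        if node not in valid:
            total += fetch_cost  # refetch the file before using it
            valid.add(node)
        if op == "w":
            # Send a small invalidation message to each other valid copy.
            total += invalidate_cost * (len(valid) - 1)
            valid = {node}
    return total


nodes = {"A", "B", "C"}
# Run of consecutive writes at one node, then a single remote read.
write_run = [("A", "w"), ("A", "w"), ("A", "w"), ("B", "r")]
# Writes interleaved with reads from every other node.
shared_read = [("A", "w"), ("B", "r"), ("C", "r"),
               ("A", "w"), ("B", "r"), ("C", "r")]
```

Under these assumed costs, invalidation is cheaper for `write_run` (1.5 versus 6.0) and updating is cheaper for `shared_read` (4.0 versus 5.0), which is consistent with the number of consecutive opens in update mode being a useful decision variable.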

Contributors

Øivind Kure
University of Oslo
