DOI: 10.5555/822086.823346
Article

Decoupling Computation and Data Scheduling in Distributed Data-Intensive Applications

Published: 24 July 2002

Abstract

In high energy physics, bioinformatics, and other disciplines, we encounter applications involving numerous, loosely coupled jobs that both access and generate large data sets. So-called Data Grids seek to harness geographically distributed resources for such large-scale data-intensive problems. Yet effective scheduling in such environments is challenging, due to a need to address a variety of metrics and constraints (e.g., resource utilization, response time, global and local allocation policies) while dealing with multiple, potentially independent sources of jobs and a large number of storage, compute, and network resources.

We describe a scheduling framework that addresses these problems. Within this framework, data movement operations may be either tightly bound to job scheduling decisions or, alternatively, performed by a decoupled, asynchronous process on the basis of observed data access patterns and load. We develop a family of job scheduling and data movement (replication) algorithms and use simulation studies to evaluate various combinations. Our results suggest that while it is necessary to consider the impact of replication on the scheduling strategy, it is not always necessary to couple data movement and computation scheduling. Instead, these two activities can be addressed separately, thus significantly simplifying the design and implementation of the overall Data Grid system.
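The decoupled arrangement described in the abstract can be illustrated with a minimal sketch. This is not the paper's framework or any of its algorithms: the `Site` record, the `schedule_job` function, the `Replicator` class, and the popularity `threshold` are all hypothetical names chosen for illustration. The only point it shows is that the job scheduler reads the current replica placement without moving data, while a separate process replicates data based on observed accesses and load.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Site:
    name: str
    files: set = field(default_factory=set)  # datasets stored at this site
    queue: int = 0                           # number of jobs waiting here


def schedule_job(sites, dataset):
    """Job scheduler: prefer the least-loaded site that already holds the data.

    If no site holds the dataset, fall back to the least-loaded site overall.
    The scheduler never moves data itself (illustrative policy, an assumption).
    """
    holders = [s for s in sites if dataset in s.files]
    target = min(holders or sites, key=lambda s: s.queue)
    target.queue += 1
    return target


class Replicator:
    """Decoupled data mover: runs separately and sees only access counts and load."""

    def __init__(self, threshold):
        self.threshold = threshold  # hypothetical popularity cutoff
        self.accesses = Counter()

    def record(self, dataset):
        self.accesses[dataset] += 1

    def run_once(self, sites):
        """Copy each popular dataset to the least-loaded site that lacks it."""
        moved = []
        for dataset, count in self.accesses.items():
            if count < self.threshold:
                continue
            candidates = [s for s in sites if dataset not in s.files]
            if candidates:
                dest = min(candidates, key=lambda s: s.queue)
                dest.files.add(dataset)
                moved.append((dataset, dest.name))
        self.accesses.clear()  # start a fresh observation window
        return moved


sites = [Site("A", {"d1"}), Site("B"), Site("C")]
replicator = Replicator(threshold=3)

for _ in range(3):
    replicator.record("d1")
    schedule_job(sites, "d1")       # all three jobs go to A, the only holder

moved = replicator.run_once(sites)  # d1 is now popular and gets a second copy
next_site = schedule_job(sites, "d1")
```

In this toy run the fourth job is placed on the newly created replica rather than on the busy original site, even though the scheduler and the replicator never call each other; they interact only through the shared replica placement.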


Published In

HPDC '02: Proceedings of the 11th IEEE International Symposium on High Performance Distributed Computing
July 2002
ISBN: 0769516866

Publisher

IEEE Computer Society

United States

Acceptance Rates

Overall Acceptance Rate 166 of 966 submissions, 17%


View Options

View options

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media