DOI: 10.5555/2388996.2389105
research-article

On the effectiveness of application-aware self-management for scientific discovery in volunteer computing systems

Published: 10 November 2012

Abstract

An important challenge faced by high-throughput, multiscale applications is that human intervention has a central role in driving their success. However, manual intervention is inefficient, error-prone, and wastes resources. This paper presents an application-aware modular framework that provides self-management for computational multiscale applications in volunteer computing (VC). Our framework consists of a learning engine and three modules that can be easily adapted to different distributed systems. The learning engine is based on our novel tree-like structure called KOTree. KOTree is a fully automatic method that organizes statistical information in a multidimensional structure that can be efficiently searched and updated at runtime. Our empirical evaluation shows that our framework can effectively provide application-aware self-management in VC systems. Additionally, we observed that our KOTree algorithm accurately predicts the expected length of new jobs, resulting in an average throughput increase of 85% with respect to other algorithms.
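The abstract does not describe KOTree's internals, so the following is only a minimal sketch of the general idea of keeping per-job-class statistics that can be updated at runtime and then used to predict the length of a new job. It uses Welford's online mean/variance update and a flat dictionary as a stand-in for the tree; the class names `RunningStats` and `JobLengthTable`, the key tuples, and the job lengths are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class RunningStats:
    """Mean and variance of a stream, updated one observation at a time."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        # Welford's online update: no past observations are stored.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    @property
    def variance(self) -> float:
        # Sample variance; defined as 0.0 until two observations exist.
        return self.m2 / (self.count - 1) if self.count > 1 else 0.0


@dataclass
class JobLengthTable:
    """Hypothetical flat lookup table; a stand-in, not the KOTree structure."""
    stats: dict = field(default_factory=dict)

    def record(self, key: tuple, length: float) -> None:
        # Fold the observed length of a finished job into its class statistics.
        self.stats.setdefault(key, RunningStats()).update(length)

    def predict(self, key: tuple, default: float) -> float:
        # Predict the mean observed length, or fall back for unseen classes.
        s = self.stats.get(key)
        return s.mean if s is not None and s.count > 0 else default


# Illustrative data only: three finished jobs sharing one parameter class.
table = JobLengthTable()
for length in (10.0, 12.0, 14.0):
    table.record(("ligand_small", "temp_300K"), length)
```

A scheduler built along these lines could call `table.predict(key, default)` when sizing a new job and `table.record(key, length)` when a result returns, so the estimates improve without manual tuning. How the paper's structure partitions the parameter space and searches it is not covered by this sketch.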

Cited By

  • (2018) KeyBin2. Proceedings of the 47th International Conference on Parallel Processing, pp. 1-10. DOI: 10.1145/3225058.3225149. Online publication date: 13-Aug-2018

    Published In

    SC '12: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
    November 2012
    1161 pages
    ISBN:9781467308045

    Publisher

    IEEE Computer Society Press

    Washington, DC, United States

    Acceptance Rates

    SC '12 Paper Acceptance Rate 100 of 461 submissions, 22%;
    Overall Acceptance Rate 1,516 of 6,373 submissions, 24%
