research-article

Analyzing Hadoop power consumption and impact on application QoS

Published: 01 February 2016

Abstract

Energy efficiency is often identified as one of the key reasons for migrating to Cloud environments. It is commonly stated that a data center hosting the Cloud environment is likely to achieve greater energy efficiency (at a reduced cost) than a local deployment. With increasing energy prices, it is also estimated that a large percentage of the operational costs within a Cloud environment can be attributed to energy. In this work, we investigate and measure the energy consumption of a number of virtual machines running the Hadoop system over an OpenNebula Cloud. Our workload is based on sentiment analysis undertaken over Twitter messages. Our objective is to understand the tradeoff between energy efficiency and performance for such a workload. From our results we generalize and speculate on how such an analysis could be used as a basis for establishing a Service Level Agreement (SLA) with a Cloud provider, especially where there is likely to be a high level of variability (both in performance and in energy use) over multiple runs of the same application at different times. Among the service level objectives that might be included in an SLA, Quality of Service (QoS) related metrics (e.g., latency) are among the most challenging to support. This work provides some insight into the relationship between power consumption and QoS related metrics, describing how a combined consideration of these two metrics could be supported for a particular workload.

Highlights

  • Power consumption characterization of Hadoop Clouds (with a social media use case).
  • Study of the QoS related to power consumption (in terms of processing time).
  • Experimentation on two different Cloud infrastructures (single-node and multi-node).
  • OpenNebula based private Cloud environments.
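As a rough illustration of how power consumption and processing time might be considered together for one run of a workload, the sketch below integrates sampled power over time to obtain energy and checks both metrics against a pair of thresholds. It is not the authors' method: the `Run` type, the function names, the sample values, and the threshold values are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Run:
    """One execution of the workload. All values used below are hypothetical."""
    timestamps_s: List[float]  # sample times, in seconds
    power_w: List[float]       # power draw at each sample time, in watts


def processing_time_s(run: Run) -> float:
    """Elapsed time between the first and last power sample."""
    return run.timestamps_s[-1] - run.timestamps_s[0]


def energy_j(run: Run) -> float:
    """Energy in joules, by trapezoidal integration of power over time."""
    total = 0.0
    for i in range(1, len(run.timestamps_s)):
        dt = run.timestamps_s[i] - run.timestamps_s[i - 1]
        total += 0.5 * (run.power_w[i] + run.power_w[i - 1]) * dt
    return total


def meets_objectives(run: Run, max_time_s: float, max_energy_j: float) -> bool:
    """True only if the run satisfies both the time and the energy threshold."""
    return processing_time_s(run) <= max_time_s and energy_j(run) <= max_energy_j


# Hypothetical run: four samples taken ten seconds apart.
example = Run(timestamps_s=[0.0, 10.0, 20.0, 30.0],
              power_w=[100.0, 120.0, 120.0, 100.0])
print(processing_time_s(example), energy_j(example))
print(meets_objectives(example, max_time_s=60.0, max_energy_j=4000.0))
```

Because both metrics vary from run to run, thresholds like these would in practice have to be set from the distribution observed over many runs rather than from a single measurement.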

Cited By

  • (2024) Energy Measurement System for Data Lake: An Initial Approach. Intelligent Information and Database Systems, pp. 15-27. DOI: 10.1007/978-981-97-4982-9_2. Online publication date: 15-Apr-2024
  • (2021) Recent advances on the application of big data framework based on Hadoop platform. Proceedings of the 2021 International Conference on Bioinformatics and Intelligent Computing, pp. 214-219. DOI: 10.1145/3448748.3448782. Online publication date: 22-Jan-2021
  • (2021) FEPAC: A Framework for Evaluating Parallel Algorithms on Cluster Architectures. Proceedings of the 2021 Australasian Computer Science Week Multiconference, pp. 1-10. DOI: 10.1145/3437378.3444363. Online publication date: 1-Feb-2021
Information & Contributors

Information

Published In

Future Generation Computer Systems, Volume 55, Issue C
February 2016
547 pages

Publisher

Elsevier Science Publishers B. V.

Netherlands

Author Tags

  1. Cloud computing
  2. Hadoop
  3. OpenNebula
  4. Power consumption
  5. Social media analysis

Qualifiers

  • Research-article

Citations

  • (2020) Profit-Maximized Task Offloading with Simulated-annealing-based Migrating Birds Optimization in Hybrid Cloud-Edge Systems. 2020 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 1218-1223. DOI: 10.1109/SMC42975.2020.9283467. Online publication date: 11-Oct-2020
  • (2018) New Horizons of Cloud Computing. Future Generation Computer Systems 55:C, pp. 163-164. DOI: 10.1016/j.future.2015.11.007. Online publication date: 30-Dec-2018
  • (2018) Popularity-based covering sets for energy proportionality in shared-nothing clusters. The Journal of Supercomputing 74:5, pp. 1885-1910. DOI: 10.1007/s11227-017-2197-1. Online publication date: 1-May-2018
