DOI: 10.1109/CSNT.2014.124
Article

A Performance Analysis of MapReduce Task with Large Number of Files Dataset in Big Data Using Hadoop

Published: 07 April 2014

Abstract

Big Data refers to volumes of data too large to be managed by traditional data management systems. Hadoop is a technological answer to Big Data: the Hadoop Distributed File System (HDFS) and the MapReduce programming model are used to store and retrieve it. Files of terabyte size can be stored on HDFS and analyzed with MapReduce. This paper introduces Hadoop HDFS and MapReduce for storing a large number of files and retrieving information from them. We present experimental work in which increasing numbers of files are supplied as input to a Hadoop system and its performance is analyzed. We study the number of bytes written and read by the system and by MapReduce, and we examine how the map and reduce tasks behave as the number of input files, and the bytes written and read by these tasks, grow.
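To make the kind of experiment described in the abstract concrete, the sketch below shows a minimal Hadoop MapReduce job written against the standard org.apache.hadoop.mapreduce Java API. It is not the authors' actual job; the class names, counter interpretation, and input/output paths are illustrative assumptions. The job emits, for each input file, the total number of bytes contributed by that file's records, which mirrors the per-task read/write behavior the paper measures.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.FileSplit;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Illustrative job: for every input file, sum the bytes of its lines,
// producing one (file name, total bytes) record per file.
public class PerFileByteCount {

  public static class ByteMapper extends Mapper<LongWritable, Text, Text, LongWritable> {
    private final Text fileName = new Text();
    private final LongWritable bytes = new LongWritable();

    @Override
    protected void map(LongWritable key, Text value, Context context)
        throws IOException, InterruptedException {
      // Identify the file this split belongs to; with the default input
      // format every small file becomes at least one split, hence one map task.
      FileSplit split = (FileSplit) context.getInputSplit();
      fileName.set(split.getPath().getName());
      bytes.set(value.getLength());
      context.write(fileName, bytes);
    }
  }

  public static class SumReducer extends Reducer<Text, LongWritable, Text, LongWritable> {
    private final LongWritable total = new LongWritable();

    @Override
    protected void reduce(Text key, Iterable<LongWritable> values, Context context)
        throws IOException, InterruptedException {
      long sum = 0;
      for (LongWritable v : values) {
        sum += v.get();
      }
      total.set(sum);
      context.write(key, total);
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "per-file byte count");
    job.setJarByClass(PerFileByteCount.class);
    job.setMapperClass(ByteMapper.class);
    job.setCombinerClass(SumReducer.class);
    job.setReducerClass(SumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(LongWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // e.g. an HDFS directory holding many files
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

After such a job runs, Hadoop's built-in file system counters (for example HDFS_BYTES_READ and FILE_BYTES_WRITTEN) report the bytes moved by the framework and by the map and reduce tasks, which are the quantities the abstract refers to. Because the default FileInputFormat creates at least one split per file, the number of map tasks, and with it the per-task overhead, grows with the number of input files.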

Cited By

  • (2016) Efficient Batch Processing of Related Big Data Tasks using Persistent MapReduce Technique. Proceedings of the Third International Symposium on Computer Vision and the Internet, 106-109. https://doi.org/10.1145/2983402.2983431. Online publication date: 21-Sep-2016.



Information

Published In

CSNT '14: Proceedings of the 2014 Fourth International Conference on Communication Systems and Network Technologies
April 2014
1199 pages
ISBN: 9781479930708

Publisher

IEEE Computer Society

United States

Publication History

Published: 07 April 2014

Author Tags

  1. Data Node
  2. HDFS
  3. Hadoop
  4. Job Tracker
  5. MapReduce
  6. Name Node
  7. Secondary Name Node
  8. Task Tracker
  9. Teragen
  10. Terasort
  11. Teravalidate

Qualifiers

  • Article

