
A Cloud Computing Implementation of XML Indexing Method Using Hadoop

  • Conference paper
Intelligent Information and Database Systems (ACIIDS 2012)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 7198)

Abstract

As data volumes grow at an extraordinary rate, the development of cloud computing technologies has become critically important to the advancement of research. Apache Hadoop has become a widely used open-source cloud computing framework that provides a distributed file system for large-scale data processing. In this paper, we present a cloud computing implementation of an XML indexing method called NCIM (Node Clustering Indexing Method), developed by our research team, for indexing and querying a large number of big XML documents using MapReduce. The experimental results show that NCIM is suitable for a cloud computing environment. A throughput of 1200 queries per second under a heavy query load on a 15-node cluster indicates the potential of NCIM for fast query processing over enormous collections of Internet documents.
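To make the general idea of indexing XML documents with a map step and a reduce step concrete, the following is a minimal single-process sketch in Python. It is not NCIM and does not use Hadoop: the path-based index, the function names (`map_phase`, `reduce_phase`, `build_index`, `query`), and the sample documents are all assumptions chosen for illustration. The map step emits one (element path, document id) pair per element, and the reduce step groups the pairs into a lookup table.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def map_phase(doc_id, xml_text):
    """Map step (hypothetical): emit a (path, doc_id) pair for every element."""
    root = ET.fromstring(xml_text)
    stack = [("/" + root.tag, root)]
    while stack:
        path, node = stack.pop()
        yield path, doc_id
        for child in node:
            stack.append((path + "/" + child.tag, child))


def reduce_phase(pairs):
    """Reduce step (hypothetical): group document ids by element path."""
    index = defaultdict(set)
    for path, doc_id in pairs:
        index[path].add(doc_id)
    return {path: sorted(ids) for path, ids in index.items()}


def build_index(docs):
    """Run the map step over all documents, then reduce the emitted pairs."""
    pairs = []
    for doc_id, xml_text in docs.items():
        pairs.extend(map_phase(doc_id, xml_text))
    return reduce_phase(pairs)


def query(index, path):
    """Return the ids of documents containing an element at the given path."""
    return index.get(path, [])


# Illustrative documents, not taken from the paper.
docs = {
    "d1": "<a><b/><c><b/></c></a>",
    "d2": "<a><c/></a>",
}
index = build_index(docs)
```

In this sketch a query is a single dictionary lookup, for example `query(index, "/a/c")`. In a real Hadoop deployment the map and reduce steps would run as distributed tasks over files stored in the distributed file system, which this in-memory version does not attempt to model.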

This research was partially supported by the National Science Council, Taiwan, under contract no. NSC100-2221-E-005-070.

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Hsu, WC., Liao, IE., Shih, HC. (2012). A Cloud Computing Implementation of XML Indexing Method Using Hadoop. In: Pan, JS., Chen, SM., Nguyen, N.T. (eds) Intelligent Information and Database Systems. ACIIDS 2012. Lecture Notes in Computer Science, vol 7198. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28493-9_28

  • DOI: https://doi.org/10.1007/978-3-642-28493-9_28

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-28492-2

  • Online ISBN: 978-3-642-28493-9

  • eBook Packages: Computer Science, Computer Science (R0)
