Cluster Labeling Extraction and Ranking Feature Selection for High Quality XML Pseudo Relevance Feedback Fragments Set

Minjuan Zhong, Changxuan Wan, Dexi Liu, Shumei Liao & Siwen Luo

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8347)

Included in the following conference series:

International Conference on Advanced Data Mining and Applications

Abstract

In traditional pseudo feedback, the main cause of topic drift is the low quality of the feedback source. Clustering search results is an effective way to improve the quality of the feedback set. For XML data, how to cluster effectively and then identify good XML fragments from the clustering results is an intricate problem. This paper focuses mainly on the latter problem. Based on k-medoid clustering results, this work first proposes a cluster label extraction method to select candidate relevant clusters. Second, multiple ranking features are introduced to help identify relevant XML fragments within the candidate clusters. Finally, the top N fragments compose the high-quality pseudo feedback set. Experimental results on standard INEX test data show that, on the one hand, the proposed cluster label extraction method obtains proper cluster key terms and leads to appropriate candidate cluster selection; on the other hand, the presented ranking features help identify relevant XML fragments. The quality of the feedback set is thereby ensured.
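
The three stages named in the abstract (label each cluster, keep candidate clusters, rank fragments and take the top N) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration, not the authors' method: the abstract does not specify the label scoring formula, the cluster selection rule, or the ranking features. The function names `cluster_label` and `select_feedback`, the fragment fields `terms` and `retrieval_score`, and the equal feature weights are all hypothetical.

```python
from collections import Counter

def cluster_label(cluster, total_counts, k=3):
    """Return k key terms for a cluster (assumed heuristic, not the paper's).

    A term scores high when it is frequent inside the cluster and most of
    its occurrences in the whole result list fall inside this cluster.
    """
    inside = Counter(t for frag in cluster for t in frag["terms"])
    scores = {t: c * c / total_counts[t] for t, c in inside.items()}
    # Sort by descending score, then alphabetically to break ties deterministically.
    ordered = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [t for t, _ in ordered[:k]]

def select_feedback(clusters, query_terms, n=2, k=3):
    """Pick the top-n fragment ids from clusters whose label overlaps the query."""
    query = set(query_terms)
    total_counts = Counter(t for c in clusters for f in c for t in f["terms"])

    # Stage 1 and 2: label every cluster, keep those sharing a term with the query.
    candidates = []
    for cluster in clusters:
        if query & set(cluster_label(cluster, total_counts, k)):
            candidates.extend(cluster)

    # Stage 3: rank fragments with two illustrative features, equally weighted.
    def score(frag):
        overlap = len(query & set(frag["terms"])) / len(query)
        return 0.5 * overlap + 0.5 * frag["retrieval_score"]

    ranked = sorted(candidates, key=score, reverse=True)
    return [frag["id"] for frag in ranked[:n]]

# Toy input: two clusters of already-clustered fragments (hypothetical data).
clusters = [
    [
        {"id": "a1", "terms": ["xml", "retrieval", "feedback"], "retrieval_score": 0.9},
        {"id": "a2", "terms": ["xml", "query", "retrieval"], "retrieval_score": 0.6},
    ],
    [
        {"id": "b1", "terms": ["football", "league", "xml"], "retrieval_score": 0.8},
        {"id": "b2", "terms": ["football", "match", "league"], "retrieval_score": 0.7},
    ],
]
query = ["xml", "retrieval"]
feedback = select_feedback(clusters, query, n=2)
```

In this toy input the second cluster is skipped even though fragment `b1` has a high initial retrieval score and mentions "xml", because the cluster's key terms share nothing with the query. That is the intuition behind filtering by cluster label before ranking individual fragments.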

Author information

Authors and Affiliations

School of Information Technology, Jiangxi Key Laboratory of Data and Knowledge Engineering, Jiangxi University of Finance and Economics, Nanchang, China
Minjuan Zhong, Changxuan Wan, Dexi Liu, Shumei Liao & Siwen Luo

Editor information

Editors and Affiliations

US Air Force Office of Scientific Research, 106-0032, Tokyo, Japan
Hiroshi Motoda
School of Computer Science and Technology, Zhejiang University, 310027, Hangzhou, China
Zhaohui Wu
Faculty of Engineering and Information Technology, University of Technology, Chippendale, 2008, Sydney, NSW, Australia
Longbing Cao
Department of Computing Science, Edmonton, University of Alberta, T6G 2E8, Canada
Osmar Zaiane
College of Computer Science and Technology, Zhejiang University, Hangzhou, China
Min Yao
School of Computer Science, Fudan University, 200433, Shanghai, China
Wei Wang

About this paper

Cite this paper

Zhong, M., Wan, C., Liu, D., Liao, S., Luo, S. (2013). Cluster Labeling Extraction and Ranking Feature Selection for High Quality XML Pseudo Relevance Feedback Fragments Set. In: Motoda, H., Wu, Z., Cao, L., Zaiane, O., Yao, M., Wang, W. (eds) Advanced Data Mining and Applications. ADMA 2013. Lecture Notes in Computer Science, vol 8347. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-53917-6_38

DOI: https://doi.org/10.1007/978-3-642-53917-6_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-53916-9
Online ISBN: 978-3-642-53917-6
eBook Packages: Computer Science, Computer Science (R0)
