research-article

Hybrid query expansion model for text and microblog information retrieval

Published: 01 August 2018

Abstract

Query expansion (QE) is an important process in information retrieval applications: it improves the user query and helps retrieve relevant results. In this paper, we introduce a hybrid query expansion model (HQE) that investigates how external resources can be combined with association rule mining to enhance the generation and selection of expansion terms. The HQE model can be run in different configurations, starting from methods based on association rules and then combining them with external knowledge. The model handles the two main phases of a QE process: candidate term generation and candidate term selection. For the first phase, we propose statistical, semantic and conceptual methods to generate new terms related to a given query. For the second phase, we introduce a similarity measure, ESAC, based on Explicit Semantic Analysis, that computes the relatedness between a query and the set of candidate terms. The performance of the proposed HQE model is evaluated in two experimental validations. The first addresses the tweet search task proposed by the TREC Microblog Track 2011 and an ad-hoc IR task on the hard topics of TREC Robust 2004. The second concerns the tweet contextualization task organized by INEX 2014. Overall, the results highlight the effectiveness of our HQE model and of association rule mining for QE combined with external resources.
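The two phases named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy corpus `DOCS`, the concept vectors `CONCEPTS`, the thresholds, and all function names are hypothetical. The abstract does not define ESAC, so plain cosine similarity over hand-made concept vectors stands in for the ESA-based relatedness measure.

```python
import math
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus: each document is a set of terms.
DOCS = [
    {"query", "expansion", "retrieval"},
    {"query", "expansion", "wikipedia"},
    {"query", "retrieval", "ranking"},
    {"tweet", "search", "retrieval"},
    {"query", "expansion", "retrieval", "wikipedia"},
]

# Hypothetical ESA-like vectors: term -> weights over (Wikipedia-style) concepts.
CONCEPTS = {
    "query": {"Information retrieval": 0.9, "Database": 0.4},
    "expansion": {"Information retrieval": 0.7, "Linguistics": 0.2},
    "retrieval": {"Information retrieval": 1.0},
    "wikipedia": {"Encyclopedia": 1.0, "Information retrieval": 0.1},
}


def mine_rules(docs, min_support=0.4, min_confidence=0.6):
    """Mine single-term rules a -> b; return {a: [(b, confidence), ...]}."""
    n = len(docs)
    term_count = Counter(t for d in docs for t in d)
    pair_count = Counter(
        frozenset(p) for d in docs for p in combinations(sorted(d), 2)
    )
    rules = {}
    for pair, count in pair_count.items():
        if count / n < min_support:
            continue  # the pair is not frequent enough
        a, b = tuple(pair)
        for x, y in ((a, b), (b, a)):
            confidence = count / term_count[x]
            if confidence >= min_confidence:
                rules.setdefault(x, []).append((y, confidence))
    return rules


def candidate_terms(query, rules):
    """Phase 1: collect rule consequents that are not already in the query."""
    return {
        consequent
        for term in query
        for consequent, _ in rules.get(term, [])
        if consequent not in query
    }


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def select_terms(query, candidates, concepts, top_k=1):
    """Phase 2: rank candidates by relatedness to the whole query."""
    query_vec = {}
    for term in query:
        for concept, weight in concepts.get(term, {}).items():
            query_vec[concept] = query_vec.get(concept, 0.0) + weight
    ranked = sorted(
        candidates,
        key=lambda c: cosine(query_vec, concepts.get(c, {})),
        reverse=True,
    )
    return ranked[:top_k]


query = ["query", "expansion"]
rules = mine_rules(DOCS)
candidates = candidate_terms(query, rules)
selected = select_terms(query, candidates, CONCEPTS, top_k=1)
print(sorted(candidates), selected)
```

On this toy data the rules propose two candidates, and the selection step keeps the one whose concept vector is closest to the query. A real setting would replace the hand-made vectors with concept weights derived from an external resource such as Wikipedia.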

      Published In

      Information Retrieval, Volume 21, Issue 4
      Aug 2018
      115 pages

      Publisher

      Kluwer Academic Publishers

      United States

      Publication History

      Published: 01 August 2018
      Accepted: 21 December 2017
      Received: 10 July 2016

      Author Tags

      1. Information retrieval
      2. Query expansion
      3. Tweets search
      4. Explicit Semantic Analysis
      5. Tweet contextualization
      6. Wikipedia
      7. DBpedia
      8. Association rules
      9. Ad-hoc IR task

      Qualifiers

      • Research-article

      Cited By

      • (2024) Query expansion using Haar wavelet transform. Journal of Information Science 50:4 (991-1004). DOI: 10.1177/01655515221111005. Online publication date: 1-Aug-2024
      • (2024) Refining Bridge Engineering-based Construction Scheme Compliance Review with Advanced Large Language Model Integration. Proceedings of the 2024 8th International Conference on Big Data and Internet of Things (297-305). DOI: 10.1145/3697355.3697404. Online publication date: 14-Sep-2024
      • (2024) Evaluation of semantic relations impact in query expansion-based retrieval systems. Knowledge-Based Systems 283:C. DOI: 10.1016/j.knosys.2023.111183. Online publication date: 11-Jan-2024
      • (2023) Multi-modal Medical Data Exploration Based on Data Lake. Health Information Science (213-222). DOI: 10.1007/978-981-99-7108-4_18. Online publication date: 23-Oct-2023
      • (2022) Managing and Retrieving Bilingual Documents Using Artificial Intelligence-Based Ontological Framework. Computational Intelligence and Neuroscience 2022. DOI: 10.1155/2022/4636931. Online publication date: 1-Jan-2022
      • (2022) A reranking-based tweet retrieval approach for planned events. World Wide Web 25:1 (23-47). DOI: 10.1007/s11280-021-00962-8. Online publication date: 1-Jan-2022
      • (2022) A contemporary combined approach for query expansion. Multimedia Tools and Applications 81:24 (35195-35221). DOI: 10.1007/s11042-020-09172-2. Online publication date: 1-Oct-2022
      • (2021) Cluster-based information retrieval using pattern mining. Applied Intelligence 51:4 (1888-1903). DOI: 10.1007/s10489-020-01922-x. Online publication date: 1-Apr-2021
