Learning to Rank Microblog Posts for Real-Time Ad-Hoc Search

Jing Li, Zhongyu Wei, Hao Wei, Kangfei Zhao, Junwen Chen & Kam-Fai Wong

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9362)

Included in the following conference series:

CCF International Conference on Natural Language Processing and Chinese Computing

Abstract

Microblogging websites have moved to the center of information production and diffusion, and people obtain useful information from other users’ microblog posts. In the era of Big Data, we are overwhelmed by the sheer volume of microblog posts. To make good use of these informative data, an effective search tool specialized for microblog posts is required. However, microblog search is not trivial, for two reasons: 1) microblog posts are noisy and time-sensitive, which renders general information retrieval models ineffective; 2) conventional IR models are not designed to consider microblog-specific features. In this paper, we propose to use learning to rank models for microblog search. We combine content-based, microblog-specific, and temporal features in learning to rank models, which are found to model microblog posts effectively. To study the performance of learning to rank models, we evaluate them on the tweet data sets provided by the TREC 2011 and TREC 2012 Microblog tracks, comparing against three state-of-the-art information retrieval baselines: the vector space model, the language model, and the BM25 model. Extensive experiments demonstrate the effectiveness of learning to rank models and the usefulness of integrating microblog-specific and temporal information for the microblog search task.
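To make the idea of combining the three feature groups concrete, the following is a minimal sketch in Python, not the implementation used in the paper. It assumes a BM25 score as the only content-based feature, three illustrative microblog-specific features (URL presence, hashtag count, retweet marker), and an exponential recency decay as the temporal feature. The function names, the feature choices, the 24-hour decay constant, and the hand-set weights are all hypothetical; in a learning to rank setting the weights would be learned from relevance judgments rather than fixed by hand.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25(query_terms, doc_terms, doc_freq, n_docs, avg_len, k1=1.2, b=0.75):
    """Okapi BM25 score of one document for one query."""
    tf = Counter(doc_terms)
    score = 0.0
    for term in set(query_terms):
        if tf[term] == 0:
            continue
        df = doc_freq.get(term, 0)
        idf = math.log(1.0 + (n_docs - df + 0.5) / (df + 0.5))
        length_norm = 1 - b + b * len(doc_terms) / avg_len
        score += idf * tf[term] * (k1 + 1) / (tf[term] + k1 * length_norm)
    return score


def extract_features(query, post, query_time, doc_freq, n_docs, avg_len):
    """Build one feature vector (all feature choices are illustrative)."""
    text = post["text"]
    age_hours = (query_time - post["time"]) / 3600.0
    return [
        bm25(tokenize(query), tokenize(text), doc_freq, n_docs, avg_len),  # content-based
        1.0 if "http" in text else 0.0,          # microblog-specific: contains a URL
        float(text.count("#")),                  # microblog-specific: hashtag count
        1.0 if text.startswith("RT ") else 0.0,  # microblog-specific: retweet marker
        math.exp(-age_hours / 24.0),             # temporal: recency decay (assumed 24 h scale)
    ]


def rank(query, posts, query_time, weights):
    """Rank posts published up to query_time by a linear score over the features."""
    # Real-time setting: posts newer than the query are not visible.
    visible = [p for p in posts if p["time"] <= query_time]
    docs = [tokenize(p["text"]) for p in visible]
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / max(n_docs, 1)
    doc_freq = Counter(t for d in docs for t in set(d))
    scored = []
    for post in visible:
        feats = extract_features(query, post, query_time, doc_freq, n_docs, avg_len)
        scored.append((sum(w * x for w, x in zip(weights, feats)), post["id"]))
    scored.sort(reverse=True)
    return [post_id for _, post_id in scored]


posts = [
    {"id": "a", "text": "Earthquake hits city http://example.com #news", "time": 1000},
    {"id": "b", "text": "RT my lunch today was great", "time": 2000},
    {"id": "c", "text": "earthquake update from the future", "time": 999999},
]
weights = [1.0, 0.2, 0.0, -0.3, 0.5]  # hand-set placeholders, not learned values
ranking = rank("earthquake", posts, query_time=3600, weights=weights)
```

In this toy run, post `c` is excluded because it is newer than the query time, and post `a` ranks above post `b` because it matches the query term. A linear scorer is used only to keep the sketch short; a trained ranker would replace the fixed weight vector.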

This work is partially supported by General Research Fund of Hong Kong (417112), RGC Direct Grant (417613), and Huawei Noah’s Ark Lab, Hong Kong. We would like to thank Junjie Hu, Prof. Michael R. Lyu and anonymous reviewers for the useful comments. This work was done when Zhongyu Wei and Junwen Chen were at The Chinese University of Hong Kong.

Author information

Authors and Affiliations

The Chinese University of Hong Kong, Shatin, N.T., Hong Kong
Jing Li, Hao Wei, Kangfei Zhao & Kam-Fai Wong
MoE Key Laboratory of High Confidence Software Technologies, Beijing, China
Jing Li & Kam-Fai Wong
The University of Texas at Dallas, Richardson, TX, USA
Zhongyu Wei
Tencent, Nanshan District, Shenzhen, China
Junwen Chen

Corresponding author

Correspondence to Jing Li.

Editor information

Editors and Affiliations

Tsinghua University, Beijing, China
Juanzi Li
Rensselaer Polytechnic Institute, Troy, NY, USA
Heng Ji
Peking University, Beijing, China
Dongyan Zhao
Peking University, Beijing, China
Yansong Feng

About this paper

Cite this paper

Li, J., Wei, Z., Wei, H., Zhao, K., Chen, J., Wong, KF. (2015). Learning to Rank Microblog Posts for Real-Time Ad-Hoc Search. In: Li, J., Ji, H., Zhao, D., Feng, Y. (eds) Natural Language Processing and Chinese Computing. NLPCC 2015. Lecture Notes in Computer Science, vol 9362. Springer, Cham. https://doi.org/10.1007/978-3-319-25207-0_40

DOI: https://doi.org/10.1007/978-3-319-25207-0_40
Published: 20 October 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25206-3
Online ISBN: 978-3-319-25207-0
eBook Packages: Computer Science, Computer Science (R0)
