Starred repositories
2020 Language and Intelligence Challenge: Relation Extraction Task
SIGIR-2022 Webformer: Pre-training with Web Pages for Information Retrieval
A programmer's guide to live longer
SIGIR'22 paper: Axiomatically Regularized Pre-training for Ad hoc Search
Source code for paper Improving Session Search by Modeling Multi-Granularity Historical Query Change
A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning …
Source code of SIGIR2021 Paper 'One Chatbot Per Person: Creating Personalized Chatbots based on Implicit Profiles'
Released code for the ACL 2021 paper 'BoB: BERT Over BERT for Training Persona-based Dialogue Models from Limited Personalized Data'
NLPIR tutorial on pre-training for IR: pre-train on a raw text corpus, then fine-tune on MS MARCO Document Ranking
Official implementation of our ICLR 2018 and SIGIR 2019 papers on Context-aware Neural Information Retrieval
CIKM 2021: Contrastive Learning of User Behavior Sequence for Context-Aware Document Ranking
Dataset and code for the paper "Product-oriented Machine Translation with Cross-modal Cross-lingual Pre-training".
Source code of CIKM2021 Paper 'Pre-training for Ad-hoc Retrieval: Hyperlink is Also You Need'
A comprehensive mapping database of English to Chinese technical vocabulary in the artificial intelligence domain
PyTorch Implementation for AAAI'21 "Do Response Selection Models Really Know What's Next? Utterance Manipulation Strategies for Multi-turn Response Selection"
DeepCT and HDCT use BERT to generate novel, context-aware bag-of-words term weights for documents and queries.
Code for CEDR: Contextualized Embeddings for Document Ranking, accepted at SIGIR 2019.