Stars
Code for Efficient Test-Time Scaling via Self-Calibration
[NeurIPS 2024] SimPO: Simple Preference Optimization with a Reference-Free Reward
Code for Divide, Reweight, and Conquer: A Logit Arithmetic Approach for In-Context Learning
Reading list on hallucination in LLMs. Check out our new survey paper: "Siren’s Song in the AI Ocean: A Survey on Hallucination in Large Language Models"
[ICML 2023] Tuning Language Models as Training Data Generators for Augmentation-Enhanced Few-Shot Learning
[ACL 2023] Reasoning with Language Model Prompting: A Survey
[EMNLP 2021] Distantly-Supervised Named Entity Recognition with Noise-Robust Learning and Language Model Augmented Self-Training
Hierarchical Metadata-Aware Document Categorization under Weak Supervision (WSDM'21)
[KDD 2020] Hierarchical Topic Mining via Joint Spherical Tree and Text Embedding
[EMNLP 2020] Text Classification Using Label Names Only: A Language Model Self-Training Approach
Minimally Supervised Categorization of Text with Metadata (SIGIR'20)
[WWW 2020] Discriminative Topic Mining via Category-Name Guided Text Embedding
The source code, dataset, and evaluation scripts used for SetRank, published in SIGIR 2018
The source code for the SetExpan framework, published in ECML-PKDD 2017
PyTorch implementation of the paper "Mining Entity Synonyms with Efficient Neural Set Generation" in AAAI 2019
The source code used for the automatic taxonomy construction method HiExpan, published in KDD 2018
[NeurIPS 2019] Spherical Text Embedding
[CIKM 2018] Weakly-Supervised Neural Text Classification