Applied Filters
- Author: Yu Meng
People
Colleagues
- Yu Meng (30)
- J. Han (27)
- Jiaxin Huang (17)
- Yu Zhang (15)
- Yunyi Zhang (7)
- Chao Zhang (4)
- Jiaming Shen (4)
- Xiusi Chen (3)
- Chao Zhang (2)
- Guangyuan Wang (2)
- Jingbo Shang (2)
- Martin Michalski (2)
- Yiqing Xie (2)
- Dacheng Tao (1)
- Donald Metzler (1)
- George Karypis (1)
- Hady Wirawan Lauw (1)
- Josiah Poon (1)
- Julia Stoyanovich (1)
- Lance Michael Kaplan (1)
Publication
Proceedings/Book Names
- KDD '20: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining (3)
- KDD '21: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining (2)
- KDD '22: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (2)
- KDD '23: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (2)
- WSDM '23: Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining (2)
- WWW '20: Proceedings of The Web Conference 2020 (2)
- WWW '23: Proceedings of the ACM Web Conference 2023 (2)
- AAAI'19/IAAI'19/EAAI'19: Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence (1)
- CIKM '18: Proceedings of the 27th ACM International Conference on Information and Knowledge Management (1)
- ICML'23: Proceedings of the 40th International Conference on Machine Learning (1)
- NIPS '21: Proceedings of the 35th International Conference on Neural Information Processing Systems (1)
- NIPS '22: Proceedings of the 36th International Conference on Neural Information Processing Systems (1)
- NIPS'19: Proceedings of the 33rd International Conference on Neural Information Processing Systems (1)
- SIGIR '20: Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval (1)
- WSC '21: Proceedings of the Winter Simulation Conference (1)
- WSDM '20: Proceedings of the 13th International Conference on Web Search and Data Mining (1)
- WSDM '21: Proceedings of the 14th ACM International Conference on Web Search and Data Mining (1)
- WSDM '22: Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining (1)
- WWW '22: Proceedings of the ACM Web Conference 2022 (1)
- WWW '23 Companion: Companion Proceedings of the ACM Web Conference 2023 (1)
Publications
- research-article
COCO-LM: correcting and contrasting text sequences for language model pretraining
- Yu Meng
University of Illinois at Urbana-Champaign
, - Chenyan Xiong
Microsoft
, - Payal Bajaj
Microsoft
, - Saurabh Tiwary
Microsoft
, - Paul Bennett
Microsoft
, - Jiawei Han
University of Illinois at Urbana-Champaign
, - Xia Song
Microsoft
NIPS '21: Proceedings of the 35th International Conference on Neural Information Processing Systems • December 2021, Article No.: 1769, pp 23102-23114
We present a self-supervised learning framework, COCO-LM, that pretrains Language Models by COrrecting and COntrasting corrupted text sequences. Following ELECTRA-style pretraining, COCO-LM employs an auxiliary language model to corrupt text sequences, ...
- Total Citations: 0
- Supplementary Material: 3540261.3542030_supp.pdf
- research-article
Large language model as attributed training data generator: a tale of diversity and bias
- Yue Yu
Georgia Tech
, - Yuchen Zhuang
Georgia Tech
, - Jieyu Zhang
University of Washington
, - Yu Meng
UIUC
, - Alexander Ratner
University of Washington
, - Ranjay Krishna
University of Washington
, - Jiaming Shen
Google Research
, - Chao Zhang
Georgia Tech
NIPS '23: Proceedings of the 37th International Conference on Neural Information Processing Systems • December 2023, Article No.: 2433, pp 55734-55784
Large language models (LLMs) have been recently leveraged as training data generators for various natural language processing (NLP) tasks. While previous research has explored different approaches to training models using generated data, they generally ...
- Total Citations: 0
- research-article
Generating training data with language models: towards zero-shot language understanding
- Yu Meng
Department of Computer Science, University of Illinois at Urbana-Champaign
, - Jiaxin Huang
Department of Computer Science, University of Illinois at Urbana-Champaign
, - Yu Zhang
Department of Computer Science, University of Illinois at Urbana-Champaign
, - Jiawei Han
Department of Computer Science, University of Illinois at Urbana-Champaign
NIPS '22: Proceedings of the 36th International Conference on Neural Information Processing Systems • November 2022, Article No.: 34, pp 462-477
Pretrained language models (PLMs) have demonstrated remarkable performance in various natural language processing tasks: Unidirectional PLMs (e.g., GPT) are well known for their superior text generation capabilities; bidirectional PLMs (e.g., BERT) have ...
- Total Citations: 0
- Supplementary Material: 3600270.3600304_supp.pdf
- abstract • Public Access • Published By ACM
Pretrained Language Representations for Text Understanding: A Weakly-Supervised Perspective
- Yu Meng
UIUC, Urbana, USA
, - Jiaxin Huang
UIUC, Urbana, USA
, - Yu Zhang
UIUC, Urbana, USA
, - Yunyi Zhang
UIUC, Urbana, USA
, - Jiawei Han
UIUC, Urbana, USA
KDD '23: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining • August 2023, pp 5817-5818 • https://doi.org/10.1145/3580305.3599569
Language representations pretrained on general-domain corpora and adapted to downstream task data have achieved enormous success in building natural language understanding (NLU) systems. While the standard supervised fine-tuning of pretrained language ...
- Total Citations: 0
- Total Downloads: 505 (last 12 months: 231, last 6 weeks: 19)
- research-article • Public Access • Published By ACM
Weakly Supervised Multi-Label Classification of Full-Text Scientific Papers
- Yu Zhang
University of Illinois at Urbana-Champaign, Urbana, IL, USA
, - Bowen Jin
University of Illinois at Urbana-Champaign, Urbana, IL, USA
, - Xiusi Chen
University of California, Los Angeles, Los Angeles, CA, USA
, - Yanzhen Shen
University of Illinois at Urbana-Champaign, Urbana, IL, USA
, - Yunyi Zhang
University of Illinois at Urbana-Champaign, Urbana, IL, USA
, - Yu Meng
University of Illinois at Urbana-Champaign, Urbana, IL, USA
, - Jiawei Han
University of Illinois at Urbana-Champaign, Urbana, IL, USA
KDD '23: Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining • August 2023, pp 3458-3469 • https://doi.org/10.1145/3580305.3599544
Instead of relying on human-annotated training samples to build a classifier, weakly supervised scientific paper classification aims to classify papers only using category descriptions (e.g., category names, category-indicative keywords). Existing ...
- Total Citations: 1
- Total Downloads: 621 (last 12 months: 491, last 6 weeks: 57)
- Supplementary Material: rtfp0914-2min-promo.mp4
- research-article
Tuning language models as training data generators for augmentation-enhanced few-shot learning
- Yu Meng
University of Illinois Urbana-Champaign
, - Martin Michalski
University of Illinois Urbana-Champaign
, - Jiaxin Huang
University of Illinois Urbana-Champaign
, - Yu Zhang
University of Illinois Urbana-Champaign
, - Tarek Abdelzaher
, - Jiawei Han
ICML'23: Proceedings of the 40th International Conference on Machine Learning • July 2023, Article No.: 1018, pp 24457-24477
Recent studies have revealed the intriguing few-shot learning ability of pretrained language models (PLMs): They can quickly adapt to a new task when fine-tuned on a small amount of labeled data formulated as prompts, without requiring abundant task-...
- Total Citations: 0
- tutorial • Open Access • Published By ACM
Tutorials at The Web Conference 2023
- Valeria Fionda
University of Calabria, Italy
, - Olaf Hartig
Linköping University, Sweden
, - Reyhaneh Abdolazimi
Syracuse University, USA
, - Sihem Amer-Yahia
CNRS, Univ. Grenoble Alpes, France
, - Hongzhi Chen
AWS Shanghai AI Lab, China
, - Xiao Chen
The Hong Kong Polytechnic University, Hong Kong
, - Peng Cui
Tsinghua University, China
, - Jeffrey Dalton
University of Glasgow, United Kingdom
, - Xin Luna Dong
Meta Reality Labs, USA
, - Lisette Espin-Noboa
Central European University & Complexity Science Hub Vienna, Austria
, - Wenqi Fan
The Hong Kong Polytechnic University, Hong Kong
, - Manuela Fritz
University of Passau, Germany
, - Quan Gan
AWS Shanghai AI Lab, China
, - Jingtong Gao
City University of Hong Kong, Hong Kong
, - Xiaojie Guo
IBM T. J. Watson Research Center, USA
, - Torsten Hahmann
University of Maine, USA
, - Jiawei Han
University of Illinois at Urbana-Champaign, USA
, - Soyeon Han
The University of Sydney, Australia
, - Estevam Hruschka
Megagon Labs, USA
, - Liang Hu
Tongji University, China
, - Jiaxin Huang
University of Illinois at Urbana-Champaign, USA
, - Utkarshani Jaimini
University of South Carolina, USA
, - Olivier Jeunen
ShareChat, United Kingdom
, - Yushan Jiang
University of Connecticut, USA
, - Fariba Karimi
Vienna University of Technology & Complexity Science Hub Vienna, Austria
, - George Karypis
AWS AI Research and Education, USA
, - Krishnaram Kenthapadi
Fiddler AI, USA
, - Himabindu Lakkaraju
Harvard University, USA
, - Hady W. Lauw
Singapore Management University, Singapore
, - Thai Le
The University of Mississippi, USA
, - Trung-Hoang Le
Singapore Management University, Singapore
, - Dongwon Lee
The Pennsylvania State University, USA
, - Geon Lee
KAIST, Republic of Korea
, - Liat Levontin
Technion, Israel
, - Cheng-Te Li
National Cheng Kung University, Taiwan
, - Haoyang Li
Tsinghua University, China
, - Ying Li
Netflix, USA
, - Jay Chiehen Liao
National Cheng Kung University, Taiwan
, - Qidong Liu
City University of Hong Kong, Hong Kong
, - Usha Lokala
University of South Carolina, USA
, - Ben London
Amazon, USA
, - Siqu Long
The University of Sydney, Australia
, - Hande Kücük Mcginty
Kansas State University, USA
, - Yu Meng
University of Illinois at Urbana-Champaign, USA
, - Seungwhan Moon
Meta Reality Labs, USA
, - Usman Naseem
The University of Sydney, Australia
, - Pradeep Natarajan
Amazon Alexa AI, USA
, - Behrooz Omidvar-Tehrani
AWS AI Labs, USA
, - Zijie Pan
University of Connecticut, USA
, - Devesh Parekh
Netflix, USA
, - Jian Pei
Duke University, USA
, - Tiago Peixoto
Central European University, Austria
, - Steven Pemberton
CWI, Netherlands
, - Josiah Poon
The University of Sydney, Australia
, - Filip Radlinski
Google, United Kingdom
, - Federico Rossetto
University of Glasgow, United Kingdom
, - Kaushik Roy
University of South Carolina, USA
, - Aghiles Salah
Rakuten Group, Inc., France
, - Mehrnoosh Sameki
Microsoft Azure AI, USA
, - Amit Sheth
University of South Carolina, USA
, - Cogan Shimizu
Wright State University, USA
, - Kijung Shin
KAIST, Republic of Korea
, - Dongjin Song
University of Connecticut, USA
, - Julia Stoyanovich
New York University, USA
, - Dacheng Tao
The University of Sydney, Australia
, - Johanne Trippas
RMIT University, Australia
, - Quoc Truong
Amazon, Canada
, - Yu-Che Tsai
National Taiwan University, Taiwan
, - Adaku Uchendu
The Pennsylvania State University, USA
, - Bram Van Den Akker
Booking.com, Netherlands
, - Lin Wang
The Hong Kong Polytechnic University, Hong Kong
, - Minjie Wang
AWS Shanghai AI Lab, China
, - Shoujin Wang
University of Technology Sydney, Australia
, - Xin Wang
Tsinghua University, China
, - Ingmar Weber
Saarland University, Germany
, - Henry Weld
The University of Sydney, Australia
, - Lingfei Wu
Pinterest, USA
, - Da Xu
Walmart Labs, USA
, - Ethan Yifan Xu
Meta Reality Labs, USA
, - Shuyuan Xu
Rutgers University, USA
, - Bo Yang
LinkedIn, USA
, - Ke Yang
UMass Amherst, USA
, - Elad Yom-Tov
Microsoft, Israel
, - Jaemin Yoo
Carnegie Mellon University, USA
, - Zhou Yu
Columbia University, USA
, - Reza Zafarani
Syracuse University, USA
, - Hamed Zamani
University of Massachusetts Amherst, USA
, - Meike Zehlike
Zalando Research, Germany
, - Qi Zhang
University of Technology Sydney, Australia
, - Xikun Zhang
The University of Sydney, Australia
, - Yongfeng Zhang
Rutgers University, USA
, - Yu Zhang
University of Illinois at Urbana-Champaign, USA
, - Zheng Zhang
AWS Shanghai AI Lab, China
, - Liang Zhao
Emory University, USA
, - Xiangyu Zhao
City University of Hong Kong, Hong Kong
, - Wenwu Zhu
Tsinghua University, China
WWW '23 Companion: Companion Proceedings of the ACM Web Conference 2023 • April 2023, pp 648-658 • https://doi.org/10.1145/3543873.3587713
This paper summarizes the content of the 28 tutorials that have been given at The Web Conference 2023.
- Total Citations: 1
- Total Downloads: 1,336 (last 12 months: 623, last 6 weeks: 100)
- research-article • Public Access • Published By ACM
SCStory: Self-supervised and Continual Online Story Discovery
- Susik Yoon
University of Illinois at Urbana-Champaign, USA
, - Yu Meng
University of Illinois at Urbana-Champaign, USA
, - Dongha Lee
Yonsei University, Republic of Korea
, - Jiawei Han
University of Illinois at Urbana-Champaign, USA
WWW '23: Proceedings of the ACM Web Conference 2023 • April 2023, pp 1853-1864 • https://doi.org/10.1145/3543507.3583507
We present a framework SCStory for online story discovery, that helps people digest rapidly published news article streams in real-time without human annotations. To organize news article streams into stories, existing approaches directly encode the ...
- Total Citations: 2
- Total Downloads: 367 (last 12 months: 175, last 6 weeks: 25)
- research-article • Public Access • Published By ACM
The Effect of Metadata on Scientific Literature Tagging: A Cross-Field Cross-Model Study
- Yu Zhang
University of Illinois at Urbana-Champaign, USA
, - Bowen Jin
University of Illinois at Urbana-Champaign, USA
, - Qi Zhu
University of Illinois at Urbana-Champaign, USA
, - Yu Meng
University of Illinois at Urbana-Champaign, USA
, - Jiawei Han
University of Illinois at Urbana-Champaign, USA
WWW '23: Proceedings of the ACM Web Conference 2023 • April 2023, pp 1626-1637 • https://doi.org/10.1145/3543507.3583354
Due to the exponential growth of scientific publications on the Web, there is a pressing need to tag each paper with fine-grained topics so that researchers can track their interested fields of study rather than drowning in the whole literature. ...
- Total Citations: 2
- Total Downloads: 291 (last 12 months: 174, last 6 weeks: 26)
- research-article • Public Access • Published By ACM
Effective Seed-Guided Topic Discovery by Integrating Multiple Types of Contexts
- Yu Zhang
University of Illinois Urbana-Champaign, Urbana, IL, USA
, - Yunyi Zhang
University of Illinois Urbana-Champaign, Urbana, IL, USA
, - Martin Michalski
University of Illinois Urbana-Champaign, Urbana, IL, USA
, - Yucheng Jiang
University of Illinois Urbana-Champaign, Urbana, IL, USA
, - Yu Meng
University of Illinois Urbana-Champaign, Urbana, IL, USA
, - Jiawei Han
University of Illinois Urbana-Champaign, Urbana, IL, USA
WSDM '23: Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining • February 2023, pp 429-437 • https://doi.org/10.1145/3539597.3570475
Instead of mining coherent topics from a given text corpus in a completely unsupervised manner, seed-guided topic discovery methods leverage user-provided seed words to extract distinctive and coherent topics so that the mined topics can better cater to ...
- Total Citations: 2
- Total Downloads: 408 (last 12 months: 202, last 6 weeks: 24)
- Supplementary Material: 49_wsdm2023_zhang_effective_seed_guided_01.mp4-streaming.mp4, WSDM presentation.mp4
- research-article • Public Access • Published By ACM
FineSum: Target-Oriented, Fine-Grained Opinion Summarization
- Suyu Ge
University of Illinois Urbana-Champaign, Urbana, IL, USA
, - Jiaxin Huang
University of Illinois Urbana-Champaign, Urbana, IL, USA
, - Yu Meng
University of Illinois Urbana-Champaign, Urbana, IL, USA
, - Jiawei Han
University of Illinois Urbana-Champaign, Urbana, IL, USA
WSDM '23: Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining • February 2023, pp 1093-1101 • https://doi.org/10.1145/3539597.3570397
Target-oriented opinion summarization is to profile a target by extracting user opinions from multiple related documents. Instead of simply mining opinion ratings on a target (e.g., a restaurant) or on multiple aspects (e.g., food, service) of a target, ...
- Total Citations: 0
- Total Downloads: 352 (last 12 months: 185, last 6 weeks: 61)
- Supplementary Material: video1305193868.mp4
- abstract • Public Access • Published By ACM
Adapting Pretrained Representations for Text Mining
- Yu Meng
University of Illinois Urbana-Champaign, Champaign, IL, USA
, - Jiaxin Huang
University of Illinois Urbana-Champaign, Champaign, IL, USA
, - Yu Zhang
University of Illinois Urbana-Champaign, Champaign, IL, USA
, - Jiawei Han
University of Illinois Urbana-Champaign, Champaign, IL, USA
KDD '22: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining • August 2022, pp 4806-4807 • https://doi.org/10.1145/3534678.3542607
Pretrained text representations, evolving from context-free word embeddings to contextualized language models, have brought text mining into a new era: By pretraining neural models on large-scale text corpora and then adapting them to task-specific data, ...
- Total Citations: 0
- Total Downloads: 371 (last 12 months: 122, last 6 weeks: 11)
- research-article • Public Access • Published By ACM
Few-Shot Fine-Grained Entity Typing with Automatic Label Interpretation and Instance Generation
- Jiaxin Huang
University of Illinois at Urbana-Champaign, Urbana, IL, USA
, - Yu Meng
University of Illinois at Urbana-Champaign, Urbana, IL, USA
, - Jiawei Han
University of Illinois at Urbana-Champaign, Urbana, IL, USA
KDD '22: Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining • August 2022, pp 605-614 • https://doi.org/10.1145/3534678.3539443
We study the problem of few-shot Fine-grained Entity Typing (FET), where only a few annotated entity mentions with contexts are given for each entity type. Recently, prompt-based tuning has demonstrated superior performance to standard fine-tuning in few-...
- Total Citations: 7
- Total Downloads: 696 (last 12 months: 282, last 6 weeks: 29)
- research-article • Public Access • Published By ACM
Topic Discovery via Latent Space Clustering of Pretrained Language Model Representations
- Yu Meng
University of Illinois at Urbana-Champaign, USA
, - Yunyi Zhang
University of Illinois at Urbana-Champaign, USA
, - Jiaxin Huang
University of Illinois at Urbana-Champaign, USA
, - Yu Zhang
University of Illinois at Urbana-Champaign, USA
, - Jiawei Han
University of Illinois at Urbana-Champaign, USA
WWW '22: Proceedings of the ACM Web Conference 2022 • April 2022, pp 3143-3152 • https://doi.org/10.1145/3485447.3512034
Topic models have been the prominent tools for automatic topic discovery from text corpora. Despite their effectiveness, topic models suffer from several limitations including the inability of modeling word ordering information in documents, the ...
- Total Citations: 16
- Total Downloads: 5,277 (last 12 months: 4,118, last 6 weeks: 725)
- research-article
Simulating online social response: a stimulus/response perspective
- Huajie Shao
University of Illinois at Urbana Champaign
, - Tarek Abdelzaher
University of Illinois at Urbana Champaign
, - Jiawei Han
University of Illinois at Urbana Champaign
, - Minhao Jiang
University of Illinois at Urbana Champaign
, - Yuning Mao
University of Illinois at Urbana Champaign
, - Yu Meng
University of Illinois at Urbana Champaign
, - Wenda Qiu
University of Illinois at Urbana Champaign
, - Dachun Sun
University of Illinois at Urbana Champaign
, - Ruijie Wang
University of Illinois at Urbana Champaign
, - Chaoqi Yang
University of Illinois at Urbana Champaign
, - Zhenzhou Yang
University of Illinois at Urbana Champaign
, - Xinyang Zhang
University of Illinois at Urbana Champaign
, - Yu Zhang
University of Illinois at Urbana Champaign
, - Sam Cohen
Rensselaer Polytechnic Institute
, - James Flamino
Rensselaer Polytechnic Institute
, - Gyorgy Korniss
Rensselaer Polytechnic Institute
, - Omar Malik
Rensselaer Polytechnic Institute
, - Aamir Mandviwalla
Rensselaer Polytechnic Institute
, - Boleslaw Szymanski
Rensselaer Polytechnic Institute
, - Lake Yin
Rensselaer Polytechnic Institute
The paper describes a methodology for simulating online social media activities that occur in response to external events. A large number of social media simulators model information diffusion on online social networks. However, information cascades do ...
- Total Citations: 0
- Total Downloads: 11 (last 12 months: 1)
- research-article • Public Access • Published By ACM
MotifClass: Weakly Supervised Text Classification with Higher-order Metadata Information
- Yu Zhang
University of Illinois at Urbana-Champaign, Urbana, IL, USA
, - Shweta Garg
University of Illinois at Urbana-Champaign, Urbana, IL, USA
, - Yu Meng
University of Illinois at Urbana-Champaign, Urbana, IL, USA
, - Xiusi Chen
University of California, Los Angeles, Los Angeles, CA, USA
, - Jiawei Han
University of Illinois at Urbana-Champaign, Urbana, IL, USA
WSDM '22: Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining • February 2022, pp 1357-1367 • https://doi.org/10.1145/3488560.3498384
We study the problem of weakly supervised text classification, which aims to classify text documents into a set of pre-defined categories with category surface names only and without any annotated training document provided. Most existing classifiers ...
- Total Citations: 4
- Total Downloads: 568 (last 12 months: 142, last 6 weeks: 8)
- Supplementary Material: WSDM22-fp128.mp4
- abstract • Public Access • Published By ACM
On the Power of Pre-Trained Text Representations: Models and Applications in Text Mining
- Yu Meng
University of Illinois at Urbana-Champaign, Urbana, IL, USA
, - Jiaxin Huang
University of Illinois at Urbana-Champaign, Urbana, IL, USA
, - Yu Zhang
University of Illinois at Urbana-Champaign, Urbana, IL, USA
, - Jiawei Han
University of Illinois at Urbana-Champaign, Urbana, IL, USA
KDD '21: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining • August 2021, pp 4052-4053 • https://doi.org/10.1145/3447548.3470810
Recent years have witnessed the enormous success of text representation learning in a wide range of text mining tasks. Earlier word embedding learning approaches represent words as fixed low-dimensional vectors to capture their semantics. The word ...
- Total Citations: 0
- Total Downloads: 480 (last 12 months: 86, last 6 weeks: 13)
- research-article • Open Access • Published By ACM
UCPhrase: Unsupervised Context-aware Quality Phrase Tagging
- Xiaotao Gu
University of Illinois at Urbana-Champaign, Champaign, IL, USA
, - Zihan Wang
University of California, San Diego, San Diego, CA, USA
, - Zhenyu Bi
University of California, San Diego, San Diego, CA, USA
, - Yu Meng
University of Illinois at Urbana-Champaign, Champaign, IL, USA
, - Liyuan Liu
University of Illinois at Urbana-Champaign, Champaign, IL, USA
, - Jiawei Han
University of Illinois at Urbana-Champaign, Champaign, IL, USA
, - Jingbo Shang
University of California, San Diego, San Diego, CA, USA
KDD '21: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining • August 2021, pp 478-486 • https://doi.org/10.1145/3447548.3467397
Identifying and understanding quality phrases from context is a fundamental task in text mining. The most challenging part of this task arguably lies in uncommon, emerging, and domain-specific phrases. The infrequent nature of these phrases ...
- Total Citations: 8
- Total Downloads: 840 (last 12 months: 253, last 6 weeks: 43)
- research-article • Public Access • Published By ACM
Hierarchical Metadata-Aware Document Categorization under Weak Supervision
- Yu Zhang
University of Illinois at Urbana-Champaign, Urbana, IL, USA
, - Xiusi Chen
University of California, Los Angeles, Los Angeles, CA, USA
, - Yu Meng
University of Illinois at Urbana-Champaign, Urbana, IL, USA
, - Jiawei Han
University of Illinois at Urbana-Champaign, Urbana, IL, USA
WSDM '21: Proceedings of the 14th ACM International Conference on Web Search and Data Mining • March 2021, pp 770-778 • https://doi.org/10.1145/3437963.3441730
Categorizing documents into a given label hierarchy is intuitively appealing due to the ubiquity of hierarchical topic structures in massive text corpora. Although related studies have achieved satisfying performance in fully supervised hierarchical ...
- Total Citations: 13
- Total Downloads: 614 (last 12 months: 123, last 6 weeks: 22)
- Supplementary Material: WSDM_Yu Zhang_327.mp4
- tutorial • Public Access • Published By ACM
Embedding-Driven Multi-Dimensional Topic Mining and Text Analysis
- Yu Meng
University of Illinois at Urbana-Champaign, Champaign, IL, USA
, - Jiaxin Huang
University of Illinois at Urbana-Champaign, Champaign, IL, USA
, - Jiawei Han
University of Illinois at Urbana-Champaign, Champaign, IL, USA
KDD '20: Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining • August 2020, pp 3573-3574 • https://doi.org/10.1145/3394486.3406483
People nowadays are immersed in a wealth of text data, ranging from news articles, to social media, academic publications, advertisements, and economic reports. A grand challenge of data mining is to develop effective, scalable and weakly-supervised ...
- Total Citations: 1
- Total Downloads: 638 (last 12 months: 100, last 6 weeks: 18)
Author Profile Pages
- Description: The Author Profile Page initially collects all the professional information known about authors from the publications record as known by the ACM bibliographic database, the Guide. Coverage of ACM publications is comprehensive from the 1950's. Coverage of other publishers generally starts in the mid 1980's. The Author Profile Page supplies a quick snapshot of an author's contribution to the field and some rudimentary measures of influence upon it. Over time, the contents of the Author Profile page may expand at the direction of the community.
Please see the following 2007 Turing Award winners' profiles as examples:
- History: Disambiguation of author names is of course required for precise identification of all the works, and only those works, by a unique individual. Of equal importance to ACM, author name normalization is also one critical prerequisite to building accurate citation and download statistics. For the past several years, ACM has worked to normalize author names, expand reference capture, and gather detailed usage statistics, all intended to provide the community with a robust set of publication metrics. The Author Profile Pages reveal the first result of these efforts.
- Normalization: ACM uses normalization algorithms to weigh several types of evidence for merging and splitting names.
These include:
- co-authors: if we have two names and cannot disambiguate them based on name alone, then we see if they have a co-author in common. If so, this weighs toward the two names being the same person.
- affiliations: matching names with the same affiliation weigh toward the two names being the same person.
- publication title: matching names whose works are published in the same journal weigh toward the two names being the same person.
- keywords: matching names whose works address the same subject matter, as determined from title and keywords, weigh toward the two names being the same person.
The more conservative the merging algorithms, the more bits of evidence are required before a merge is made, resulting in greater precision but lower recall of works for a given Author Profile. Many bibliographic records have only author initials. Many names lack affiliations. With very common family names, typical in Asia, more liberal algorithms result in mistaken merges.
Automatic normalization of author names is not exact. Hence it is clear that manual intervention based on human knowledge is required to perfect algorithmic results. ACM is meeting this challenge, continuing to work to improve the automated merges by tweaking the weighting of the evidence in light of experience.
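The evidence weighing described under Normalization can be illustrated with a minimal sketch. This is not ACM's algorithm: the `NameRecord` type, the numeric weights, and the thresholds are all hypothetical, chosen only to show how a higher (more conservative) threshold demands more evidence before a merge.

```python
from dataclasses import dataclass, field

# Hypothetical weights per evidence type; the real weighting is not given in the text.
WEIGHTS = {"coauthor": 2.0, "affiliation": 1.5, "venue": 1.0, "keyword": 0.5}


@dataclass
class NameRecord:
    """One author-name occurrence with the evidence attached to it (illustrative)."""
    name: str
    coauthors: set = field(default_factory=set)
    affiliations: set = field(default_factory=set)
    venues: set = field(default_factory=set)
    keywords: set = field(default_factory=set)


def merge_score(a: NameRecord, b: NameRecord) -> float:
    """Add the weight of every evidence type on which the two records overlap."""
    score = 0.0
    if a.coauthors & b.coauthors:
        score += WEIGHTS["coauthor"]
    if a.affiliations & b.affiliations:
        score += WEIGHTS["affiliation"]
    if a.venues & b.venues:
        score += WEIGHTS["venue"]
    if a.keywords & b.keywords:
        score += WEIGHTS["keyword"]
    return score


def should_merge(a: NameRecord, b: NameRecord, threshold: float) -> bool:
    """Merge only when the accumulated evidence reaches the threshold."""
    return merge_score(a, b) >= threshold


# Two records that share a co-author and a venue but nothing else.
r1 = NameRecord("Y. Meng", coauthors={"Jiawei Han"}, venues={"KDD"})
r2 = NameRecord("Yu Meng", coauthors={"Jiawei Han", "Jiaxin Huang"},
                affiliations={"UIUC"}, venues={"KDD", "WWW"})

conservative = should_merge(r1, r2, threshold=4.0)  # needs more evidence
liberal = should_merge(r1, r2, threshold=2.0)       # merges on less evidence
```

With these made-up numbers the pair scores 3.0, so the liberal setting merges the two names and the conservative one keeps them apart, which is the precision/recall trade-off the text describes.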
- Bibliometrics: In 1926, Alfred Lotka formulated his power law (known as Lotka's Law) describing the frequency of publication by authors in a given field. According to this bibliometric law of scientific productivity, only a very small percentage (~6%) of authors in a field will produce more than 10 articles while the majority (perhaps 60%) will have but a single article published. With ACM's first cut at author name normalization in place, the distribution of our authors with 1, 2, 3..n publications does not match Lotka's Law precisely, but neither is the distribution curve far off. For a definition of ACM's first set of publication statistics, see Bibliometrics
- Future Direction:
The initial release of the Author Edit Screen is open to anyone in the community with an ACM account, but it is limited to personal information. An author's photograph, a Home Page URL, and an email may be added, deleted or edited. Changes are reviewed before they are made available on the live site.
ACM will expand this edit facility to accommodate more types of data and facilitate ease of community participation with appropriate safeguards. In particular, authors or members of the community will be able to indicate works in their profile that do not belong there and merge others that do belong but are currently missing.
A direct search interface for Author Profiles will be built.
An institutional view of works emerging from their faculty and researchers will be provided along with a relevant set of metrics.
It is possible, too, that the Author Profile page may evolve to allow interested authors to upload unpublished professional materials to an area available for search and free educational use, but distinct from the ACM Digital Library proper. It is hard to predict what shape such an area for user-generated content may take, but it carries interesting potential for input from the community.
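The percentages quoted for Lotka's Law in the Bibliometrics note above can be checked with a short calculation. The sketch assumes the classic inverse-square form of the law (the number of authors with n publications is proportional to 1/n^2), which the text does not state explicitly; the function names are illustrative.

```python
import math

# Normalizing constant: the sum of 1/n^2 over all n >= 1 equals pi^2 / 6.
ZETA_2 = math.pi ** 2 / 6


def share_with_exactly(n: int) -> float:
    """Share of authors with exactly n publications under inverse-square Lotka."""
    return (1.0 / n ** 2) / ZETA_2


def share_with_more_than(n: int) -> float:
    """Share of authors with more than n publications."""
    return 1.0 - sum(share_with_exactly(k) for k in range(1, n + 1))


single = share_with_exactly(1)       # authors with one article
prolific = share_with_more_than(10)  # authors with more than 10 articles
print(f"one article: {single:.1%}, more than 10: {prolific:.1%}")
```

Under that assumption the shares come out at roughly 61% and 6%, in line with the "perhaps 60%" and "~6%" figures given in the text.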
Bibliometrics
The ACM DL is a comprehensive repository of publications from the entire field of computing.
It is ACM's intention to make the derivation of any publication statistics it generates clear to the user.
- Average citations per article = The total Citation Count divided by the total Publication Count.
- Citation Count = cumulative total number of times all authored works by this author were cited by other works within ACM's bibliographic database. Almost all reference lists in articles published by ACM have been captured. Reference lists from other publishers are less well-represented in the database. Unresolved references are not included in the Citation Count. The Citation Count is citations TO any type of work, but the references counted are only FROM journal and proceedings articles. Reference lists from books, dissertations, and technical reports have not generally been captured in the database. (Citation Counts for individual works are displayed with the individual record listed on the Author Page.)
- Publication Count = all works of any genre within the universe of ACM's bibliographic database of computing literature of which this person was an author. Works where the person has a role as editor, advisor, chair, etc. are listed on the page but are not part of the Publication Count.
- Publication Years = the span from the earliest year of publication on a work by this author to the most recent year of publication of a work by this author captured within the ACM bibliographic database of computing literature (The ACM Guide to Computing Literature, also known as "the Guide").
- Available for download = the total number of works by this author whose full texts may be downloaded from an ACM full-text article server. Downloads from external full-text sources linked to from within the ACM bibliographic space are not counted as 'available for download'.
- Average downloads per article = The total number of cumulative downloads divided by the number of articles (including multimedia objects) available for download from ACM's servers.
- Downloads (cumulative) = The cumulative number of times all works by this author have been downloaded from an ACM full-text article server since the downloads were first counted in May 2003. The counts displayed are updated monthly and are therefore 0-31 days behind the current date. Robotic activity is scrubbed from the download statistics.
- Downloads (12 months) = The cumulative number of times all works by this author have been downloaded from an ACM full-text article server over the last 12-month period for which statistics are available. The counts displayed are usually 1-2 weeks behind the current date. (12-month download counts for individual works are displayed with the individual record.)
- Downloads (6 weeks) = The cumulative number of times all works by this author have been downloaded from an ACM full-text article server over the last 6-week period for which statistics are available. The counts displayed are usually 1-2 weeks behind the current date. (6-week download counts for individual works are displayed with the individual record.)
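The definitions above can be read as simple aggregate computations. The following is a minimal sketch, not ACM's implementation: the `Work` record, its field names, and the sample values are hypothetical and chosen only to illustrate how the counts and averages relate to one another.

```python
from dataclasses import dataclass


@dataclass
class Work:
    """Hypothetical record for one work listed on an author page."""
    year: int
    role: str           # "author", "editor", "advisor", "chair", ...
    citations: int      # resolved citations from journal/proceedings articles
    downloads: int      # cumulative downloads from an ACM full-text server
    downloadable: bool  # full text is served by ACM


def bibliometrics(works):
    # Only works where the person is an author enter the Publication Count.
    authored = [w for w in works if w.role == "author"]
    available = [w for w in authored if w.downloadable]
    publication_count = len(authored)
    citation_count = sum(w.citations for w in authored)
    downloads = sum(w.downloads for w in available)
    years = [w.year for w in authored]
    return {
        "publication_count": publication_count,
        "citation_count": citation_count,
        # Average citations per article = Citation Count / Publication Count
        "avg_citations_per_article":
            citation_count / publication_count if publication_count else 0.0,
        "publication_years": (min(years), max(years)) if years else None,
        "available_for_download": len(available),
        "downloads_cumulative": downloads,
        # Average downloads per article = downloads / downloadable articles
        "avg_downloads_per_article":
            downloads / len(available) if available else 0.0,
    }


# Made-up sample data for illustration only.
sample = [
    Work(2018, "author", 10, 200, True),
    Work(2020, "author", 5, 100, True),
    Work(2021, "editor", 0, 0, False),   # listed on the page, not counted
    Work(2023, "author", 3, 0, False),   # counted, but not downloadable
]
stats = bibliometrics(sample)
```

Note that the two averages use different denominators: citations are averaged over all authored works, while downloads are averaged only over works available for download from ACM's servers.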
ACM Author-Izer Service
Summary Description
ACM Author-Izer is a unique service that enables ACM authors to generate and post links on both their homepage and institutional repository for visitors to download the definitive version of their articles from the ACM Digital Library at no charge.
Downloads from these sites are captured in official ACM statistics, improving the accuracy of usage and impact measurements. Consistently linking to the definitive version of ACM articles should reduce user confusion over article versioning.
ACM Author-Izer also extends ACM’s reputation as an innovative “Green Path” publisher, making ACM one of the first publishers of scholarly works to offer this model to its authors.
To access ACM Author-Izer, authors need to establish a free ACM web account. Should authors change institutions or sites, they can utilize the new ACM service to disable old links and re-authorize new links for free downloads from a different site.
How ACM Author-Izer Works
Authors may post ACM Author-Izer links in their own bibliographies maintained on their website and in their own institution’s repository. The links take visitors from your page directly to the definitive version of individual articles inside the ACM Digital Library, where they can download these articles for free.
The Service can be applied to all the articles you have ever published with ACM.
Depending on your previous activities within the ACM DL, you may need to take up to three steps to use ACM Author-Izer.
For authors who do not have a free ACM Web Account:
- Go to the ACM DL http://dl.acm.org/ and click SIGN UP. Once your account is established, proceed to the next step.
For authors who have an ACM web account, but have not edited their ACM Author Profile page:
- Sign in to your ACM web account and go to your Author Profile page. Click "Add personal information" and add a photograph, homepage address, etc. Click ADD AUTHOR INFORMATION to submit the change. Once you receive email notification that your changes were accepted, you may use ACM Author-Izer.
For authors who have an account and have already edited their Profile Page:
- Sign in to your ACM web account, go to your Author Profile page in the Digital Library, look for the ACM Author-Izer link below each ACM published article, and begin the authorization process. If you have published many ACM articles, you may find the batch authorization process useful. It is labeled: "Export as: ACM Author-Izer Service"
ACM Author-Izer also provides code snippets for authors to display download and citation statistics for each “authorized” article on their personal pages. Downloads from these pages are captured in official ACM statistics, improving the accuracy of usage and impact measurements. Consistently linking to the definitive version of ACM articles should reduce user confusion over article versioning.
Note: You still retain the right to post your author-prepared preprint versions on your home pages and in your institutional repositories with DOI pointers to the definitive version permanently maintained in the ACM Digital Library. However, downloads of your preprint versions will not be counted in ACM usage statistics. If you use ACM Author-Izer links instead, usage by visitors to your page will be recorded in the ACM Digital Library and displayed on your page.
FAQ
- Q. What is ACM Author-Izer?
- A. ACM Author-Izer is a unique, link-based, self-archiving service that enables ACM authors to generate and post links on either their home page or institutional repository for visitors to download the definitive version of their articles for free.
- Q. What articles are eligible for ACM Author-Izer?
- A. ACM Author-Izer can be applied to all the articles authors have ever published with ACM. It is also available to authors who will have articles published in ACM publications in the future.
- Q. Are there any restrictions on authors to use this service?
- A. No. An author does not need to subscribe to the ACM Digital Library nor even be a member of ACM.
- Q. What are the requirements to use this service?
- A. To access ACM Author-Izer, authors need to have a free ACM web account, must have an ACM Author Profile page in the Digital Library, and must take ownership of their Author Profile page.
- Q. What is an ACM Author Profile Page?
- A. The Author Profile Page initially collects all the professional information known about authors from the publications record as known by the ACM Digital Library. The Author Profile Page supplies a quick snapshot of an author's contribution to the field and some rudimentary measures of influence upon it. Over time, the contents of the Author Profile page may expand at the direction of the community. Please visit the ACM Author Profile documentation page for more background information on these pages.
- Q. How do I find my Author Profile page and take ownership?
- A. You will need to take the following steps:
- Create a free ACM Web Account
- Sign in to the ACM Digital Library
- Find your Author Profile Page by searching the ACM Digital Library for your name
- Find the result you authored (where your author name is a clickable link)
- Click on your name to go to the Author Profile Page
- Click the "Add Personal Information" link on the Author Profile Page
- Wait for ACM review and approval; generally less than 24 hours
- Q. Why does my photo not appear?
- A. Make sure that the image you submit is in .jpg or .gif format and that the file name does not contain special characters.
- Q. What if I cannot find the Add Personal Information function on my author page?
- A. The ACM account linked to your profile page is different from the one you are logged in to. Please log out and log in to the account associated with your Author Profile Page.
- Q. What happens if an author changes the location of his bibliography or moves to a new institution?
- A. Should authors change institutions or sites, they can utilize ACM Author-Izer to disable old links and re-authorize new links for free downloads from a new location.
- Q. What happens if an author provides a URL that redirects to the author’s personal bibliography page?
- A. The service will not provide a free download from the ACM Digital Library. Instead the person who uses that link will simply go to the Citation Page for that article in the ACM Digital Library where the article may be accessed under the usual subscription rules.
However, if the author provides the target page URL, any link that redirects to that target page will enable a free download from the Service.
- Q. What happens if the author’s bibliography lives on a page with several aliases?
- A. Only one alias will work, whichever one is registered as the page containing the author’s bibliography. ACM has no technical solution to this problem at this time.
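The redirect and alias answers above can be summarized as one rule: a free download is granted only when the page the visitor actually lands on is exactly the registered bibliography page. The sketch below models that rule under stated assumptions. It is not ACM's implementation; the function names, the `redirects` mapping, and the example.edu URLs are all hypothetical.

```python
def resolve(url, redirects):
    """Follow a hypothetical redirect table until a final page is reached."""
    seen = set()
    while url in redirects and url not in seen:
        seen.add(url)
        url = redirects[url]
    return url


def outcome(visited_url, registered_url, redirects):
    """Model the described behavior: exact match on the landing page only."""
    landing = resolve(visited_url, redirects)
    if landing == registered_url:
        return "free download"
    # Otherwise the visitor reaches the Citation Page, where the usual
    # subscription rules apply.
    return "citation page"


# Hypothetical URLs for illustration only.
short = "https://example.edu/~author"
target = "https://example.edu/people/author/pubs.html"
alias = "https://www.example.edu/people/author/pubs.html"
redirects = {short: target}

# Registering the redirecting URL does not work ...
case_redirect_registered = outcome(short, short, redirects)
# ... but registering the target page does, even when reached via a redirect.
case_target_registered = outcome(short, target, redirects)
# An alias of the registered page is treated as a different page.
case_alias = outcome(alias, target, redirects)
```

Under this model, the practical advice is to register the final URL at which the bibliography page is served, rather than a shortcut or an alias.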
- Q. Why should authors use ACM Author-Izer?
- A. ACM Author-Izer lets visitors to authors’ personal home pages download articles for no charge from the ACM Digital Library. It allows authors to dynamically display real-time download and citation statistics for each “authorized” article on their personal site.
- Q. Does ACM Author-Izer provide benefits for authors?
- A. Downloads of definitive articles via Author-Izer links on the authors’ personal web page are captured in official ACM statistics to more accurately reflect usage and impact measurements.
Authors who do not use ACM Author-Izer links will not have downloads from their local, personal bibliographies counted. They do, however, retain the existing right to post author-prepared preprint versions on their home pages or institutional repositories with DOI pointers to the definitive version permanently maintained in the ACM Digital Library.
- Q. How does ACM Author-Izer benefit the computing community?
- A. ACM Author-Izer expands the visibility and dissemination of the definitive version of ACM articles. It is based on ACM’s strong belief that the computing community should have the widest possible access to the definitive versions of scholarly literature. Linking authors’ personal bibliographies with the ACM Digital Library should reduce user confusion over article versioning over time.
In making ACM Author-Izer a free service to both authors and visitors to their websites, ACM is emphasizing its continuing commitment to the interests of its authors and to the computing community in ways that are consistent with its existing subscription-based access model.
- Q. Why can’t I find my most recent publication in my ACM Author Profile Page?
- A. There is a time delay between publication and the process which associates that publication with an Author Profile Page. Right now, that process usually takes 4-8 weeks.
- Q. How does ACM Author-Izer expand ACM’s “Green Path” Access Policies?
- A. ACM Author-Izer extends the rights and permissions that authors retain even after copyright transfer to ACM, which has been among the “greenest” publishers. ACM enables its author community to retain a wide range of rights related to copyright and reuse of materials. They include:
- Posting rights that ensure free access to their work outside the ACM Digital Library and print publications
- Rights to reuse any portion of their work in new works that they may create
- Copyright to artistic images in ACM’s graphics-oriented publications that authors may want to exploit in commercial contexts
- All patent rights, which remain with the original owner