- research-article, October 2024
HeckmanCD: Exploiting Selection Bias in Cognitive Diagnosis
CIKM '24: Proceedings of the 33rd ACM International Conference on Information and Knowledge Management, Pages 768–777, https://doi.org/10.1145/3627673.3679648
Cognitive diagnosis, a fundamental task in education assessments, aims to quantify the students' proficiency level based on the historical test logs. However, the interactions between students and exercises are incomplete and even sparse, which means ...
- research-article, October 2024
Research on the Acquisition and Redemption of E-coupons: A Sample Selection Bias Perspective
IMMS '24: Proceedings of the 2024 7th International Conference on Information Management and Management Science, Pages 67–72, https://doi.org/10.1145/3695652.3695711
Research on consumers' acquisition and redemption behaviors of e-coupons can assist e-commerce enterprises in accurately placing e-coupons and improving marketing efficiency. This paper presents theoretical models for consumers' behavior of acquiring and ...
- research-article, November 2023, Best Paper
Unbiased Top-$k$ Learning to Rank with Causal Likelihood Decomposition
SIGIR-AP '23: Proceedings of the Annual International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region, Pages 129–138, https://doi.org/10.1145/3624918.3625340
Unbiased learning to rank methods have been proposed to address biases in search ranking. These biases, known as position bias and sample selection bias, often occur simultaneously in real applications. Existing approaches either tackle these biases ...
- research-article, October 2023
Rec4Ad: A Free Lunch to Mitigate Sample Selection Bias for Ads CTR Prediction in Taobao
CIKM '23: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, Pages 4574–4580, https://doi.org/10.1145/3583780.3615496
Click-Through Rate (CTR) prediction serves as a fundamental component in online advertising. A common practice is to train a CTR model on advertisement (ad) impressions with user feedback. Since ad impressions are purposely selected by the model itself, ...
- research-article, October 2023
Entire Space Cascade Delayed Feedback Modeling for Effective Conversion Rate Prediction
Yunfeng Zhao, Xu Yan, Xiaoqiang Gui, Shuguang Han, Xiang-Rong Sheng, Guoxian Yu, Jufeng Chen, Zhao Xu, Bo Zheng
CIKM '23: Proceedings of the 32nd ACM International Conference on Information and Knowledge Management, Pages 4981–4987, https://doi.org/10.1145/3583780.3615475
Conversion rate (CVR) prediction is an essential task for e-commerce platforms. However, refunds frequently occur after conversion in online shopping systems, which drives us to pay attention to effective conversion for building healthier services. This ...
- research-article, August 2023
Causal Feature Selection in the Presence of Sample Selection Bias
ACM Transactions on Intelligent Systems and Technology (TIST), Volume 14, Issue 5, Article No.: 78, Pages 1–18, https://doi.org/10.1145/3604809
Almost all existing causal feature selection methods are proposed without considering the problem of sample selection bias. However, in practice, as data-gathering process cannot be fully controlled, sample selection bias often occurs, leading to spurious ...
- short-paper, July 2022
Re-weighting Negative Samples for Model-Agnostic Matching
SIGIR '22: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Pages 1823–1827, https://doi.org/10.1145/3477495.3532053
Recommender Systems (RS), as an efficient tool to discover users' interested items from a very large corpus, has attracted more and more attention from academia and industry. As the initial stage of RS, large-scale matching is fundamental yet ...
- short-paper, July 2022
Training Entire-Space Models for Target-oriented Opinion Words Extraction
SIGIR '22: Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Pages 1875–1879, https://doi.org/10.1145/3477495.3531768
Target-oriented opinion words extraction (TOWE) is a subtask of aspect-based sentiment analysis (ABSA). Given a sentence and an aspect term occurring in the sentence, TOWE extracts the corresponding opinion words for the aspect term. TOWE has two types ...
- rapid-communication, January 2022
Sample selection bias in evaluation of prediction performance of causal models
Statistical Analysis and Data Mining (STADM), Volume 15, Issue 1, Pages 5–14, https://doi.org/10.1002/sam.11559
Causal models are notoriously difficult to validate because they make untestable assumptions regarding confounding. New scientific experiments offer the possibility of evaluating causal models using prediction performance. Prediction performance ...
- short-paper, October 2021
Fair and Robust Classification Under Sample Selection Bias
CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management, Pages 2999–3003, https://doi.org/10.1145/3459637.3482104
To address the sample selection bias between the training and test data, previous research works focus on reweighing biased training data to match the test data and then building classification models on the reweighed training data. However, how to ...
- research-article, September 2019
Synthetic minority oversampling for function approximation problems
International Journal of Intelligent Systems (IJIS), Volume 34, Issue 11, Pages 2741–2768, https://doi.org/10.1002/int.22120
Imbalanced data sets are a common occurrence in important machine learning problems. Research in improving learning under imbalanced conditions has largely focused on classification problems (ie, problems with a categorical dependent variable). ...
- short-paper, June 2018
Entire Space Multi-Task Model: An Effective Approach for Estimating Post-Click Conversion Rate
SIGIR '18: The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, Pages 1137–1140, https://doi.org/10.1145/3209978.3210104
Estimating post-click conversion rate (CVR) accurately is crucial for ranking systems in industrial applications such as recommendation and advertising. Conventional CVR modeling applies popular deep learning methods and achieves state-of-the-art ...
- Article, December 2013
Generalization of Malaria Incidence Prediction Models by Correcting Sample Selection Bias
ADMA 2013: Part II of the Proceedings of the 9th International Conference on Advanced Data Mining and Applications - Volume 8347, Pages 189–200, https://doi.org/10.1007/978-3-642-53917-6_17
Performance measurements obtained from dividing a single sample into training and test sets, e.g. by employing cross-validation, may not give an accurate picture of the performance of any model developed from the sample, on the set of examples to which ...
- Article, September 2012
Gathering Public Concerns from Web Towards Building Corpus of Japanese Regional Concerns
IIAI-AAI '12: Proceedings of the 2012 IIAI International Conference on Advanced Applied Informatics, Pages 248–253, https://doi.org/10.1109/IIAI-AAI.2012.57
Importance of concern assessment has been increased in Japanese regional communities. We have developed an e-Participation web platform based on a Linked Open Data set called SOCIA (Social Opinions and Concerns for Ideal Argumentation). To sophisticate ...
- research-article, June 2009
Decision support and profit prediction for online auction sellers
U '09: Proceedings of the 1st ACM SIGKDD Workshop on Knowledge Discovery from Uncertain Data, Pages 1–8, https://doi.org/10.1145/1610555.1610556
Online auction has become a very popular e-commerce transaction type. The immense business opportunities attract a lot of individuals as well as online stores. With more sellers engaged in, the competition between sellers is more intense. For sellers, ...
- Article, December 2008
Gender Wage Discrimination of Urban Working Women: Evidence from Guangzhou Microdata
ISBIM '08: Proceedings of the 2008 International Seminar on Business and Information Management - Volume 02, Pages 401–404, https://doi.org/10.1109/ISBIM.2008.169
The paper uses micro-data of a certain industry to estimate the gender wage discrimination of domestic labor markets. With the existence of sample selection bias in the micro-data used in this paper, traditional OLS estimators are biased, and we used ...
- poster, November 2008
Pedestrian flow prediction in extensive road networks using biased observational data
GIS '08: Proceedings of the 16th ACM SIGSPATIAL international conference on Advances in geographic information systems, Article No.: 67, Pages 1–4, https://doi.org/10.1145/1463434.1463512
In this paper, we discuss an application of spatial data mining to predict pedestrian flow in extensive road networks using a large biased sample. Existing out-of-the-box techniques are not able to appropriately deal with its challenges and constraints, ...
- Article, August 2007
Making generative classifiers robust to selection bias
KDD '07: Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, Pages 657–666, https://doi.org/10.1145/1281192.1281263
This paper presents approaches to semi-supervised learning when the labeled training data and test data are differently distributed. Specifically, the samples selected for labeling are a biased subset of some general distribution and the test set ...
- Article, August 2006
Reverse testing: an efficient framework to select amongst classifiers under sample selection bias
KDD '06: Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, Pages 147–156, https://doi.org/10.1145/1150402.1150422
One of the most important assumptions made by many classification algorithms is that the training and test sets are drawn from the same distribution, i.e., the so-called "stationary distribution assumption" that the future and the past data sets are ...
- Article, August 2004
A Bayesian network framework for reject inference
KDD '04: Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, Pages 286–295, https://doi.org/10.1145/1014052.1014085
Most learning methods assume that the training set is drawn randomly from the population to which the learned model is to be applied. However in many applications this assumption is invalid. For example, lending institutions create models of who is ...