DOI: 10.1145/3391403.3399488

Surrogate Scoring Rules

Published: 13 July 2020

Abstract

Strictly proper scoring rules (SPSR) are incentive compatible for eliciting information about random variables from strategic agents when the principal can reward agents after the realization of the random variables. They also quantify the quality of elicited information, with more accurate predictions receiving higher scores in expectation. In this paper, we extend such scoring rules to settings where a principal elicits private probabilistic beliefs but only has access to agents' reports. We name our solution Surrogate Scoring Rules (SSR). SSR build on a bias correction step and an error rate estimation procedure for a reference answer defined using agents' reports. We show that, with a single bit of information about the prior distribution of the random variables, SSR in a multi-task setting recover SPSR in expectation, as if having access to the ground truth. Therefore, a salient feature of SSR is that they quantify the quality of information despite the lack of ground truth, just as SPSR do for the setting with ground truth. As a by-product, SSR induce dominant truthfulness in reporting. Our method is verified both theoretically and empirically using data collected from real human forecasters.
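
The bias correction step mentioned in the abstract can be illustrated with a minimal sketch for the binary case. It assumes the error rates of the noisy reference answer are already known (the paper estimates them from agents' reports, which is not shown here) and uses the standard unbiased-estimator construction from the noisy-label literature. The function names, the choice of a positively oriented Brier score, and the symbols `e0` and `e1` are illustrative assumptions, not the authors' notation.

```python
def brier_score(p, y):
    """Positively oriented quadratic (Brier) score for a binary event.

    p is the reported probability that y == 1; higher scores are better.
    """
    return 1.0 - (p - y) ** 2


def surrogate_score(p, z, e0, e1, score=brier_score):
    """Bias-corrected score computed against a noisy reference answer z.

    Assumed (hypothetical) notation:
      e0 = P(z = 1 | y = 0), e1 = P(z = 0 | y = 1), with e0 + e1 < 1.
    """
    denom = 1.0 - e0 - e1
    if denom <= 0:
        raise ValueError("this sketch assumes e0 + e1 < 1")
    if z == 1:
        return ((1 - e0) * score(p, 1) - e1 * score(p, 0)) / denom
    return ((1 - e1) * score(p, 0) - e0 * score(p, 1)) / denom


def expected_surrogate(p, y, e0, e1):
    """Expectation of the surrogate score over the noise in z, given truth y."""
    p_z1 = (1 - e1) if y == 1 else e0  # P(z = 1 | y)
    return (p_z1 * surrogate_score(p, 1, e0, e1)
            + (1 - p_z1) * surrogate_score(p, 0, e0, e1))


if __name__ == "__main__":
    p, e0, e1 = 0.7, 0.2, 0.3
    for y in (0, 1):
        # The two printed values agree up to floating-point rounding.
        print(y, brier_score(p, y), expected_surrogate(p, y, e0, e1))
```

Under these assumptions, averaging the surrogate score over the noise in the reference answer returns the score the report would have earned against the ground truth, which is the sense in which SSR recover SPSR in expectation.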

Published In

EC '20: Proceedings of the 21st ACM Conference on Economics and Computation
July 2020
937 pages
ISBN: 9781450379755
DOI: 10.1145/3391403
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. dominant strategy incentive compatibility
  2. information calibration
  3. information elicitation without verification
  4. peer prediction
  5. strictly proper scoring rules

Qualifiers

  • Research-article

Conference

EC '20: The 21st ACM Conference on Economics and Computation
July 13 - 17, 2020
Virtual Event, Hungary

Acceptance Rates

Overall Acceptance Rate 664 of 2,389 submissions, 28%
