DOI: 10.1145/3292522.3326042
WebSci Conference Proceedings · Short paper

A Reverse Turing Test for Detecting Machine-Made Texts

Published: 26 June 2019

Abstract

As AI technologies rapidly advance, artifacts created by machines are becoming prevalent. As recent Deepfake incidents illustrate, the ability to differentiate man-made from machine-made artifacts, especially in the social media space, is becoming increasingly important. In this preliminary work, we formulate such a classification task as a Reverse Turing Test (RTT) and investigate how well man-made and machine-made texts can currently be distinguished. Studying real-life machine-made texts in three domains (financial earnings reports, research articles, and chatbot dialogues), we found that man-made vs. machine-made texts can be classified with an F1 score of at least 0.84. We also found differences between man-made and machine-made texts in sentiment, readability, and textual features, which can help differentiate them.
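The abstract names readability as one of the feature families that separates man-made from machine-made texts. The standard readability measure in this line of work is Flesch's reading-ease score. As an illustrative sketch only (this is not the authors' actual feature pipeline, and the syllable counter is a naive vowel-group heuristic introduced here for self-containment):

```python
import re

def flesch_reading_ease(text):
    """Flesch reading ease (Flesch, 1979):
    206.835 - 1.015*(words/sentences) - 84.6*(syllables/words).
    Higher scores mean easier-to-read text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)

    def count_syllables(word):
        # Naive heuristic: count groups of consecutive vowels,
        # with a minimum of one syllable per word.
        groups = re.findall(r"[aeiouy]+", word.lower())
        return max(1, len(groups))

    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)))
```

For example, `flesch_reading_ease("The cat sat on the mat.")` yields about 116, in the "very easy" band of the scale. In a supervised setup such as the one the abstract describes, scores like this would be one column in a feature matrix fed to a classifier.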





Published In

WebSci '19: Proceedings of the 10th ACM Conference on Web Science
June 2019
395 pages
ISBN:9781450362023
DOI:10.1145/3292522

Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. machine-made text
  2. reverse turing test
  3. supervised learning

Qualifiers

  • Short-paper

Funding Sources

  • NSF Awards
  • ORAU-directed R&D Program Award in 2018

Conference

WebSci '19: 11th ACM Conference on Web Science
June 30 - July 3, 2019
Boston, Massachusetts, USA

Acceptance Rates

WebSci '19 paper acceptance rate: 41 of 130 submissions (32%)
Overall acceptance rate: 245 of 933 submissions (26%)


Article Metrics

  • Downloads (last 12 months): 34
  • Downloads (last 6 weeks): 2
Reflects downloads up to 24 Dec 2024


Cited By

  • (2024) Bilateral turing test: Assessing machine consciousness simulations. Cognitive Systems Research, article 101299. DOI: 10.1016/j.cogsys.2024.101299. Online publication date: Oct-2024.
  • (2024) Decoding the AI’s Gaze: Unraveling ChatGPT’s Evaluation of Poetic Creativity. HCI International 2024 Posters, 186-197. DOI: 10.1007/978-3-031-62110-9_19. Online publication date: 1-Jun-2024.
  • (2023) Attribution and Obfuscation of Neural Text Authorship: A Data Mining Perspective. ACM SIGKDD Explorations Newsletter, 25(1), 1-18. DOI: 10.1145/3606274.3606276. Online publication date: 22-Jun-2023.
  • (2021) A Reverse Turing Like Test for Quad-copters. 2021 17th International Conference on Distributed Computing in Sensor Systems (DCOSS), 351-358. DOI: 10.1109/DCOSS52077.2021.00063. Online publication date: Jul-2021.
  • (2021) Challenges in the Study of Intelligent Machines and Reverse Turing Test on Socio-Economic Decisions. Decision Economics: Minds, Machines, and their Society, 12-23. DOI: 10.1007/978-3-030-75583-6_2. Online publication date: 17-Aug-2021.
