research-article

Applying Authorship Analysis to Extremist-Group Web Forum Messages

Published: 01 September 2005

Abstract

One major challenge facing the intelligence and security community is monitoring online media for terrorist group communications. This study addresses the online anonymity problem by applying authorship analysis to English and Arabic extremist group Web forum messages. The study evaluates the performance impact of different feature categories and techniques across both languages. To enhance writing style identification, researchers incorporated a comprehensive list of online authorship features. Additionally, they created an Arabic language model by adopting specific features and techniques, including an elongation filter and a root-clustering algorithm, to handle challenging linguistic characteristics. A series of experiments indicated a high level of efficacy in the models. Finally, the authors compare the English and Arabic language models and messages to aid the research community's understanding of the dynamics of these groups' authorship tendencies.

This article is part of a special issue on Homeland Security.
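
As a rough illustration of what an elongation filter might do, the sketch below assumes that elongation is written with the Arabic tatweel (kashida) character, U+0640. It strips that character and reports how many were removed, so the count could still serve as a style feature. The function name `filter_elongation` and the decision to return the count are illustrative assumptions, not the authors' published implementation.

```python
from typing import Tuple

# Assumption: elongation is marked with ARABIC TATWEEL (kashida), U+0640.
TATWEEL = "\u0640"


def filter_elongation(text: str) -> Tuple[str, int]:
    """Remove tatweel characters from text.

    Returns the cleaned text and the number of tatweel characters removed.
    Hypothetical helper for illustration only.
    """
    removed = text.count(TATWEEL)
    cleaned = text.replace(TATWEEL, "")
    return cleaned, removed


if __name__ == "__main__":
    # A word elongated with three tatweel characters after its first letter.
    elongated = "\u0643\u0640\u0640\u0640\u062a\u0627\u0628"
    cleaned, removed = filter_elongation(elongated)
    print(cleaned, removed)
```

Returning the count alongside the cleaned text keeps the stylistic signal available while letting later steps, such as root matching, work on normalized words.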

Published In

IEEE Intelligent Systems, Volume 20, Issue 5
September 2005
94 pages

Publisher

IEEE Educational Activities Department

United States

Author Tags

  1. Web content analysis
  2. Web forum postings
  3. Web mining
  4. authorship analysis
  5. multilingual
  6. security
  7. text analysis
