DOI: 10.1145/3196494.3196553
research-article
Public Access

Augmenting Telephone Spam Blacklists by Mining Large CDR Datasets

Published: 29 May 2018

Abstract

Telephone spam has become an increasingly prevalent problem in many countries around the world. For example, the cumulative number of spam/scam call complaints submitted to the US Federal Trade Commission's (FTC) National Do Not Call Registry reached 30.9 million in 2016. Naturally, telephone carriers can play an important role in the fight against spam. However, due to the extremely large volume of calls that transit large carrier networks, it is challenging to mine their vast amounts of call detail records (CDRs) to accurately detect and block spam phone calls. This is because CDRs contain only high-level metadata about each phone call (e.g., source and destination numbers, call start time, and call duration). In addition, ground truth about both benign and spam-related phone numbers is often very scarce (only a tiny fraction of all phone numbers can be labeled). More importantly, telephone carriers are extremely sensitive to false positives, as they need to avoid blocking any non-spam calls, which makes the detection of spam-related numbers even more challenging. In this paper, we present a novel detection system that aims to discover telephone numbers involved in spam campaigns. Given a small seed of known spam phone numbers, our system uses a combination of unsupervised and supervised machine learning methods to mine new, previously unknown spam numbers from large CDR datasets. Our objective is not to detect all possible spam phone calls crossing a carrier's network, but rather to expand the list of known spam numbers while aiming for zero false positives, so that the newly discovered numbers can be added to a phone blacklist, for example. To evaluate our system, we conducted experiments over a large dataset of real-world CDRs provided by a leading telephony provider in China, while tuning the system to produce no false positives. The experimental results show that our system is able to greatly expand the initial seed of known spam numbers by up to about 250%.
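
The idea of expanding a seed list while tuning for zero false positives can be illustrated with a minimal Python sketch. This is not the system described in the paper: the CallRecord fields mirror the CDR metadata listed in the abstract, but the scoring heuristic (distinct destinations weighted by the share of short calls), the 10-second cutoff, and all function names are assumptions made purely for illustration. The only point it demonstrates is that placing the decision threshold at the highest score of any known benign number guarantees no false positives on the labeled data.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Set


@dataclass(frozen=True)
class CallRecord:
    """One CDR row with the high-level metadata named in the abstract."""
    source: str
    destination: str
    start_time: float  # seconds since epoch
    duration: float    # seconds


def score_numbers(records: Iterable[CallRecord],
                  short_call_seconds: float = 10.0) -> Dict[str, float]:
    """Toy score (assumption): distinct destinations times share of short calls."""
    destinations = defaultdict(set)
    short_calls = defaultdict(int)
    total_calls = defaultdict(int)
    for rec in records:
        destinations[rec.source].add(rec.destination)
        total_calls[rec.source] += 1
        if rec.duration < short_call_seconds:
            short_calls[rec.source] += 1
    return {
        src: len(destinations[src]) * short_calls[src] / total_calls[src]
        for src in total_calls
    }


def zero_fp_threshold(scores: Dict[str, float], known_benign: Set[str]) -> float:
    """Highest score of any labeled benign number; flagging strictly above it
    produces no false positives on the labeled benign set."""
    benign_scores = [scores[n] for n in known_benign if n in scores]
    return max(benign_scores, default=float("-inf"))


def expand_seed(scores: Dict[str, float], seed_spam: Set[str],
                known_benign: Set[str]) -> Set[str]:
    """Return previously unlabeled numbers scoring above the zero-FP threshold."""
    threshold = zero_fp_threshold(scores, known_benign)
    return {
        number for number, score in scores.items()
        if score > threshold
        and number not in seed_spam
        and number not in known_benign
    }


def expansion_percent(seed_spam: Set[str], new_numbers: Set[str]) -> float:
    """New numbers as a percentage of the seed size."""
    if not seed_spam:
        raise ValueError("seed must not be empty")
    return 100.0 * len(new_numbers) / len(seed_spam)
```

If the reported growth is read as newly discovered numbers relative to the seed size (an interpretation, not something the abstract spells out), expansion_percent corresponds to that figure. A threshold chosen this way is only as reliable as the labeled benign set, which the abstract notes is typically very small.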

Information & Contributors

Information

Published In

ASIACCS '18: Proceedings of the 2018 on Asia Conference on Computer and Communications Security
May 2018
866 pages
ISBN:9781450355766
DOI:10.1145/3196494
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. blacklisting
  2. cdr mining
  3. machine learning
  4. telephone spam
  5. voip

Qualifiers

  • Research-article

Conference

ASIA CCS '18
Acceptance Rates

ASIACCS '18 Paper Acceptance Rate 52 of 310 submissions, 17%;
Overall Acceptance Rate 418 of 2,322 submissions, 18%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (last 12 months): 87
  • Downloads (last 6 weeks): 11
Reflects downloads up to 13 Dec 2024

Citations

Cited By

  • (2024) ROBO-SPOT: Detecting Robocalls by Understanding User Engagement and Connectivity Graph. Big Data Mining and Analytics 7:2 (340-356). DOI: 10.26599/BDMA.2023.9020020. Online publication date: Jun-2024
  • (2024) Jäger: Automated Telephone Call Traceback. Proceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security (2042-2056). DOI: 10.1145/3658644.3690290. Online publication date: 2-Dec-2024
  • (2023) Diving into robocall content with SnorCall. Proceedings of the 32nd USENIX Conference on Security Symposium (427-444). DOI: 10.5555/3620237.3620262. Online publication date: 9-Aug-2023
  • (2022) Efficient Detection of Spam Over Internet Telephony by Machine Learning Algorithms. IEEE Access 10 (133412-133426). DOI: 10.1109/ACCESS.2022.3231384. Online publication date: 2022
  • (2021) An empirical study of supervised email classification in Internet of Things: Practical performance and key influencing factors. International Journal of Intelligent Systems. DOI: 10.1002/int.22625. Online publication date: 17-Aug-2021
  • (2020) Fraud Detection Call Detail Record Using Machine Learning in Telecommunications Company. Advances in Science, Technology and Engineering Systems Journal 5:4 (63-69). DOI: 10.25046/aj050409. Online publication date: Jul-2020
  • (2020) Towards a Practical Differentially Private Collaborative Phone Blacklisting System. Proceedings of the 36th Annual Computer Security Applications Conference (100-115). DOI: 10.1145/3427228.3427239. Online publication date: 7-Dec-2020
  • (2019) Cooperative Fraud Detection Model With Privacy-Preserving in Real CDR Datasets. IEEE Access 7 (115261-115272). DOI: 10.1109/ACCESS.2019.2935759. Online publication date: 2019
