System-Auditing, Data Analysis and Characteristics of Cyber Attacks for Big Data Systems

Published: 17 October 2022

Abstract

Distributed big data computing systems such as Apache Hadoop process massive amounts of data to support business and research applications, so ensuring their cyber security is critical. To better defend against advanced cyber attacks that threaten even well-protected enterprises, system-auditing-based techniques have been adopted to monitor system activities and assist attack investigation. In this demo, we build a system that collects system auditing logs from a big data system and analyzes them to understand how system auditing can be used more effectively to assist attack investigation on big data systems. We also built a demo application that detects unexpected file deletions and presents the root causes of each deletion.
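As a rough illustration of the idea behind the demo application, the sketch below flags file deletions made by unexpected programs and walks back through parent processes to suggest a root cause. It is not the authors' implementation: the `AuditEvent` record, its fields, the allow-list policy, the watched path, and the sample log are all hypothetical simplifications of what a real system auditing tool would emit.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class AuditEvent:
    """Hypothetical, simplified audit record (not a real tool's format)."""
    ts: int     # event timestamp on an assumed integer clock
    pid: int    # process that performed the operation
    ppid: int   # parent of that process
    exe: str    # executable name of the process
    op: str     # operation name, e.g. "open", "write", "unlink"
    path: str   # file the operation touched


def unexpected_deletions(events: Iterable[AuditEvent],
                         watched_prefix: str,
                         allowed_exes: Set[str]) -> List[AuditEvent]:
    """Return deletions under watched_prefix made by programs not on the allow-list."""
    return [e for e in events
            if e.op == "unlink"
            and e.path.startswith(watched_prefix)
            and e.exe not in allowed_exes]


def process_chain(events: Iterable[AuditEvent],
                  deletion: AuditEvent) -> List[str]:
    """Walk from the deleting process up through its ancestors.

    Only events at or before the deletion are considered, so the chain
    reflects what could have caused it.
    """
    procs: Dict[int, Tuple[int, str]] = {}
    for e in events:
        if e.ts <= deletion.ts:
            procs[e.pid] = (e.ppid, e.exe)

    chain: List[str] = []
    seen: Set[int] = set()
    pid = deletion.pid
    while pid in procs and pid not in seen:  # stop at unknown parents or cycles
        seen.add(pid)
        ppid, exe = procs[pid]
        chain.append(exe)
        pid = ppid
    return chain


# Made-up sample log: a remote shell starts a script that deletes a data block,
# while the storage service itself performs a routine deletion.
SAMPLE_LOG = [
    AuditEvent(1, 100, 1, "sshd", "open", "/etc/passwd"),
    AuditEvent(2, 200, 100, "bash", "open", "/home/u/.bashrc"),
    AuditEvent(3, 300, 200, "python", "write", "/tmp/x.py"),
    AuditEvent(4, 300, 200, "python", "unlink", "/data/hdfs/blk_1"),
    AuditEvent(5, 400, 1, "hdfs", "unlink", "/data/hdfs/blk_2"),
]

for hit in unexpected_deletions(SAMPLE_LOG, "/data/hdfs/", {"hdfs"}):
    print(hit.path, "deleted via", " <- ".join(process_chain(SAMPLE_LOG, hit)))
```

On the sample log, only the deletion issued by the script is reported, together with the chain of processes that led to it; the deletion by the allow-listed service is ignored. A real investigation would also follow file and network dependencies, not just parent-child process links.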

Supplementary Material

MP4 File (CIKM-demo090.mp4)
Video presentation of the paper «System-Auditing, Data Analysis and Characteristics of Cyber Attacks for Big Data Systems» and its unexpected-activity detection demo.

Cited By

  • (2023) Auditing of hadoop log file for dynamic detection of threats using H-ISSM-MIM and convolutional neural network. Journal of Intelligent & Fuzzy Systems 45:4 (6617-6628). DOI: 10.3233/JIFS-233579. Online publication date: 4-Oct-2023
  • (2023) Accelerating Time to Science using CRADLE: A Framework for Materials Data Science. 2023 IEEE 30th International Conference on High Performance Computing, Data, and Analytics (HiPC) (234-245). DOI: 10.1109/HiPC58850.2023.00041. Online publication date: 18-Dec-2023

Information & Contributors

Information

Published In

CIKM '22: Proceedings of the 31st ACM International Conference on Information & Knowledge Management
October 2022
5274 pages
ISBN: 9781450392365
DOI: 10.1145/3511808
General Chairs: Mohammad Al Hasan, Li Xiong
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. big data systems
  2. cyber attack investigation
  3. system auditing

Qualifiers

  • Short-paper

Funding Sources

  • Natural Science Foundation
  • DOE-NNSA-LLNS

Conference

CIKM '22
Acceptance Rates

CIKM '22 Paper Acceptance Rate 621 of 2,257 submissions, 28%;
Overall Acceptance Rate 1,861 of 8,427 submissions, 22%
