Research article (public access). DOI: 10.1145/3359789.3359845

SecDATAVIEW: a secure big data workflow management system for heterogeneous computing environments

Published: 09 December 2019

Abstract

Big data workflow management systems (BDWFMSs) have recently emerged as popular platforms for large-scale data analytics in the cloud. However, protecting data confidentiality and securing the execution of workflow applications remain important and challenging problems. Although a few data analytics systems have been developed to address these problems, they are limited to specific structures such as MapReduce-style workflows and SQL queries. This paper proposes SecDATAVIEW, a BDWFMS that leverages Intel Software Guard eXtensions (SGX) and AMD Secure Encrypted Virtualization (SEV) to build a heterogeneous trusted execution environment for workflows. SecDATAVIEW aims to (1) protect the confidentiality and integrity of code and data for workflows running on public untrusted clouds, (2) minimize the trusted computing base (TCB) size for a BDWFMS, (3) enable a trade-off between security and performance for workflows, and (4) support the execution of Java-based workflow tasks in SGX. Our experimental results show that SecDATAVIEW imposes 1.69x to 2.62x overhead on workflow execution time on SGX worker nodes, 1.04x to 1.29x overhead on SEV worker nodes, and 1.20x to 1.43x overhead in a heterogeneous setting in which both SGX and SEV worker nodes are used.
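
To make the reported overhead ranges concrete, the following minimal sketch converts them into estimated execution times for a given baseline. It assumes that each "x overhead" figure is a multiplicative slowdown relative to an unprotected run of the same workflow, which is one plausible reading of the abstract and not something it states explicitly. The names `REPORTED_OVERHEAD` and `estimated_runtime_range` are hypothetical and are not part of SecDATAVIEW.

```python
# Overhead ranges quoted in the abstract. They are read here as multiplicative
# slowdown factors relative to an unprotected baseline (an assumption).
REPORTED_OVERHEAD = {
    "sgx": (1.69, 2.62),
    "sev": (1.04, 1.29),
    "heterogeneous": (1.20, 1.43),
}


def estimated_runtime_range(baseline_seconds, setting):
    """Return (low, high) estimated runtimes in seconds for one setting."""
    if baseline_seconds < 0:
        raise ValueError("baseline_seconds must be non-negative")
    low_factor, high_factor = REPORTED_OVERHEAD[setting]
    return (baseline_seconds * low_factor, baseline_seconds * high_factor)


if __name__ == "__main__":
    # Hypothetical baseline: a workflow that takes 100 seconds unprotected.
    for setting in REPORTED_OVERHEAD:
        low, high = estimated_runtime_range(100.0, setting)
        print(f"{setting}: {low:.0f} s to {high:.0f} s")
```

Under this reading, the SEV range stays close to the baseline while the SGX range can more than double it, which is the security and performance trade-off that aim (3) refers to.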


    Published In

    ACSAC '19: Proceedings of the 35th Annual Computer Security Applications Conference
    December 2019
    821 pages
    ISBN: 9781450376280
    DOI: 10.1145/3359789
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. AMD SEV
    2. Intel SGX
    3. big data workflow
    4. heterogeneous cloud
    5. trusted computing

    Qualifiers

    • Research-article

    Conference

    ACSAC '19: 2019 Annual Computer Security Applications Conference
    December 9 - 13, 2019
    San Juan, Puerto Rico, USA

    Acceptance Rates

    ACSAC '19 Paper Acceptance Rate 60 of 266 submissions, 23%;
    Overall Acceptance Rate 104 of 497 submissions, 21%

    Article Metrics

    • Downloads (last 12 months): 72
    • Downloads (last 6 weeks): 6
    Reflects downloads up to 11 Dec 2024

    Cited By

    • (2024) SecFlow: Adaptive Security-Aware Workflow Management System in Multi-cloud Environments. Enterprise Design, Operations, and Computing. EDOC 2023 Workshops, 281-297. DOI: 10.1007/978-3-031-54712-6_17. Online publication date: 2-Mar-2024
    • (2023) Security and privacy concerns in cloud-based scientific and business workflows. Future Generation Computer Systems 148:C, 184-200. DOI: 10.1016/j.future.2023.05.015. Online publication date: 1-Nov-2023
    • (2022) Securing Big Data Scientific Workflows via Trusted Heterogeneous Environments. IEEE Transactions on Dependable and Secure Computing 19:6, 4187-4203. DOI: 10.1109/TDSC.2021.3123640. Online publication date: 1-Nov-2022
    • (2020) EdgeMask: An Edge-based Privacy Preserving Service for Video Data Sharing. 2020 IEEE/ACM Symposium on Edge Computing (SEC), 382-387. DOI: 10.1109/SEC50012.2020.00056. Online publication date: Nov-2020
    • (2020) A Survey of Modern Scientific Workflow Scheduling Algorithms and Systems in the Era of Big Data. 2020 IEEE International Conference on Services Computing (SCC), 132-141. DOI: 10.1109/SCC49832.2020.00026. Online publication date: Nov-2020
    • (2020) SEED: Confidential Big Data Workflow Scheduling with Intel SGX Under Deadline Constraints. 2020 IEEE International Conference on Services Computing (SCC), 108-115. DOI: 10.1109/SCC49832.2020.00023. Online publication date: Nov-2020
