Cloud-Scale Runtime Verification of Serverless Applications

Published: 01 November 2021


Serverless platforms aim to simplify the deployment, scaling, and management of cloud applications. Serverless applications are inherently distributed and run as short-lived, ephemeral processes. This execution model simplifies application scaling and management, but it also means that existing approaches to monitoring distributed systems and detecting bugs cannot be applied to serverless applications. In this paper, we propose Watchtower, a framework that enables runtime monitoring of serverless applications. Watchtower takes program properties as input and detects cases where an application violates them. We design Watchtower to minimize application changes and to scale at the same rate as the application. We achieve the former by instrumenting libraries rather than application code, and the latter by structuring Watchtower as a serverless application. Once a bug is found, developers can use the Watchtower debugger to identify and address its root cause.
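
The idea of instrumenting a library rather than application code, and then checking a property over the recorded events, can be illustrated with a minimal sketch. This is not Watchtower's API: the `KVStore` class, the `Event` schema, the `instrument` helper, and the `write_once` property are all hypothetical names chosen for illustration, and the example runs in a single process rather than on a serverless platform.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Event:
    """One record emitted by the instrumented library (hypothetical schema)."""
    op: str
    key: str
    value: object


class KVStore:
    """Stand-in for a storage client library that application code calls."""

    def __init__(self) -> None:
        self._data: Dict[str, object] = {}

    def put(self, key: str, value: object) -> None:
        self._data[key] = value

    def get(self, key: str) -> object:
        return self._data.get(key)


def instrument(store_cls: type, log: List[Event]) -> None:
    """Wrap the library's put() so each call appends an Event to the log.

    The application keeps calling put() as before; only the library changes.
    """
    original_put = store_cls.put

    def logged_put(self, key, value):
        log.append(Event("put", key, value))
        return original_put(self, key, value)

    store_cls.put = logged_put


def write_once(log: List[Event]) -> List[Event]:
    """Example property: each key is written at most once.

    Returns the events that violate the property.
    """
    seen = set()
    violations = []
    for event in log:
        if event.op != "put":
            continue
        if event.key in seen:
            violations.append(event)
        seen.add(event.key)
    return violations


def handler(store: KVStore, order_id: str) -> None:
    """Unmodified application code: a function that records an order."""
    store.put(order_id, "charged")


event_log: List[Event] = []
instrument(KVStore, event_log)

store = KVStore()
handler(store, "order-1")
handler(store, "order-2")
handler(store, "order-1")  # a retried invocation writes the same key again

violations = write_once(event_log)
```

Here `handler` is never edited, yet the retried write to `order-1` shows up in `violations` because the check runs over events produced by the wrapped library call.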

Supplementary Material

MP4 File (Day1_Session2_Order_3_Watchtower.mp4)
Presentation video


SoCC '21: Proceedings of the ACM Symposium on Cloud Computing
November 2021
685 pages

SoCC '21: ACM Symposium on Cloud Computing
November 1-4, 2021
Seattle, WA, USA

Acceptance Rates

Overall acceptance rate: 169 of 722 submissions (23%)



