15th SoCC 2024: Redmond, WA, USA
- Proceedings of the 2024 ACM Symposium on Cloud Computing, SoCC 2024, Redmond, WA, USA, November 20-22, 2024. ACM 2024, ISBN 979-8-4007-1286-9
Systems Supporting Machine Learning I: Scheduling
- Qinghe Wang, Futian Wang, Xinwei Zheng:
Hops: Fine-grained heterogeneous sensing, efficient and fair Deep Learning cluster scheduling system. 1-17
- Archit Patke, Dhemath Reddy, Saurabh Jha, Haoran Qiu, Christian Pinto, Chandra Narayanaswami, Zbigniew Kalbarczyk, Ravishankar K. Iyer:
Queue Management for SLO-Oriented Large Language Model Serving. 18-35
- Ziyang Liu, Renyu Yang, Jin Ouyang, Weihan Jiang, Tianyu Ye, Menghao Zhang, Sui Huang, Jiaming Huang, Chengru Song, Di Zhang, Tianyu Wo, Chunming Hu:
Kale: Elastic GPU Scheduling for Online DL Model Training. 36-51
- Redwan Ibne Seraj Khan, Arnab K. Paul, Yue Cheng, Xun Steve Jian, Ali Reza Butt:
FedCaSe: Enhancing Federated Learning with Heterogeneity-aware Caching and Scheduling. 52-68
Machine Learning Supporting Systems
- Xin Liu, Yuanyuan Huang, Tianyi Wang, Song Li, Weina Niu, Jun Shen, Qingguo Zhou, Xiaokang Zhou:
SQLStateGuard: Statement-Level SQL Injection Defense Based on Learning-Driven Middleware. 69-82
- Vikramank Y. Singh, Zhao Song, Balakrishnan (Murali) Narayanaswamy, Kapil Eknath Vaidya, Tim Kraska:
Vista: Machine Learning based Database Performance Troubleshooting Framework in Amazon RDS. 83-98
- Manish Shetty, Yinfang Chen, Gagan Somashekar, Minghua Ma, Yogesh Simmhan, Xuchao Zhang, Jonathan Mace, Dax Vandevoorde, Pedro Las-Casas, Shachee Mishra Gupta, Suman Nath, Chetan Bansal, Saravan Rajmohan:
Building AI Agents for Autonomous Clouds: Challenges and Design Principles. 99-110
- Jae-Seok Kim, Joonho Seo, SeonJin Hwang, Jin-Myeong Shin, Yoon-Ho Choi:
Zero-SAD: Zero-Shot Learning Using Synthetic Abnormal Data for Abnormal Behavior Detection on Private Cloud. 111-125
- Yanlei Diao, Dominik Horn, Andreas Kipf, Oleksandr Shchur, Ines Benito, Wenjian Dong, Davide Pagano, Pascal Pfeil, Vikram Nathan, Balakrishnan Narayanaswamy, Tim Kraska:
Forecasting Algorithms for Intelligent Resource Scaling: An Experimental Analysis. 126-143
Speed and Scale in Serverless
- Yuqiao Lan, Xiaohui Peng, Yifan Wang:
Snapipeline: Accelerating Snapshot Startup for FaaS Containers. 144-159
- Minghao Xie, Chen Qian, Heiner Litz:
En4S: Enabling SLOs in Serverless Storage Systems. 160-177
- Yifan Sui, Hanfei Yu, Yitao Hu, Jianxun Li, Hao Wang:
Pre-Warming is Not Enough: Accelerating Serverless Inference With Opportunistic Pre-Loading. 178-195
- Xinmin Zhang, Qiang He, Hao Fan, Song Wu:
Faascale: Scaling MicroVM Vertically for Serverless Computing with Memory Elasticity. 196-212
- Vishwanath Seshagiri, Abhinav Gupta, Vahab Jabrayilov, Avani Wildani, Kostis Kaffes:
Rethinking the Networking Stack for Serverless Environments: A Sidecar Approach. 213-222
- Marcin Copik, Alexandru Calotoiu, Gyorgy Réthy, Roman Böhringer, Rodrigo Bruno, Torsten Hoefler:
Process-as-a-Service: Unifying Elastic and Stateful Clouds with Serverless Processes. 223-242
The Elastic Cloud
- Rubaba Hasan, Timothy Zhu, Bhuvan Urgaonkar:
AutoBurst: Autoscaling Burstable Instances for Cost-effective Latency SLOs. 243-258
- Carlos Segarra, Ivan Durev, Peter R. Pietzuch:
Is It Time To Put Cold Starts In The Deep Freeze? 259-268
- Kevin Alarcón Negy, Tycho Nightingale, Hakim Weatherspoon, Zhiming Shen:
Towards Swap-Free, Continuous Ballooning for Fast, Cloud-Based Virtual Machine Migrations. 269-283
- Shiv Bhushan Tripathi, Debadatta Mishra:
PCLive: Pipelined Restoration of Application Containers for Reduced Service Downtime. 284-301
- Smita Vijayakumar, Anil Madhavapeddy, Evangelia Kalyvianaki:
Scheduling for Reduced Tail Task Latencies in Highly Utilized Datacenters. 302-321
- Vaibhav Bhosale, Ada Gavrilovska, Ketan Bhardwaj:
Krios: Scheduling Abstractions and Mechanisms for Enabling a LEO Compute Cloud. 322-340
When Things Go Wrong in the Cloud
- P. C. Sruthi, Zinan Guo, Deming Chu, Zhengyan Chen, Yongle Zhang:
Demystifying the Fight Against Complexity: A Comprehensive Study of Live Debugging Activities in Production Cloud Systems. 341-360
- Chaoyun Zhang, Randolph Yao, Si Qin, Ze Li, Shekhar Agrawal, Binit R. Mishra, Tri Tran, Minghua Ma, Qingwei Lin, Murali Chintalapati, Dongmei Zhang:
Deoxys: A Causal Inference Engine for Unhealthy Node Mitigation in Large-scale Cloud Infrastructure. 361-379
- Ziwei Huang, Mengyao Xie, Shibo Tang, Zihao Chang, Zhicheng Yao, Yungang Bao, Sa Wang:
INS: Identifying and Mitigating Performance Interference in Clouds via Interference-Sensitive Paths. 380-397
- Nathan Ng, Abel Souza, Ahmed Ali-Eldin, David E. Irwin, Don Towsley, Prashant J. Shenoy:
TailClipper: Reducing Tail Response Time of Distributed Services Through System-Wide Scheduling. 398-414
Systems Supporting Machine Learning II
- Yanning Yang, Dong Du, Haitao Song, Yubin Xia:
On-demand and Parallel Checkpoint/Restore for GPU Applications. 415-433
- Umesh Deshpande, Travis Janssen, Mudhakar Srivatsa, Swaminathan Sundararaman:
MoEsaic: Shared Mixture of Experts. 434-442
- Xiaoyang Zhao, Siran Yang, Jiamang Wang, Lansong Diao, Lin Qu, Chuan Wu:
FaPES: Enabling Efficient Elastic Scaling for Serverless Machine Learning Platforms. 443-459
- Bing-Shiun Han, Tathagata Paul, Zhenhua Liu, Anshul Gandhi:
KACE: Kernel-Aware Colocation for Efficient GPU Spatial Sharing. 460-469
- Zeyuan Zuo, Ningxin Su, Baochun Li, Teng Zhang:
Pack: Towards Communication-Efficient Homomorphic Encryption in Federated Learning. 470-486
The Green Cloud
- Qiangyu Pei, Lin Wang, Dong Zhang, Bingheng Yan, Chen Yu, Fangming Liu:
InferCool: Enhancing AI Inference Cooling through Transparent, Non-Intrusive Task Reassignment. 487-504
- Jorge Murillo, Walid A. Hanafy, David E. Irwin, Ramesh K. Sitaraman, Prashant J. Shenoy:
CDN-Shifter: Leveraging Spatial Workload Shifting to Decarbonize Content Delivery Networks. 505-521
- Prateek Sharma, Alexander Fuerst:
Accountable Carbon Footprints and Energy Profiling For Serverless Functions. 522-541
- Noman Bashir, Varun Gohil, Anagha Belavadi Subramanya, Mohammad Shahrad, David E. Irwin, Elsa Olivetti, Christina Delimitrou:
The Sunk Carbon Fallacy: Rethinking Carbon Footprint Metrics for Effective Carbon-Aware Scheduling. 542-551
- Jinghan Sun, Zibo Gong, Anup Agarwal, Shadi A. Noghabi, Ranveer Chandra, Marc Snir, Jian Huang:
Exploring the Efficiency of Renewable Energy-based Modular Data Centers at Scale. 552-569
- Rohan Basu Roy, Raghavendra Kanakagiri, Yankai Jiang, Devesh Tiwari:
The Hidden Carbon Footprint of Serverless Computing. 570-579
The Basics
- Masanori Misono, Peter Okelmann, Charalampos Mainas, Pramod Bhatotia:
uIO: Lightweight and Extensible Unikernels. 580-599
- Jonathan Zarnstorff, Lucas Lebow, Christopher Siems, Dillon Remuck, Colin Ruiz, Lewis Tseng:
Racos: Improving Erasure Coding State Machine Replication using Leaderless Consensus. 600-617
- Ziliang Lai, Fan Cui, Hua Fan, Eric Lo, Wenchao Zhou, Feifei Li:
Occam's Razor for Distributed Protocols. 618-636
- Hang Xiong, Cheng Qu, Jing Li:
VWeiST: A Scalable and Efficient Proof-of-Stake Blockchain Consensus. 637-649
- Yu-Hsun Chiang, Wei-Lin Chang, Shih-Wei Li, Jan-Ting Tu:
Securing a Multiprocessor KVM Hypervisor with Rust. 650-667
- Federico Parola, Shixiong Qi, Anvaya B. Narappa, K. K. Ramakrishnan, Fulvio Risso:
SURE: Secure Unikernels Make Serverless Computing Rapid and Efficient. 668-688
Bits on Disk
- Weiyue Zhao, Jingya Wu, Wenyan Lu, Xiaowei Li, Guihai Yan:
TianMen: a DPU-based storage network offloading structure for disaggregated datacenters. 689-703
- Yunsheng Dong, Boju Chen, Yanqi Pan, Xiangyu Zou, Wen Xia:
H2C-Dedup: Reducing I/O and GC Amplification for QLC SSDs from the Deduplication Metadata Perspective. 704-719
- Yekang Zhan, Haichuan Hu, Xiangrui Yang, Shaohua Wang, Qiang Cao, Hong Jiang, Jie Yao:
RomeFS: A CXL-SSD Aware File System Exploiting Synergy of Memory-Block Dual Paths. 720-736
- Soheil Khadirsharbiyani, Nima Elyasi, Armin Haj Aboutalebi, Chun-Yi Liu, Changho Choi, Mahmut Taylan Kandemir:
SmartGraph: A Framework for Graph Processing in Computational Storage. 737-754
In the Cloud
- Shaowen Xu, Qihang Zhou, Zhicong Zhang, Xiaoqi Jia, Donglin Liu, Heqing Huang, Haichao Du, Zhenyu Song:
ConMonitor: Lightweight Container Protection with Virtualization and VM Functions. 755-773
- Yancan Mao, Ruohang Yin, Liyuan Lei, Peng Ye, Shengfu Zou, Shizheng Tang, Yunzhe Guo, Ye Yuan, Xiaochen Yu, Bo Wan, Yunfei Gong, Changli Gao, Guanghui Zhang, Jian Shen, Rui Shi, Richard T. B. Ma:
ByteMQ: A Cloud-native Streaming Data Layer in ByteDance. 774-791
- Nishant Gupta, Iyswarya Narayanan, Shivam Handa, Sayak Chakraborti, Pankit Thapar, Baohua Shan, Ariel Rao, Yuanlai Liu, Pengyuan Wang, Yuqing Wu, Qingyi Gao, Chris Chao-Chun Cheng, Sihan You, Louis Huang, Jingyuan Fan, Kenny Yu, Kevin Lin, Tengfei Mu, Parth Malani, Haiying Wang, Trey Lu, Peter Zhang:
Dynamic Idle Resource Leasing To Safely Oversubscribe Capacity At Meta. 792-810
- Xinyu Han, Yuan Gao, Gabriel Parmer, Timothy Wood:
Byways: High-Performance, Isolated Network Functions for Multi-Tenant Cloud Servers. 811-829
- Jungeun Shin, Diana Arroyo, Asser N. Tantawi, Chen Wang, Alaa Youssef, Rakesh Nagi:
Cloud-native Workflow Scheduling using a Hybrid Priority Rule, Dynamic Resource Allocation, and Dynamic Task Partition. 830-846
- Pawissanutt Lertpongrujikorn, Hai Duc Nguyen, Mohsen Amini Salehi:
Streamlining Cloud-Native Application Development and Deployment with Robust Encapsulation. 847-865
Algorithms and Applications
- Tobias Pfandzelter, David Bermbach:
Komet: A Serverless Platform for Low-Earth Orbit Edge Services. 866-882
- Yanjie Song, Tianyuan Wu, Yuanhao Li, Guancheng Li, Yuchen Liu, Shu Yin, Wei Xue, Junchao Wang:
A Data Optimizer for Region-Aware Self-describing Files in Scientific Computing. 883-897
- Yijian Liu, Rodrigo Laigner, Yongluan Zhou:
Rethinking State Management in Actor Systems for Cloud-Native Applications. 898-914
- Xizhe Yin, Zhijia Zhao, Rajiv Gupta:
IncBoost: Scaling Incremental Graph Processing for Edge Deletions and Weight Updates. 915-932
- Shiva Jahangiri, Michael J. Carey, Johann-Christoph Freytag:
Memory Management in Complex Join Queries: A Re-evaluation Study. 933-942
- Shruti Mohanty, Vivek M. Bhasi, Myungjun Son, Mahmut Taylan Kandemir, Chita R. Das:
FAAStloop: Optimizing Loop-Based Applications for Serverless Computing. 943-960
Systems Supporting Machine Learning III: Training
- Xinwei Fu, Zhen Zhang, Haozheng Fan, Guangtai Huang, Mohammad El-Shabani, Randy Huang, Rahul Solanki, Fei Wu, Ron Diamant, Yida Wang:
Distributed Training of Large Language Models on AWS Trainium. 961-976
- Xue Li, Cheng Guo, Kun Qian, Menghao Zhang, Mengyu Yang, Mingwei Xu:
Near-Lossless Gradient Compression for Data-Parallel Distributed DNN Training. 977-994
- Diana Petrescu, Arsany Guirguis, Do Le Quoc, Javier Picorel, Rachid Guerraoui, Florin Dinu:
Accelerating Transfer Learning with Near-Data Computation on Cloud Object Stores. 995-1011
- Amey Agrawal, Sameer Reddy, Satwik Bhattamishra, Venkata Prabhakara Sarath Nookala, Vidushi Vashishth, Kexin Rong, Alexey Tumanov:
Inshrinkerator: Compressing Deep Learning Training Checkpoints via Dynamic Quantization. 1012-1031
- Ziji Shi, Jialin Li, Yang You:
ParaGAN: A Scalable Distributed Training Framework for Generative Adversarial Networks. 1032-1044