DOI: 10.1145/3590140.3592851

Sora: A Latency Sensitive Approach for Microservice Soft Resource Adaptation

Published: 27 November 2023

Abstract

Fast response times are increasingly important for modern web services built from numerous distributed, lightweight microservices, because latency has a direct business impact. While hardware-only resource scaling approaches (e.g., FIRM [47] and PARSLO [40]) have been proposed to mitigate response time fluctuations in critical microservices, the re-adaptation of soft resources (e.g., threads or connections), which control the concurrency of hardware resource usage, has been largely ignored. This paper shows that soft resource adaptation of critical microservices has a significant impact on system scalability, because both under- and over-allocation of soft resources can lead to inefficient use of the underlying hardware resources. We present Sora, an intelligent and fast soft resource adaptation management framework that quickly identifies and adjusts the optimal concurrency level of critical microservices to mitigate service-level objective (SLO) violations. Sora leverages fine-grained online system metrics and the deadline propagated along the critical path of request execution to quickly and accurately determine the optimal concurrency setting for critical microservices. Based on six real-world bursty workload traces and two representative microservices benchmarks (Sock Shop and Social Network), our experimental results show that Sora can effectively mitigate large response time fluctuations and reduce the 99th percentile latency by up to 2.5× compared to the hardware-only scaling strategy FIRM [47] and by up to 1.5× compared to the state-of-the-art concurrency-aware system scaling strategy ConScale.
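To make the terms in the abstract concrete, the following is a minimal sketch, not Sora's actual algorithm. It assumes that latency samples have already been collected for a few candidate concurrency levels (for example, thread pool sizes) of one microservice, and that the deadline for that microservice is simply the end-to-end SLO minus the time already spent upstream. All names (`remaining_deadline`, `pick_concurrency`, `OBSERVATIONS`) and all numbers are hypothetical and chosen only for illustration.

```python
import math

def percentile(samples, q):
    """Nearest-rank percentile of a non-empty list, with q in (0, 100]."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]

def remaining_deadline(slo_ms, upstream_elapsed_ms):
    """Assumed propagation rule: latency budget left for a downstream service."""
    return max(0, slo_ms - upstream_elapsed_ms)

def goodput(latencies_ms, deadline_ms, window_s=1.0):
    """Requests per second that completed within the deadline."""
    on_time = sum(1 for latency in latencies_ms if latency <= deadline_ms)
    return on_time / window_s

def pick_concurrency(observations, deadline_ms, window_s=1.0):
    """Return the concurrency level with the highest goodput.

    Ties go to the smaller level, since extra threads or connections
    that add no goodput only add overhead.
    """
    best_level, best_goodput = None, -1.0
    for level in sorted(observations):
        g = goodput(observations[level], deadline_ms, window_s)
        if g > best_goodput:
            best_level, best_goodput = level, g
    return best_level

# Hypothetical latency samples (ms) per concurrency level, one-second window.
OBSERVATIONS = {
    2: [20, 22, 21, 23],                                          # under-allocated
    8: [25, 30, 28, 35, 27, 31, 29, 40, 33, 26],
    32: [45, 80, 120, 60, 48, 95, 150, 49, 70, 110, 200, 47],     # over-allocated
}

deadline = remaining_deadline(slo_ms=200, upstream_elapsed_ms=150)
chosen = pick_concurrency(OBSERVATIONS, deadline)
print(chosen, percentile(OBSERVATIONS[chosen], 99))
```

In this made-up data, the smallest level completes few requests because little work runs in parallel, while the largest level admits more requests but most of them miss the deadline. That is the under- versus over-allocation trade-off the abstract describes.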

References

[1]
Decomposing twitter: Adventures in service-oriented architecture. https://www.infoq.com/presentations/twitter-soa/.
[2]
Mean absolute percentage error. https://en.wikipedia.org/wiki/Mean_absolute_percentage_error.
[3]
MongoDB. https://www.mongodb.com.
[4]
Neo4j: Native graph database. https://github.com/neo4j/neo4j.
[5]
Mauro, T. Adopting microservices at Netflix: Lessons for architectural design. https://www.nginx.com/blog/microservices-at-netflix-architectural-best-practices/.
[6]
Sock shop microservice demo application. https://microservices-demo.github.io/, 2016.
[7]
Kubernetes horizontal pod auto-scaling. https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/, 2019.
[8]
Baarzi, A. F., and Kesidis, G. SHOWAR: Right-sizing and efficient scheduling of microservices. In SoCC '21: ACM Symposium on Cloud Computing, Seattle, WA, USA, November 1-4, 2021, C. Curino, G. Koutrika, and R. Netravali, Eds., ACM, pp. 427--441.
[9]
Benesty, J., Chen, J., Huang, Y., and Cohen, I. Pearson correlation coefficient. In Noise reduction in speech processing. Springer, 2009, pp. 1--4.
[10]
Chiba, T., Nakazawa, R., Horii, H., Suneja, S., and Seelam, S. Confadvisor: A performance-centric configuration tuning framework for containers on kubernetes. In 2019 IEEE International Conference on Cloud Engineering (IC2E) (2019), IEEE, pp. 168--178.
[11]
Consortium, O. Rubbos: Bulletin board benchmark. http://jmob.ow2.org/rubbos.html, 2005.
[12]
container, L. Infrastructure for container projects. https://linuxcontainers.org/.
[13]
Cusack, G., Nazari, M., Goodarzy, S., Hunhoff, E., Oberai, P., Keller, E., Rozner, E., and Han, R. Escra: Event-driven, Sub-second Container Resource Allocation. In 2022 IEEE 42nd International Conference on Distributed Computing Systems (ICDCS) (July 2022), pp. 313--324.
[14]
Docker. Docker. https://www.docker.com/.
[15]
Einav, Y. Amazon found every 100ms of latency cost them 1% in sales. https://www.gigaspaces.com/blog/amazon-found-every-100ms-of-latency-cost-them-1-in-sales/.
[16]
Gan, Y., Zhang, Y., Cheng, D., Shetty, A., Rathi, P., Katarki, N., Bruno, A., Hu, J., Ritchken, B., Jackson, B., et al. An open-source benchmark suite for microservices and their hardware-software implications for cloud & edge systems. In Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems (2019), pp. 3--18.
[17]
Gandhi, A., Harchol-Balter, M., Raghunathan, R., and Kozuch, M. A. Autoscale: Dynamic, robust capacity management for multi-tier data centers. ACM Transactions on Computer Systems (TOCS) 30, 4 (2012), 14.
[18]
Gias, A. U., Casale, G., and Woodside, M. Atom: Model-driven autoscaling for microservices. In 2019 IEEE 39th International Conference on Distributed Computing Systems (ICDCS) (2019), IEEE, pp. 1994--2004.
[19]
golang. Go official website. https://golang.org/.
[20]
Gunasekaran, J. R., Thinakaran, P., Nachiappan, N. C., Kandemir, M. T., and Das, C. R. Fifer: Tackling resource underutilization in the serverless era. In Proceedings of the 21st International Middleware Conference (New York, NY, USA, 2020), Middleware '20, Association for Computing Machinery, pp. 280--295.
[21]
Guo, X., Peng, X., Wang, H., Li, W., Jiang, H., Ding, D., Xie, T., and Su, L. Graph-based trace analysis for microservice architecture understanding and problem diagnosis. In Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (2020), pp. 1387--1397.
[22]
Huang, L., and Zhu, T. Tprof: Performance profiling via structural aggregation and automated analysis of distributed systems traces. In SoCC '21: ACM Symposium on Cloud Computing, Seattle, WA, USA, November 1-4, 2021, C. Curino, G. Koutrika, and R. Netravali, Eds., ACM, pp. 76--91.
[23]
Hwang, C., Kim, T., Kim, S., Shin, J., and Park, K. Elastic resource sharing for distributed deep learning. In 18th USENIX Symposium on Networked Systems Design and Implementation (NSDI 21) (2021), pp. 721--739.
[24]
Jaeger. Jaeger: open source, end-to-end distributed tracing. https://www.jaegertracing.io/.
[25]
Jindal, A., Podolskiy, V., and Gerndt, M. Performance modeling for cloud microservice applications. In Proceedings of the 2019 ACM/SPEC International Conference on Performance Engineering (2019), pp. 25--32.
[26]
Jolokia. Jolokia official website. https://jolokia.org/.
[27]
Jyothi, S. A., Curino, C., Menache, I., Narayanamurthy, S. M., Tumanov, A., Yaniv, J., Mavlyutov, R., Goiri, I., Krishnan, S., Kulkarni, J., et al. Morpheus: Towards automated SLOs for enterprise clusters. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16) (2016), pp. 117--134.
[28]
Kaldor, J., Mace, J., Bejda, M., Gao, E., Kuropatwa, W., O'Neill, J., Ong, K. W., Schaller, B., Shan, P., Viscomi, B., et al. Canopy: An end-to-end performance tracing and analysis system. In Proceedings of the 26th symposium on operating systems principles (2017), pp. 34--50.
[29]
Kannan, R. S., Subramanian, L., Raju, A., Ahn, J., Mars, J., and Tang, L. GrandSLAm: Guaranteeing SLAs for Jobs in Microservices Execution Frameworks. In Proceedings of the Fourteenth EuroSys Conference 2019 (New York, NY, USA, Mar. 2019), EuroSys '19, Association for Computing Machinery, pp. 1--16.
[30]
Kubernetes. Kubernetes. https://kubernetes.io/.
[31]
Kwan, A., Wong, J., Jacobsen, H.-A., and Muthusamy, V. Hyscale: Hybrid and network scaling of dockerized microservices in cloud data centres. In 2019 IEEE 39th International Conference on Distributed Computing Systems (ICDCS) (2019), pp. 80--90.
[32]
Liu, H., Zhang, J., Shan, H., Li, M., Chen, Y., He, X., and Li, X. Jcallgraph: tracing microservices in very large scale container cloud platforms. In International Conference on Cloud Computing (2019), Springer, pp. 287--302.
[33]
Liu, J., Zhang, S., Wang, Q., and Wei, J. Mitigating large response time fluctuations through fast concurrency adapting in clouds. In 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS) (2020), IEEE, pp. 368--377.
[34]
Liu, J., Zhang, S., Wang, Q., and Wei, J. Coordinating fast concurrency adapting with autoscaling for slo-oriented web applications. IEEE Transactions on Parallel and Distributed Systems 33, 12 (2022), 3349--3362.
[35]
Liu, L., Wang, H., Wang, A., Xiao, M., Cheng, Y., and Chen, S. Mind the gap: Broken promises of CPU reservations in containerized multi-tenant clouds. In SoCC '21: ACM Symposium on Cloud Computing, Seattle, WA, USA, November 1-4, 2021, C. Curino, G. Koutrika, and R. Netravali, Eds., ACM, pp. 243--257.
[36]
Luo, S., Xu, H., Lu, C., Ye, K., Xu, G., Zhang, L., Ding, Y., He, J., and Xu, C. Characterizing microservice dependency and performance: Alibaba trace analysis. In SoCC '21: ACM Symposium on Cloud Computing, Seattle, WA, USA, November 1-4, 2021, C. Curino, G. Koutrika, and R. Netravali, Eds., ACM, pp. 412--426.
[37]
Mahgoub, A., Wood, P., Ganesh, S., Mitra, S., Gerlach, W., Harrison, T., Meyer, F., Grama, A., Bagchi, S., and Chaterji, S. Rafiki: a middleware for parameter tuning of nosql datastores for dynamic metagenomics workloads. In Proceedings of the 18th ACM/IFIP/USENIX Middleware Conference (2017), pp. 28--40.
[38]
Maji, A. K., Mitra, S., Zhou, B., Bagchi, S., and Verma, A. Mitigating interference in cloud services by middleware reconfiguration. In Proceedings of the 15th International Middleware Conference (2014), pp. 277--288.
[39]
Mao, H., Alizadeh, M., Menache, I., and Kandula, S. Resource management with deep reinforcement learning. In Proceedings of the 15th ACM workshop on hot topics in networks (2016), pp. 50--56.
[40]
Mirhosseini, A., Elnikety, S., and Wenisch, T. F. Parslo: A gradient descent-based approach for near-optimal partial SLO allotment in microservices. In SoCC '21: ACM Symposium on Cloud Computing, Seattle, WA, USA, November 1-4, 2021, C. Curino, G. Koutrika, and R. Netravali, Eds., ACM, pp. 442--457.
[41]
Mirhosseini, A., West, B. L., Blake, G. W., and Wenisch, T. F. Express-lane scheduling and multithreading to minimize the tail latency of microservices. In 2019 IEEE International Conference on Autonomic Computing (ICAC) (2019), IEEE, pp. 194--199.
[42]
Mittal, V., Qi, S., Bhattacharya, R., Lyu, X., Li, J., Kulkarni, S. G., Li, D., Hwang, J., Ramakrishnan, K. K., and Wood, T. Mu: An efficient, fair and responsive serverless framework for resource-constrained edge clouds. In SoCC '21: ACM Symposium on Cloud Computing, Seattle, WA, USA, November 1-4, 2021, C. Curino, G. Koutrika, and R. Netravali, Eds., ACM, pp. 168--181.
[43]
Mvondo, D., Barbalace, A., Tchana, A., and Muller, G. Tell me when you are sleepy and what may wake you up! In SoCC '21: ACM Symposium on Cloud Computing, Seattle, WA, USA, November 1-4, 2021, C. Curino, G. Koutrika, and R. Netravali, Eds., ACM, pp. 562--569.
[44]
Netto, M. A., Cardonha, C., Cunha, R. L., and Assuncao, M. D. Evaluating auto-scaling strategies for cloud computing environments. In 2014 IEEE 22nd International Symposium on Modelling, Analysis & Simulation of Computer and Telecommunication Systems (2014), IEEE, pp. 187--196.
[45]
Ousterhout, A., Fried, J., Behrens, J., Belay, A., and Balakrishnan, H. Shenango: Achieving high CPU efficiency for latency-sensitive datacenter workloads. In 16th USENIX Symposium on Networked Systems Design and Implementation (NSDI 19) (2019), pp. 361--378.
[46]
Pautasso, C., Zimmermann, O., Amundsen, M., Lewis, J., and Josuttis, N. Microservices in practice, part 1: Reality check and service design. IEEE Annals of the History of Computing 34, 01 (2017), 91--98.
[47]
Qiu, H., Banerjee, S. S., Jha, S., Kalbarczyk, Z. T., and Iyer, R. K. FIRM: An intelligent fine-grained resource management framework for SLO-oriented microservices. In 14th USENIX Symposium on Operating Systems Design and Implementation (OSDI 20) (2020), pp. 805--825.
[48]
Qu, C., Calheiros, R. N., and Buyya, R. Auto-scaling web applications in clouds: A taxonomy and survey. ACM Computing Surveys (CSUR) 51, 4 (2018), 73.
[49]
Rahman, J., and Lama, P. Predicting the end-to-end tail latency of containerized microservices in the cloud. In 2019 IEEE International Conference on Cloud Engineering (IC2E) (2019), IEEE, pp. 200--210.
[50]
Ren, R., Ma, J., Sui, X., and Bao, Y. D2p: a distributed deadline propagation approach to tolerate long-tail latency in datacenters. In Proceedings of 5th Asia-Pacific Workshop on Systems (2014), pp. 1--6.
[51]
Rzadca, K., Findeisen, P., Swiderski, J., Zych, P., Broniek, P., Kusmierek, J., Nowak, P., Strack, B., Witusowski, P., Hand, S., and Wilkes, J. Autopilot: Workload autoscaling at Google. In Proceedings of the Fifteenth European Conference on Computer Systems (New York, NY, USA, Apr. 2020), EuroSys '20, Association for Computing Machinery, pp. 1--16.
[52]
Samanta, A., Jiao, L., Mühlhäuser, M., and Wang, L. Incentivizing Microservices for Online Resource Sharing in Edge Clouds. In 2019 IEEE 39th International Conference on Distributed Computing Systems (ICDCS), pp. 420--430.
[53]
Satopaa, V., Albrecht, J., Irwin, D., and Raghavan, B. Finding a "kneedle" in a haystack: Detecting knee points in system behavior. In 2011 31st International Conference on Distributed Computing Systems Workshops (2011), IEEE, pp. 166--171.
[54]
Sharma, U., Shenoy, P., Sahu, S., and Shaikh, A. A cost-aware elasticity provisioning system for the cloud. In 2011 31st International Conference on Distributed Computing Systems (2011), IEEE, pp. 559--570.
[55]
Somashekar, G., and Gandhi, A. Towards optimal configuration of microservices. In Proceedings of the 1st Workshop on Machine Learning and Systems (2021), pp. 7--14.
[56]
Song, W., Xiao, Z., Chen, Q., and Luo, H. Adaptive Resource Provisioning for the Cloud Using Online Bin Packing. 2647--2660.
[57]
Spring. Spring boot overview. https://spring.io/projects/spring-boot.
[58]
Sriraman, A., and Wenisch, T. F. μtune: Auto-tuned threading for OLDI microservices. In 13th USENIX Symposium on Operating Systems Design and Implementation (OSDI 18) (2018), pp. 177--194.
[59]
Tennage, P., Perera, S., Jayasinghe, M., and Jayasena, S. An analysis of holistic tail latency behaviors of java microservices. In 2019 IEEE 21st International Conference on High Performance Computing and Communications; IEEE 17th International Conference on Smart City; IEEE 5th International Conference on Data Science and Systems (HPCC/SmartCity/DSS) (2019), IEEE, pp. 697--705.
[60]
Thomas, W., and Coiin, K. gnuplot homepage. http://www.gnuplot.info/, 2019.
[61]
Toslali, M., Parthasarathy, S., Oliveira, F., Huang, H., and Coskun, A. K. Iter8: Online experimentation in the cloud. In SoCC '21: ACM Symposium on Cloud Computing, Seattle, WA, USA, November 1-4, 2021, C. Curino, G. Koutrika, and R. Netravali, Eds., ACM, pp. 289--304.
[62]
Virtuozzo. Open source container-based virtualization for linux. https://openvz.org/.
[63]
VMware. Vmware esxi: The purpose-built bare metal hypervisor. https://www.vmware.com/products/esxi-and-esx.html, 2019.
[64]
Wang, Q., Chen, H., Zhang, S., Hu, L., and Palanisamy, B. Integrating concurrency control in n-tier application scaling management in the cloud. IEEE Transactions on Parallel and Distributed Systems 30, 4 (2018), 855--869.
[65]
Wang, Q., Zhang, S., Kanemasa, Y., Pu, C., Palanisamy, B., Harada, L., and Kawaba, M. Optimizing n-tier application scalability in the cloud: A study of soft resource allocation. ACM Transactions on Modeling and Performance Evaluation of Computing Systems (TOMPECS) 4, 2 (2019), 1--27.
[66]
Wang, Z., Zhu, S., Li, J., Jiang, W., Ramakrishnan, K. K., Zheng, Y., Yan, M., Zhang, X., and Liu, A. X. DeepScaling: Microservices autoscaling for stable CPU utilization in large scale cloud systems. In Proceedings of the 13th Symposium on Cloud Computing (New York, NY, USA, Nov. 2022), SoCC '22, Association for Computing Machinery, pp. 16--30.
[67]
Wu, L., Tordsson, J., Elmroth, E., and Kao, O. Microrca: Root cause localization of performance issues in microservices. In NOMS 2020-2020 IEEE/IFIP Network Operations and Management Symposium (2020), IEEE, pp. 1--9.
[68]
Xiao, Z., Song, W., and Chen, Q. Dynamic Resource Allocation Using Virtual Machines for Cloud Computing Environment. 1107--1117.
[69]
Xu, T., Jin, X., Huang, P., Zhou, Y., Lu, S., Jin, L., and Pasupathy, S. Early detection of configuration errors to reduce failure damage. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16) (2016), pp. 619--634.
[70]
Xu, T., Zhang, J., Huang, P., Zheng, J., Sheng, T., Yuan, D., Zhou, Y., and Pasupathy, S. Do not blame users for misconfigurations. In Proceedings of the Twenty-Fourth ACM Symposium on Operating Systems Principles (2013), pp. 244--259.
[71]
Yang, Z., Nguyen, P., Jin, H., and Nahrstedt, K. Miras: Model-based reinforcement learning for microservice resource allocation over scientific workflows. In 2019 IEEE 39th International Conference on Distributed Computing Systems (ICDCS) (2019), IEEE, pp. 122--132.
[72]
Zhang, B., Van Aken, D., Wang, J., Dai, T., Jiang, S., Lao, J., Sheng, S., Pavlo, A., and Gordon, G. J. A demonstration of the ottertune automatic database management system tuning service. Proceedings of the VLDB Endowment 11, 12 (2018), 1910--1913.
[73]
Zhang, S., Wang, Q., Kanemasa, Y., Liu, J., and Pu, C. Doublefacead: A new data-store driver architecture to optimize fanout query performance. In Proceedings of the 21st International Middleware Conference (2020), pp. 430--444.
[74]
Zhou, X., Peng, X., Xie, T., Sun, J., Ji, C., Liu, D., Xiang, Q., and He, C. Latent error prediction and fault localization for microservice applications by learning from system trace logs. In Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (2019), pp. 683--694.
[75]
Zhu, Y., Liu, J., Guo, M., Bao, Y., Ma, W., Liu, Z., Song, K., and Yang, Y. Bestconfig: tapping the performance potential of systems via automatic configuration tuning. In Proceedings of the 2017 Symposium on Cloud Computing (2017), pp. 338--350.
[76]
Zhu, Z., Bi, J., Yuan, H., and Chen, Y. Sla based dynamic virtualized resources provisioning for shared cloud data centers. In 2011 IEEE 4th International Conference on Cloud Computing (2011), IEEE, pp. 630--637.
[77]
Zipkin. Zipkin: A distributed system. https://zipkin.io/.

Cited By

  • (2024) Bayesian-Driven Automated Scaling in Stream Computing With Multiple QoS Targets. IEEE Transactions on Parallel and Distributed Systems 35:7 (1251-1267). DOI: 10.1109/TPDS.2024.3399834. Online publication date: 13-May-2024
  • (2024) Grunt Attack: Exploiting Execution Dependencies in Microservices. 2024 54th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN) (115-128). DOI: 10.1109/DSN58291.2024.00025. Online publication date: 24-Jun-2024

Information & Contributors

Information

Published In

Middleware '23: Proceedings of the 24th International Middleware Conference
November 2023
334 pages
ISBN: 9798400701771
DOI: 10.1145/3590140
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

In-Cooperation

  • IFIP: International Federation for Information Processing

Publisher

Association for Computing Machinery

New York, NY, United States

Badges

  • Best Paper

Author Tags

  1. Auto-scaling
  2. Microservices
  3. Scalability
  4. Soft Resource

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

Middleware '23
Acceptance Rates

Overall Acceptance Rate 203 of 948 submissions, 21%

Article Metrics

  • Downloads (last 12 months): 329
  • Downloads (last 6 weeks): 48
Reflects downloads up to 03 Jan 2025.
