DOI: 10.1145/3663529.3663849

Rethinking Software Engineering in the Era of Foundation Models: A Curated Catalogue of Challenges in the Development of Trustworthy FMware

Published: 10 July 2024

Abstract

Foundation models (FMs), such as Large Language Models (LLMs), have revolutionized software development by enabling new use cases and business models. We refer to software built using FMs as FMware. The unique properties of FMware (e.g., prompts, agents, and the need for orchestration), coupled with the intrinsic limitations of FMs (e.g., hallucination), lead to a completely new set of software engineering challenges. Based on our industrial experience, we identified ten key SE4FMware challenges that have made enterprise FMware development unproductive, costly, and risky. For each challenge, we state the path for innovation that we envision. We hope that disclosing these challenges will not only raise awareness but also promote deeper discussion, knowledge sharing, and innovative solutions.


Published In

FSE 2024: Companion Proceedings of the 32nd ACM International Conference on the Foundations of Software Engineering
July 2024, 715 pages
ISBN: 9798400706585
DOI: 10.1145/3663529
    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. AIware
    2. FMware
    3. Foundation models
    4. Large Language Models

    Qualifiers

    • Research-article

    Conference

    FSE '24

    Acceptance Rates

    Overall Acceptance Rate 112 of 543 submissions, 21%
