DOI: 10.1007/978-3-031-30105-6_41
Article

PromptFusion: A Low-Cost Prompt-Based Task Composition for Multi-task Learning

Published: 13 April 2023

Abstract

Prompt-tuning takes advantage of large-scale pretrained language models and achieves strong performance while being far more parameter-efficient. However, existing prompt-tuning methods require tuning a different pretrained language model for each specific task and fail to utilize information across different tasks, which limits their applicability in complex situations. To address these issues, we propose PromptFusion, a prompt-based multi-task transfer learning approach that learns knowledge from multiple source tasks and incorporates it into the target task at low cost. The approach first learns task-specific parameters with prompts to extract information from each task individually; a fusion module then aggregates this information for the target task. Our method is interpretable because it can identify which source tasks are the crucial factors influencing the model's decision on the target task. We also examine a more effective way to encapsulate information by incorporating parallel adapter modules into the transformer layers, which links our approach to other parameter-efficient transfer learning methods. We empirically evaluate our method on the GLUE benchmark and a variety of hard NLU tasks. The results show that our approach outperforms full fine-tuning and other parameter-efficient multi-task methods.
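
The following is a minimal PyTorch sketch of the fusion idea the abstract describes: each source task contributes a learned soft prompt, and an attention-style fusion module weights those prompts against the target-task representation, exposing per-task weights for interpretability. All names, shapes, and pooling choices (PromptFusionSketch, prompt_len, mean pooling) are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of prompt-based task fusion (assumed design, not the paper's code).
import torch
import torch.nn as nn


class PromptFusionSketch(nn.Module):
    def __init__(self, num_source_tasks: int, prompt_len: int, hidden_dim: int):
        super().__init__()
        # One learned soft prompt per source task, shape (prompt_len, hidden_dim),
        # trained while the pretrained language model stays frozen.
        self.task_prompts = nn.Parameter(
            torch.randn(num_source_tasks, prompt_len, hidden_dim) * 0.02
        )
        # Attention projections used to weight source-task prompts
        # against the target-task input representation.
        self.query = nn.Linear(hidden_dim, hidden_dim)
        self.key = nn.Linear(hidden_dim, hidden_dim)
        self.value = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, target_hidden: torch.Tensor):
        # target_hidden: (batch, seq_len, hidden_dim) representation of the
        # target-task input from the frozen pretrained language model.
        # Summarize each source-task prompt by mean pooling over its tokens.
        prompt_summary = self.task_prompts.mean(dim=1)       # (T, H)
        keys = self.key(prompt_summary)                      # (T, H)
        values = self.value(prompt_summary)                  # (T, H)
        # Query comes from the pooled target-task representation.
        q = self.query(target_hidden.mean(dim=1))            # (B, H)
        # Attention weights over source tasks; inspecting them indicates
        # which source tasks influence the target-task decision.
        scores = q @ keys.t() / keys.size(-1) ** 0.5         # (B, T)
        weights = scores.softmax(dim=-1)                     # (B, T)
        fused = weights @ values                             # (B, H)
        # A downstream classification head would combine `fused`
        # with the target-task representation for prediction.
        return fused, weights


# Usage: fuse three source-task prompts for a batch of target-task inputs.
model = PromptFusionSketch(num_source_tasks=3, prompt_len=20, hidden_dim=768)
hidden = torch.randn(4, 128, 768)
fused, task_weights = model(hidden)
print(fused.shape, task_weights.shape)  # torch.Size([4, 768]) torch.Size([4, 3])
```

Inspecting task_weights gives the kind of per-source attribution the abstract mentions; the paper's actual fusion module and the parallel adapter variant may differ in detail.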


Published In

Neural Information Processing: 29th International Conference, ICONIP 2022, Virtual Event, November 22–26, 2022, Proceedings, Part I
Nov 2022
659 pages
ISBN: 978-3-031-30104-9
DOI: 10.1007/978-3-031-30105-6
  • Editors: Mohammad Tanveer, Sonali Agarwal, Seiichi Ozawa, Asif Ekbal, Adam Jatowt

Publisher

Springer-Verlag

Berlin, Heidelberg

Publication History

Published: 13 April 2023

Author Tags

  1. Prompt
  2. Multi-task
  3. Transfer learning
  4. Parameter-efficient

Qualifiers

  • Article
