DOI: 10.1007/978-3-031-30105-6_41
Article

PromptFusion: A Low-Cost Prompt-Based Task Composition for Multi-task Learning

Published: 13 April 2023

Abstract

Prompt-tuning takes advantage of large-scale pretrained language models and achieves strong performance while being far more parameter-efficient. However, existing prompt-tuning methods require tuning a different pretrained language model for each specific task and fail to utilize information across different tasks, which limits their applicability in complex situations. To address these issues, we propose PromptFusion, a prompt-based multi-task transfer learning approach that learns knowledge from multiple source tasks and incorporates it into the target task at low cost. The approach first learns task-specific parameters with prompts to extract information from each task individually; a fusion module then aggregates this information for the target task. Our method is interpretable because it can identify which source tasks are the crucial factors influencing the model's decision on the target task. We also examine a more effective way to encapsulate information by incorporating parallel adapter modules into the transformer layers, which links our approach to other parameter-efficient transfer learning methods. We empirically evaluate our method on the GLUE benchmark and a variety of hard NLU tasks. The results show that our approach outperforms full fine-tuning and other parameter-efficient multi-task methods.
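
The following is a minimal PyTorch sketch of the fusion idea the abstract describes: each source task contributes a learned soft prompt, and an attention-style fusion module weights those prompts against the target-task representation, exposing per-task weights for interpretability. All names, shapes, and pooling choices (PromptFusionSketch, prompt_len, mean pooling) are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of prompt-based task fusion (assumed design, not the paper's code).
import torch
import torch.nn as nn


class PromptFusionSketch(nn.Module):
    def __init__(self, num_source_tasks: int, prompt_len: int, hidden_dim: int):
        super().__init__()
        # One learned soft prompt per source task, shape (prompt_len, hidden_dim),
        # trained while the pretrained language model stays frozen.
        self.task_prompts = nn.Parameter(
            torch.randn(num_source_tasks, prompt_len, hidden_dim) * 0.02
        )
        # Attention projections used to weight source-task prompts
        # against the target-task input representation.
        self.query = nn.Linear(hidden_dim, hidden_dim)
        self.key = nn.Linear(hidden_dim, hidden_dim)
        self.value = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, target_hidden: torch.Tensor):
        # target_hidden: (batch, seq_len, hidden_dim) representation of the
        # target-task input from the frozen pretrained language model.
        # Summarize each source-task prompt by mean pooling over its tokens.
        prompt_summary = self.task_prompts.mean(dim=1)       # (T, H)
        keys = self.key(prompt_summary)                      # (T, H)
        values = self.value(prompt_summary)                  # (T, H)
        # Query comes from the pooled target-task representation.
        q = self.query(target_hidden.mean(dim=1))            # (B, H)
        # Attention weights over source tasks; inspecting them indicates
        # which source tasks influence the target-task decision.
        scores = q @ keys.t() / keys.size(-1) ** 0.5         # (B, T)
        weights = scores.softmax(dim=-1)                     # (B, T)
        fused = weights @ values                             # (B, H)
        # A downstream classification head would combine `fused`
        # with the target-task representation for prediction.
        return fused, weights


# Usage: fuse three source-task prompts for a batch of target-task inputs.
model = PromptFusionSketch(num_source_tasks=3, prompt_len=20, hidden_dim=768)
hidden = torch.randn(4, 128, 768)
fused, task_weights = model(hidden)
print(fused.shape, task_weights.shape)  # torch.Size([4, 768]) torch.Size([4, 3])
```

Inspecting task_weights gives the kind of per-source attribution the abstract mentions; the paper's actual fusion module and the parallel adapter variant may differ in detail.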


Published In

Neural Information Processing: 29th International Conference, ICONIP 2022, Virtual Event, November 22–26, 2022, Proceedings, Part I
Nov 2022
659 pages
ISBN: 978-3-031-30104-9
DOI: 10.1007/978-3-031-30105-6
  • Editors: Mohammad Tanveer, Sonali Agarwal, Seiichi Ozawa, Asif Ekbal, Adam Jatowt

Publisher

Springer-Verlag

Berlin, Heidelberg

Publication History

Published: 13 April 2023

Author Tags

  1. Prompt
  2. Multi-task
  3. Transfer learning
  4. Parameter-efficient

Qualifiers

  • Article
