Sentence-T5: Scalable Sentence Encoders from Pre-trained Text-to-Text Models

Jianmo Ni, Gustavo Hernandez Abrego, Noah Constant, Ji Ma, Keith Hall, Daniel Cer, Yinfei Yang

Abstract

We provide the first exploration of sentence embeddings from text-to-text transformers (T5) including the effects of scaling up sentence encoders to 11B parameters. Sentence embeddings are broadly useful for language processing tasks. While T5 achieves impressive performance on language tasks, it is unclear how to produce sentence embeddings from encoder-decoder models. We investigate three methods to construct Sentence-T5 (ST5) models: two utilize only the T5 encoder and one using the full T5 encoder-decoder. We establish a new sentence representation transfer benchmark, SentGLUE, which extends the SentEval toolkit to nine tasks from the GLUE benchmark. Our encoder-only models outperform the previous best models on both SentEval and SentGLUE transfer tasks, including semantic textual similarity (STS). Scaling up ST5 from millions to billions of parameters shown to consistently improve performance. Finally, our encoder-decoder method achieves a new state-of-the-art on STS when using sentence embeddings.

Anthology ID:: 2022.findings-acl.146
Volume:: Findings of the Association for Computational Linguistics: ACL 2022
Month:: May
Year:: 2022
Address:: Dublin, Ireland
Editors:: Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue:: Findings
SIG:
Publisher:: Association for Computational Linguistics
Note:
Pages:: 1864–1874
Language:
URL:: https://aclanthology.org/2022.findings-acl.146
DOI:: 10.18653/v1/2022.findings-acl.146
Bibkey:
Cite (ACL):: Jianmo Ni, Gustavo Hernandez Abrego, Noah Constant, Ji Ma, Keith Hall, Daniel Cer, and Yinfei Yang. 2022. Sentence-T5: Scalable Sentence Encoders from Pre-trained Text-to-Text Models. In Findings of the Association for Computational Linguistics: ACL 2022, pages 1864–1874, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):: Sentence-T5: Scalable Sentence Encoders from Pre-trained Text-to-Text Models (Ni et al., Findings 2022)
Copy Citation:
PDF:: https://aclanthology.org/2022.findings-acl.146.pdf
Video:: https://aclanthology.org/2022.findings-acl.146.mp4
Code: additional community code
Data: GLUE, QNLI, ReQA, SentEval, SuperGLUE

PDF Cite Search Code Video