Code for CS4248 Group Project (AY22/23 Sem 2) [code, pdf, MLe-SNLI on HuggingFace]
By Rishabh Anand, Ryan Chung Yi Sheng, Huaiyu Deng, Zhi Bin Cai, Tan Rui Quan
WARNING: when running the code, we advise having access to a GPU that supports `bfloat16` operations with at least 40GB of RAM. We recommend an A100 if possible since it meets both requirements!
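If you want to confirm your hardware meets these requirements before launching a run, a minimal PyTorch sketch (assuming `torch` is already installed) looks like this:

```python
import torch

# Quick sanity check before launching training/inference:
# bfloat16 kernels need an Ampere-or-newer GPU (e.g., A100).
if torch.cuda.is_available() and torch.cuda.is_bf16_supported():
    total_gb = torch.cuda.get_device_properties(0).total_memory / 1e9
    print(f"bfloat16 supported on {torch.cuda.get_device_name(0)} ({total_gb:.0f} GB)")
else:
    print("No bfloat16-capable GPU detected; expect slowdowns or OOMs.")
```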
Cross-lingual transfer refers to the ability of language models to learn from a set of well-documented languages and perform tasks in unseen, possibly low-resource languages. The community has yet to study the efficacy of LMs on logical inference in a cross-lingual setting. In this work, we make two contributions: we release Multilingual e-SNLI (MLe-SNLI), a cross-lingual dataset built on top of e-SNLI (Camburu et al., 2018) comprising samples translated to Spanish, Dutch, French, and German, grounded by their similarity to English. We also propose a new prompting paradigm to study this emergence in LMs by finetuning Flan-T5-Large (Chung et al., 2022) on MLe-SNLI, and empirically demonstrate the emergence of multilingual inference skills by comparing it to a zero-shot Flan-T5-Large pretrained solely on ∼1.8K English tasks. Our experiments show that the finetuned Flan-T5-Large achieves significantly higher classification and explanation accuracies than the zero-shot Flan-T5-Large on an unseen language. Specifically, our model achieved a classification accuracy of 75% and an explanation accuracy of 51%, whereas the zero-shot model scored 64% and 40% respectively.
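As a rough illustration of the zero-shot baseline described above, the sketch below prompts `google/flan-t5-large` for an NLI label plus a free-text explanation using HuggingFace `transformers`. The prompt wording here is only an assumption for illustration; the exact template used in our experiments is defined in the training/inference code.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Zero-shot baseline: prompt Flan-T5-Large for a label and an explanation.
# NOTE: this prompt template is illustrative; see src/python/ for the
# exact format used in our experiments.
tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-large")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-large")

premise = "Un hombre toca la guitarra en la calle."
hypothesis = "Un hombre está haciendo música."
prompt = (
    f"Premise: {premise}\nHypothesis: {hypothesis}\n"
    "Does the premise entail, contradict, or stay neutral to the hypothesis? "
    "Answer with the label and a short explanation."
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```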
|-findings/
|-Scripts/
|-src/
|--notebooks/
|--python/
|-translated_data/
- `findings/` contains CSV files and log files with the results documented in our report
- `Scripts/` contains code you can run on the SOC Cluster
- `src/` contains code to train and run inference (in both Python and Jupyter Notebook formats)
- `translated_data/` contains the translated e-SNLI samples for the train, test, and dev sets (i.e., MLe-SNLI)
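For a quick look at the data, the MLe-SNLI splits in `translated_data/` can be loaded with pandas. The filename below is an assumption for illustration; adjust it to whatever CSVs are actually in the folder.

```python
import pandas as pd

# Peek at an MLe-SNLI split. The filename is a guess; check
# translated_data/ for the actual CSV names.
df = pd.read_csv("translated_data/train.csv")
print(df.columns.tolist())
print(df.head())
```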
You can find our final CS4248 group project report at `CS4248_Group19_Final_Report.pdf`!
If running the notebooks, we recommend:
- creating a Google Drive folder with `translated_data` inside
- importing the notebook of choice as a Google Colab notebook (it mounts your Drive folder; see the Drive-mount snippet after this list)
  - Notebooks tagged `_FT` are for finetuning
  - Notebooks tagged `_Eval` evaluate the respective finetuned model
- switching to GPU mode
- running as-is
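Mounting your Drive inside Colab is the standard two-liner below; the data path your notebook should point at depends on where you placed `translated_data/` in your Drive (the path shown is an assumption).

```python
from google.colab import drive

# Mount your Google Drive so the notebook can read translated_data/.
drive.mount("/content/drive")

# Example path; adjust to wherever you placed the folder in your Drive.
DATA_DIR = "/content/drive/MyDrive/translated_data"
```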
Notes:
- The notebook should run without errors if the filepaths are fixed according to your GDrive setup. If you run into any issues, please feel free to drop an Issue.
- We recommend configuring a Premium Class + High RAM GPU on Google Colab; more often than not, with Colab+, you'll be assigned an A100!
If running the python scripts on the SOC Cluster, please follow these instructions:
- Set up the env by running the following lines:
python3 -m venv mlesnli_train
source mlesnli_train/bin/activate
pip3 install -r requirements.txt
- Make the appropriate changes to the following files:
  - experiment name and output/save directories in `run_train_job.sh` and `run_inference_job.sh`, found in `Scripts/`
  - model name and filepaths in `train.py` and `inference.py`, found in `src/python/`
- Run `chmod +x` on `run_train_job.sh` and `run_inference_job.sh`
- Run `./Scripts/run_train_job.sh`. Once done, run `./Scripts/run_inference_job.sh`
Notes:
- The configurations that can be changed have `# change` comments beside them (see the sketch after these notes).
- For the models, you can choose between `google/flan-t5-large`, `google/flan-t5-base`, and `google/flan-t5-small`.
- We do not include the DeepSpeed code since the SOC Cluster allows jobs long enough to handle our training runs.
- The scripts auto-configure the cluster to assign us an A100 80GB machine.
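For reference, the `# change` lines look roughly like the hypothetical sketch below; the actual variable names in `train.py` may differ, so treat this purely as an illustration of the kind of values you need to edit.

```python
# Hypothetical excerpt of the kind of lines flagged with "# change" in
# src/python/train.py -- actual variable names may differ.
MODEL_NAME = "google/flan-t5-large"      # change: also accepts flan-t5-base / flan-t5-small
DATA_DIR = "translated_data/"            # change: path to the MLe-SNLI CSVs
OUTPUT_DIR = "outputs/flan_t5_large_ft"  # change: where checkpoints/logs are saved
```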
If you encounter any difficulties, please raise an Issue. If you have any suggestions or improvements, raise a PR!