Code for CS4248 Group Project (AY22/23 Sem 2) [code, pdf, MLe-SNLI on HuggingFace]
By Rishabh Anand, Ryan Chung Yi Sheng, Huaiyu Deng, Zhi Bin Cai, Tan Rui Quan
WARNING: when running the code, we advise having access to a GPU that supports `bfloat16` operations with at least 40GB of RAM. We recommend an A100 if possible since it meets both requirements!
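If you want to confirm your hardware meets these requirements before launching a run, a minimal PyTorch sketch (assuming `torch` is already installed) looks like this:

```python
import torch

# Quick sanity check before launching training/inference:
# bfloat16 kernels need an Ampere-or-newer GPU (e.g., A100).
if torch.cuda.is_available() and torch.cuda.is_bf16_supported():
    total_gb = torch.cuda.get_device_properties(0).total_memory / 1e9
    print(f"bfloat16 supported on {torch.cuda.get_device_name(0)} ({total_gb:.0f} GB)")
else:
    print("No bfloat16-capable GPU detected; expect slowdowns or OOMs.")
```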
Cross-lingual transfer refers to the ability of language models to learn from a set of well-documented languages and perform tasks in unseen, possibly low-resource languages. The community has yet to study the efficacy of LMs on logical inference in a cross-lingual setting. In this work, we make two contributions: we release Multilingual e-SNLI (MLe-SNLI), a cross-lingual dataset built on top of e-SNLI (Camburu et al., 2018) comprising samples translated to Spanish, Dutch, French, and German, grounded by their similarity to English. We also propose a new prompting paradigm to study this emergence in LMs by finetuning Flan-T5-Large (Chung et al., 2022) on MLe-SNLI, and empirically demonstrate the emergence of multilingual inference skills by comparing it to a zero-shot Flan-T5-Large pretrained solely on ∼1.8K English tasks. Our experiments show that the finetuned Flan-T5-Large achieves significantly higher classification and explanation accuracies than the zero-shot Flan-T5-Large on an unseen language. Specifically, our model achieved a classification accuracy of 75% and an explanation accuracy of 51%, whereas the zero-shot model scored 64% and 40% respectively.
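As a rough illustration of the zero-shot baseline described above, the sketch below prompts `google/flan-t5-large` for an NLI label plus a free-text explanation using HuggingFace `transformers`. The prompt wording here is only an assumption for illustration; the exact template used in our experiments is defined in the training/inference code.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Zero-shot baseline: prompt Flan-T5-Large for a label and an explanation.
# NOTE: this prompt template is illustrative; see src/python/ for the
# exact format used in our experiments.
tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-large")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-large")

premise = "Un hombre toca la guitarra en la calle."
hypothesis = "Un hombre está haciendo música."
prompt = (
    f"Premise: {premise}\nHypothesis: {hypothesis}\n"
    "Does the premise entail, contradict, or stay neutral to the hypothesis? "
    "Answer with the label and a short explanation."
)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```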
|-findings/
|-Scripts/
|-src/
|--notebooks/
|--python/
|-translated_data/
- `findings/` contains CSV files and log files with the results documented in our report
- `Scripts/` contains code you can run on the SOC Cluster
- `src/` contains code to train and run inference (in both Python and Jupyter Notebook formats)
- `translated_data/` contains the translated e-SNLI samples for the train, test, and dev sets (i.e., MLe-SNLI)
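For a quick look at the data, the MLe-SNLI splits in `translated_data/` can be loaded with pandas. The filename below is an assumption for illustration; adjust it to whatever CSVs are actually in the folder.

```python
import pandas as pd

# Peek at an MLe-SNLI split. The filename is a guess; check
# translated_data/ for the actual CSV names.
df = pd.read_csv("translated_data/train.csv")
print(df.columns.tolist())
print(df.head())
```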
You can find our final CS4248 group project report at `CS4248_Group19_Final_Report.pdf`!
If running the notebooks, we recommend:
- creating a Google Drive folder with `translated_data` inside
- importing the notebook of choice as a Google Colab notebook (it mounts your Drive folder; see the Drive-mount snippet after this list)
  - Notebooks tagged `_FT` are for finetuning
  - Notebooks tagged `_Eval` evaluate the respective finetuned model
- switching to GPU mode
- running as-is
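Mounting your Drive inside Colab is the standard two-liner below; the data path your notebook should point at depends on where you placed `translated_data/` in your Drive (the path shown is an assumption).

```python
from google.colab import drive

# Mount your Google Drive so the notebook can read translated_data/.
drive.mount("/content/drive")

# Example path; adjust to wherever you placed the folder in your Drive.
DATA_DIR = "/content/drive/MyDrive/translated_data"
```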
Notes:
- The notebook should run without errors if the filepaths are fixed according to your GDrive setup. If you run into any issues, please feel free to drop an Issue.
- We recommend configuring a Premium Class + High RAM GPU on Google Colab; more often than not, with Colab+, you'll be assigned an A100!
If running the python scripts on the SOC Cluster, please follow these instructions:
- Set up the env by running the following lines:
python3 -m venv mlesnli_train
source mlesnli_train/bin/activate
pip3 install -r requirements.txt
- Make the appropriate changes to the following files:
  - experiment name and output/save directories in `run_train_job.sh` and `run_inference_job.sh`, found in `Scripts/`
  - model name and filepaths in `train.py` and `inference.py`, found in `src/python/`
- Run `chmod +x` on `run_train_job.sh` and `run_inference_job.sh`
- Run `./Scripts/run_train_job.sh`. Once done, run `./Scripts/run_inference_job.sh`
Notes:
- The configurations that can be changed have `# change` comments beside them (see the sketch after these notes).
- For the models, you can choose between `google/flan-t5-large`, `google/flan-t5-base`, and `google/flan-t5-small`.
- We do not include the DeepSpeed code since the SOC Cluster allows jobs long enough to handle our training runs.
- The scripts auto-configure the cluster to assign us an A100 80GB machine.
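For reference, the `# change` lines look roughly like the hypothetical sketch below; the actual variable names in `train.py` may differ, so treat this purely as an illustration of the kind of values you need to edit.

```python
# Hypothetical excerpt of the kind of lines flagged with "# change" in
# src/python/train.py -- actual variable names may differ.
MODEL_NAME = "google/flan-t5-large"      # change: also accepts flan-t5-base / flan-t5-small
DATA_DIR = "translated_data/"            # change: path to the MLe-SNLI CSVs
OUTPUT_DIR = "outputs/flan_t5_large_ft"  # change: where checkpoints/logs are saved
```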
If you encounter any difficulties, please raise an Issue. If you have any suggestions or improvements, raise a PR!