SPIRS Sarcasm Dataset

SPIRS is a dataset of 15,000 sarcastic tweets, collected using reactive supervision, a new data-collection method. Reactive supervision captures both intended sarcasm (as meant by the author) and perceived sarcasm (as understood by a reader).

SPIRS stands for Sarcasm, Perceived and Intended, by Reactive Supervision :)

To find out more about SPIRS and reactive supervision, check out the reactive supervision paper, or read the Medium article.

Use this repository to download SPIRS. The repository includes the following data files:

SPIRS-sarcastic-ids.csv: the sarcastic tweet IDs (15,000 "positive" samples)
SPIRS-non-sarcastic-ids.csv: the non-sarcastic tweet IDs (15,000 "negative" samples)

Additional fields for each sarcastic tweet include the sarcasm perspective (intended/perceived), the author sequence, and the contextual tweet IDs (cue, oblivious, and eliciting tweets). For details, see the reactive supervision paper.
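
As a rough illustration of how these fields could be used, the sketch below tallies the sarcastic tweets by perspective. The column name `perspective` and the presence of a header row are assumptions made for this example, not something this README specifies; check the first line of SPIRS-sarcastic-ids.csv and adjust the name accordingly.

```python
import csv
from collections import Counter


def count_by_field(csv_path, field="perspective"):
    """Count rows per distinct value of `field`.

    Assumes the CSV starts with a header row. `field` is a hypothetical
    column name and may be different in the real file.
    """
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        # Fail early with a readable message if the assumed column is absent.
        if reader.fieldnames is None or field not in reader.fieldnames:
            raise KeyError(
                f"column {field!r} not found; available: {reader.fieldnames}"
            )
        return Counter(row[field] for row in reader)
```

Calling `count_by_field("SPIRS-sarcastic-ids.csv")` would then return a `Counter` keyed by whatever perspective values the file contains.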

To comply with Twitter's privacy policy, the dataset files include only the tweet IDs. To fetch the tweet texts, follow these steps:

1. Install the latest version of Tweepy:

   pip3 install tweepy

2. Rename credentials-example.py to credentials.py
3. Add your Twitter API credentials by editing credentials.py
4. Run the script:

   python3 fetch-tweets.py
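
For readers who want to adapt the fetching step, here is a minimal sketch of the batching such a download typically needs, since Twitter's tweet lookup endpoints accept at most 100 IDs per request. It is not taken from fetch-tweets.py, whose internals are not described here. The names `batched`, `fetch_texts`, and `lookup` are hypothetical; `lookup` stands in for any function that takes a list of IDs and returns (id, text) pairs, for example a thin wrapper around Tweepy's `API.lookup_statuses` in Tweepy 4.x.

```python
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

BATCH_SIZE = 100  # upper bound on IDs per lookup request


def batched(ids: Iterable[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield successive lists of at most `size` IDs."""
    batch: List[str] = []
    for tweet_id in ids:
        batch.append(tweet_id)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # final, possibly shorter, batch
        yield batch


def fetch_texts(
    ids: Iterable[str],
    lookup: Callable[[List[str]], Iterable[Tuple[str, str]]],
) -> Dict[str, str]:
    """Return {tweet_id: text} for every tweet that `lookup` returns.

    IDs that the lookup does not return (for example deleted or
    protected tweets) are simply absent from the result.
    """
    texts: Dict[str, str] = {}
    for batch in batched(ids):
        for tweet_id, text in lookup(batch):
            texts[str(tweet_id)] = text
    return texts
```

Passing the lookup in as a parameter keeps the batching logic testable without network access or credentials. Expect the result to hold fewer tweets than the ID files list, because some tweets may no longer be available.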

The script will fetch the texts and create two new files, one for sarcastic and the other for non-sarcastic tweets:

SPIRS-sarcastic.csv
SPIRS-non-sarcastic.csv
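
To train a classifier, the two output files usually need to be combined into one labeled file. The sketch below shows one way to do that. It assumes both files start with a header row and share the columns listed in `columns`; the function name `merge_with_labels` and any column names you pass (such as "id" or "text") are hypothetical and should be checked against the files the script actually produces.

```python
import csv


def merge_with_labels(sarcastic_path, non_sarcastic_path, out_path, columns):
    """Write the chosen columns of both files to one CSV with a label column.

    label 1 = sarcastic, label 0 = non-sarcastic.
    Returns the number of data rows written.
    """
    written = 0
    with open(out_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        writer.writerow(list(columns) + ["label"])
        for path, label in ((sarcastic_path, 1), (non_sarcastic_path, 0)):
            with open(path, newline="", encoding="utf-8") as f:
                # DictReader uses the header row; a missing column raises KeyError.
                for row in csv.DictReader(f):
                    writer.writerow([row[c] for c in columns] + [label])
                    written += 1
    return written
```

Selecting columns by name lets the sarcastic file carry extra fields (such as the perspective) without breaking the merge.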

Citation

Please cite the paper using the following BibTeX entry:

@inproceedings{shmueli:reactive-supervision,
    title = {Reactive Supervision: A New Method for Collecting Sarcasm Data},
    author = {Shmueli, Boaz and Ku, Lun-Wei and Ray, Soumya},
    booktitle = {Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing},
    year = {2020},
    publisher = {Association for Computational Linguistics}
}


Repository: soumyaray/SPIRS
