TeenQA is a large-scale, topic-specific QA dataset constructed from Quora, containing ~70K questions commonly encountered by teenagers and ~2M community-provided answers. TeenQA is intended for non-commercial research only, to promote advances in E-bibliotherapy and NLP (e.g., question generation, reading comprehension, and multi-document summarization).
The dataset is provided "as is" without warranty. Please contact us at xinyx16@mails.tsinghua.edu.cn if you own any of the documents made available but do not want them in this dataset.
See paper: Generating Instructive Questions from Multiple Articles to Guide Reading in E-Bibliotherapy
- Construction process
- Topic distribution
- Question types
- An example
(ALL data will be made public once the paper is accepted.)
- All data TeenQA_all_697105.json
- Training set TeenQA_train_694836.json
- Validation set TeenQA_val_1269.json
- Testing set TeenQA_test_1000.json
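A minimal sketch of how the files listed above might be loaded and sanity-checked once they are released. It rests on two assumptions that this README does not confirm: that each file is a single UTF-8 JSON document, and that the number at the end of each file name is the record count of that split. The helper names (`size_from_name`, `load_split`, `splits_are_consistent`) are hypothetical and not part of any released tooling.

```python
import json
import re
from pathlib import Path

# File names exactly as listed in this README; the files themselves
# may not be publicly available yet.
SPLIT_FILES = {
    "all": "TeenQA_all_697105.json",
    "train": "TeenQA_train_694836.json",
    "val": "TeenQA_val_1269.json",
    "test": "TeenQA_test_1000.json",
}


def size_from_name(filename):
    """Return the trailing integer of a file name.

    Assumption: this integer is the number of records in the split.
    """
    match = re.search(r"_(\d+)\.json$", filename)
    if match is None:
        raise ValueError(f"no trailing count in {filename!r}")
    return int(match.group(1))


def load_split(path):
    """Load one split, assuming the file is a single UTF-8 JSON document."""
    with Path(path).open(encoding="utf-8") as handle:
        return json.load(handle)


def splits_are_consistent(files=SPLIT_FILES):
    """Check that the train, val and test sizes add up to the 'all' size."""
    parts = sum(size_from_name(files[key]) for key in ("train", "val", "test"))
    return parts == size_from_name(files["all"])
```

Under the file-name assumption, `splits_are_consistent()` compares 694836 + 1269 + 1000 against 697105. No field names are assumed here because the record schema is not described in this README; inspect the object returned by `load_split` before relying on any particular structure.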
@article{TeenQA,
author = {Yunxing Xin and
Xiaohao He and
Ling Feng},
title = {Generating Instructive Questions from Multiple Articles to Guide Reading in E-Bibliotherapy},
journal = {},
year = {2018}
}