Abstract
A human author can write a story of any length without losing coherence and can bring it to a proper ending, abilities that current language models lack. In this work, we present LongStory, a model for coherent, complete, and length-controlled long story generation. LongStory introduces two novel methodologies: (1) the long- and short-term contexts weight calibrator (CWC) and (2) long story structural positions (LSP). The CWC adjusts the weights of the long-term context (Memory) and the short-term context (Cheating), acknowledging their distinct roles. The LSP employs discourse tokens to convey the structural position within a long story. Trained on three datasets with varied average story lengths, LongStory outperforms other baselines, including the strong story generator PlotMachines, in coherence, completeness, relevance, and repetitiveness. We also perform zero-shot tests on each dataset to assess the model's ability to generalize beyond its training data, and we validate our methodology by comparing its performance with variants of our model.
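Below is a minimal, hypothetical sketch of the CWC idea described in the abstract: a small calibrator scores the pooled long-term context (Memory) and short-term context (Cheating) and mixes them with normalized weights before they condition the decoder. The class name, dimensions, scorer architecture, and pooling are illustrative assumptions, not the authors' implementation; LSP discourse tokens would additionally be supplied to the model as special input tokens marking structural position.

```python
# Illustrative sketch only: names, dimensions, and the calibrator
# architecture are assumptions, not the paper's actual implementation.
import torch
import torch.nn as nn

class ContextWeightCalibrator(nn.Module):
    """Mixes long-term (Memory) and short-term (Cheating) context vectors."""

    def __init__(self, hidden_dim: int = 768):
        super().__init__()
        # Small scorer mapping the concatenated contexts to two mixing weights.
        self.scorer = nn.Linear(2 * hidden_dim, 2)

    def forward(self, memory: torch.Tensor, cheating: torch.Tensor) -> torch.Tensor:
        # memory, cheating: (batch, hidden_dim) pooled context vectors.
        weights = torch.softmax(self.scorer(torch.cat([memory, cheating], dim=-1)), dim=-1)
        # Convex combination: the two weights sum to 1 for every example.
        return weights[:, :1] * memory + weights[:, 1:] * cheating

# Usage with random vectors standing in for encoded contexts.
calibrator = ContextWeightCalibrator()
memory = torch.randn(4, 768)    # long-term context, e.g. a running plot summary
cheating = torch.randn(4, 768)  # short-term context, e.g. the previous paragraph
print(calibrator(memory, cheating).shape)  # torch.Size([4, 768])
```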
Notes
- 1.
We also cover a test in which \(\alpha\) is a learnable parameter rather than a constant hyperparameter in Sect. 4.3.2. In this test, BERT-tiny determines \(\alpha\), \(\beta\), and \(\gamma\) independently and then divides each by their sum, so that the three weights add up to 1 (a minimal sketch follows these notes).
- 2.
- 3.
We crawl this dataset from https://blog.reedsy.com/short-stories/.
- 4.
- 5.
Note that the in-self-BLEU score is not the same as the self-BLEU score [6]. Self-BLEU takes one whole generated document as the hypothesis and the other generated documents as references, so it cannot capture repetitiveness within a single document (a minimal sketch follows these notes).
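As a companion to note 1, here is a minimal sketch of the described normalization, assuming a pooled BERT-tiny vector as input and three independent scalar heads; the softplus activation and head design are illustrative assumptions.

```python
# Illustrative sketch of the note-1 normalization: three independently
# scored weights are divided by their sum so that alpha + beta + gamma = 1.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AlphaBetaGammaHead(nn.Module):
    def __init__(self, hidden_dim: int = 128):
        super().__init__()
        # One independent scalar head per weight (alpha, beta, gamma).
        self.heads = nn.ModuleList([nn.Linear(hidden_dim, 1) for _ in range(3)])

    def forward(self, pooled: torch.Tensor) -> torch.Tensor:
        # pooled: (batch, hidden_dim), e.g. the [CLS] vector from BERT-tiny.
        scores = torch.cat([F.softplus(h(pooled)) for h in self.heads], dim=-1)
        # Divide by the sum so the three weights add up to 1 per example.
        return scores / scores.sum(dim=-1, keepdim=True)

head = AlphaBetaGammaHead()
weights = head(torch.randn(2, 128))
print(weights.sum(dim=-1))  # tensor([1., 1.]) up to floating-point error
```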
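Similarly, as a companion to note 5, the sketch below contrasts self-BLEU with a within-document ("in-self-BLEU" style) score; scoring each paragraph of one document against that document's other paragraphs is an assumption about the formulation, used here only to illustrate the distinction.

```python
# Illustrative sketch only: the paragraph-level formulation of in-self-BLEU
# is an assumption; self-BLEU follows the description in note 5.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

_smooth = SmoothingFunction().method1

def self_bleu(documents: list[list[str]]) -> float:
    """documents: tokenized generated documents; each document serves as the
    hypothesis once, with all other documents as references."""
    scores = [
        sentence_bleu(documents[:i] + documents[i + 1:], doc, smoothing_function=_smooth)
        for i, doc in enumerate(documents)
    ]
    return sum(scores) / len(scores)

def in_self_bleu(paragraphs: list[list[str]]) -> float:
    """paragraphs: tokenized paragraphs of a single generated document;
    higher values indicate more repetition inside that document."""
    scores = [
        sentence_bleu(paragraphs[:i] + paragraphs[i + 1:], para, smoothing_function=_smooth)
        for i, para in enumerate(paragraphs)
    ]
    return sum(scores) / len(scores)

doc = [p.split() for p in [
    "the knight rode north through the storm",
    "the knight rode north through the rain",
    "a stranger waited at the ruined gate",
]]
print(in_self_bleu(doc))
```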
References
Beltagy, I., Peters, M.E., Cohan, A.: Longformer: the long-document transformer. arXiv preprint arXiv:2004.05150 (2020)
Lin, Z., Riedl, M.O.: Plug-and-blend: a framework for plug-and-play controllable story generation with sketches. In: Proceedings of the AAAI Conference on Artificial Intelligence and Interactive Digital Entertainment, vol. 17, no. 1, pp. 58–65 (2021)
Fan, A., Lewis, M., Dauphin, Y.: Hierarchical neural story generation. arXiv preprint arXiv:1805.04833 (2018)
OpenAI: GPT-4 technical report (2023)
Peng, N., Ghazvininejad, M., May, J., Knight, K.: Towards controllable story generation. In: Proceedings of the First Workshop on Storytelling, pp. 43–49 (2018)
Rashkin, H., Celikyilmaz, A., Choi, Y., Gao, J.: PlotMachines: outline-conditioned generation with dynamic plot state tracking. arXiv preprint arXiv:2004.14967 (2020)
Yang, K., Peng, N., Tian, Y., Klein, D.: Re3: generating longer stories with recursive reprompting and revision. arXiv preprint arXiv:2210.06774 (2022)
Tang, C., Lin, C., Huang, H., Guerin, F., Zhang, Z.: EtriCA: event-triggered context-aware story generation augmented by cross attention. arXiv preprint arXiv:2210.12463 (2022)
Brown, T., et al.: Language models are few-shot learners. In: Advances in Neural Information Processing Systems, vol. 33, pp. 1877–1901 (2020)
Lewis, M., et al.: BART: denoising sequence-to-sequence pre-training for natural language generation, translation, and comprehension. arXiv preprint arXiv:1910.13461 (2019)
Rose, S., Engel, D., Cramer, N., Cowley, W.: Automatic keyword extraction from individual documents. Text Mining: Applications and Theory, 1–20 (2010)
Wang, S., Durrett, G., Erk, K.: Narrative interpolation for generating and understanding stories. arXiv preprint arXiv:2008.07466 (2020)
Yang, K., Klein, D., Peng, N., Tian, Y.: DOC: improving long story coherence with detailed outline control. arXiv preprint arXiv:2212.10077 (2022)
Kryściński, W., Rajani, N., Agarwal, D., Xiong, C., Radev, D.: BookSum: a collection of datasets for long-form narrative summarization. arXiv preprint arXiv:2105.08209 (2021)
Yao, L., Peng, N., Weischedel, R., Knight, K., Zhao, D., Yan, R.: Plan-and-write: towards better automatic storytelling. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, no. 01, pp. 7378–7385 (2019)
Alabdulkarim, A., Li, W., Martin, L.J., Riedl, M.O.: Goal-directed story generation: augmenting generative language models with reinforcement learning. arXiv preprint arXiv:2112.08593 (2021)
Tambwekar, P., Dhuliawala, M., Martin, L.J., Mehta, A., Harrison, B.: Controllable neural story plot generation via reward shaping. In: Proceedings of the International Joint Conference on Artificial Intelligence, pp. 5982–5988 (2019)
Guan, J., Huang, F., Zhao, Z., Zhu, X., Huang, M.: A knowledge-enhanced pretraining model for commonsense story generation. Trans. Assoc. Comput. Linguist. 8, 93–108 (2020)
Peng, X., Li, S., Wiegreffe, S., Riedl, M.: Inferring the reader: guiding automated story generation with commonsense reasoning. arXiv preprint arXiv:2105.01311 (2021)
Vaswani, A., et al.: Attention is all you need. In: Advances in Neural Information Processing Systems, vol. 30 (2017)
Lin, C.Y.: ROUGE: a package for automatic evaluation of summaries. In: Text Summarization Branches Out, pp. 74–81 (2004)
Papineni, K., Roukos, S., Ward, T., Zhu, W.J.: BLEU: a method for automatic evaluation of machine translation. In: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, pp. 311–318 (2002)
Safovich, Y., Azaria, A.: Fiction sentence expansion and enhancement via focused objective and novelty curve sampling. In: 2020 IEEE 32nd International Conference on Tools with Artificial Intelligence (ICTAI), pp. 835–843. IEEE (2020)
Li, J., Bing, L., Qiu, L., Chen, D., Zhao, D., Yan, R.: Learning to write stories with thematic consistency and wording novelty. In: Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, no. 01, pp. 1715–1722 (2019)
Hu, Z., Chan, H.P., Liu, J., Xiao, X., Wu, H., Huang, L.: PLANET: dynamic content planning in autoregressive transformers for long-form text generation. arXiv preprint arXiv:2203.09100 (2022)
Yang, K., Klein, D.: FUDGE: controlled text generation with future discriminators. arXiv preprint arXiv:2104.05218 (2021)
Sakaguchi, K., Bhagavatula, C., Bras, R.L., Tandon, N., Clark, P., Choi, Y.: proScript: partially ordered scripts generation via pre-trained language models. arXiv preprint arXiv:2104.08251 (2021)
Budzianowski, P., Vulić, I.: Hello, it's GPT-2 - how can I help you? Towards the use of pretrained language models for task-oriented dialogue systems. arXiv preprint arXiv:1907.05774 (2019)
Welleck, S., Kulikov, I., Kim, J., Pang, R.Y., Cho, K.: Consistency of a recurrent language model with respect to incomplete decoding. arXiv preprint arXiv:2002.02492 (2020)
Zellers, R., et al.: Defending against neural fake news. In: Advances in Neural Information Processing Systems, vol. 32 (2019)
Guan, J., Mao, X., Fan, C., Liu, Z., Ding, W., Huang, M.: Long text generation by modeling sentence-level and discourse-level coherence. arXiv preprint arXiv:2105.08963 (2021)
Lewis, P., et al.: Retrieval-augmented generation for knowledge-intensive NLP tasks. In: Advances in Neural Information Processing Systems, vol. 33, pp. 9459–9474 (2020)
McCoy, R.T., Smolensky, P., Linzen, T., Gao, J., Celikyilmaz, A.: How much do language models copy from their training data? Evaluating linguistic novelty in text generation using RAVEN. Trans. Assoc. Comput. Linguist. 11, 652–670 (2023)
Acknowledgements
We thank the anonymous reviewers for their constructive and insightful comments. K. Jung is with ASRI, Seoul National University, Korea. This work was supported by an Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) [No. 2022-0-00184, Development and Study of AI Technologies to Inexpensively Conform to Evolving Policy on Ethics, and No. 2021-0-01343, Artificial Intelligence Graduate School Program (Seoul National University)].
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Park, K., Yang, N., Jung, K. (2024). LongStory: Coherent, Complete and Length Controlled Long Story Generation. In: Yang, DN., Xie, X., Tseng, V.S., Pei, J., Huang, JW., Lin, J.CW. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2024. Lecture Notes in Computer Science(), vol 14646. Springer, Singapore. https://doi.org/10.1007/978-981-97-2253-2_15
DOI: https://doi.org/10.1007/978-981-97-2253-2_15
Published:
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-2252-5
Online ISBN: 978-981-97-2253-2
eBook Packages: Computer Science, Computer Science (R0)