Title: Towards story generation
Abstract: Story generation is difficult to formalize and evaluate computationally, and tackling the problem raises many important questions. What should we consider the base unit of a story (a sentence? a paragraph? a chapter?)? What kind of data should we use to train these models (novels? short stories? overly simplistic mechanically-turked paragraphs?)? Is any current model architecture capable of producing long-form narratives with some semblance of coherent discourse structure, such as plot arcs and character development? When evaluating our models' outputs, can we do better than asking people to rate the text on vaguely defined properties such as "enjoyability"? In this talk, I'll discuss my lab's ongoing work on story generation, introducing a new dataset and evaluation method that we hope will spur progress in this area, and describing fine-tuning strategies for large-scale Transformers that produce more coherent and stylistically consistent stories. A major bottleneck of these models is their memory and speed inefficiency; I'll therefore conclude by discussing heavily simplified Transformer language models that make training less expensive without sacrificing output quality.