Applications of Deep Neural Networks to Neurocognitive Poetics: A Quantitative Study of the Project Gutenberg English Poetry Corpus
With the advent of sophisticated computer technology, computational techniques are increasingly used to study problems in a variety of disciplines, including the humanities. In a field such as poetry, where classic works are re-analyzed over the course of years, decades, or even centuries, there is demand for fresh approaches to familiar tasks, and such breaks from convention may even be necessary for the field to advance. Existing quantitative studies of poetry have employed computational techniques in their analyses; however, work remains to be done on deploying deep neural networks on large corpora of poetry to classify portions of the works they contain based on certain features. While applications of neural networks to social media sites, consumer reviews, and other web-originated data are common within computational linguistics and natural language processing, comparatively little work has applied the same techniques to the computational analysis of poetry. In this work, I lay out first steps for the study of poetry using neural networks. Using a convolutional neural network to classify author birth date, I was able not only to extract a non-trivial signal from the data but also to identify clustering in by-author model accuracy. Although no definitive conclusions about the cause of this clustering were reached, investigating it reveals immense heterogeneity in the traits of accurately classified authors. Further study may unpack this clustering and reveal key insights about how temporal information is encoded in poetry. The study of poetry using neural networks remains very open, but it shows potential to be an interesting and deep area of work.