Introduction
This project uses Voyant Tools to analyze Life and Adventures of Frank and Jesse James and to explore how the narrative constructs the identities of the two brothers. I chose this text because of its connection to Northfield: I wanted to examine how it portrays the outlaws whose actions are tied to the town’s history.
Sources
The dataset is a plain-text (.txt) file from Project Gutenberg’s public domain edition of Life and Adventures of Frank and Jesse James. Some data cleaning was necessary because Project Gutenberg files include header and footer metadata, licensing information, and illustration captions. Before uploading the file to Voyant Tools, I used TextEdit to remove the header and footer text and all of the licensing material. The cleaned corpus contains around 85,000 words.
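The cleanup described above was done by hand in TextEdit, but the same step could be scripted. The following is a minimal Python sketch, assuming the file uses Project Gutenberg's usual "*** START OF ..." and "*** END OF ..." marker lines. The function names and the filename in the usage note are hypothetical, and the sketch does not remove illustration captions.

```python
import re

# Marker lines that Project Gutenberg places around the book text.
# Older files say "THIS PROJECT GUTENBERG EBOOK", newer ones say "THE ...".
START_RE = re.compile(
    r"^\*\*\* ?START OF (?:THE|THIS) PROJECT GUTENBERG EBOOK.*$", re.MULTILINE
)
END_RE = re.compile(
    r"^\*\*\* ?END OF (?:THE|THIS) PROJECT GUTENBERG EBOOK.*$", re.MULTILINE
)


def strip_gutenberg(text: str) -> str:
    """Return only the text between the START and END marker lines.

    If a marker is missing, fall back to the start or end of the file.
    """
    start = START_RE.search(text)
    end = END_RE.search(text)
    begin = start.end() if start else 0
    finish = end.start() if end else len(text)
    return text[begin:finish].strip()


def word_count(text: str) -> int:
    """Count whitespace-separated words (a rough measure of corpus size)."""
    return len(text.split())
```

To use it, read the downloaded file (for example, a hypothetical "james.txt") into a string, pass it to strip_gutenberg, and save the result before uploading it to Voyant Tools. The word_count helper gives a rough check on corpus size, though its total may differ slightly from the one Voyant reports because the two may tokenize differently.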
Processes
I used Voyant Tools to conduct a distant reading of the text, focusing on three visualizations: Cirrus, TermsBerry, and Trends. Together, these tools gave me a good sense of how frequently terms appear and how they relate to one another. The Cirrus word cloud helped me quickly identify the most frequently used words in the corpus. The TermsBerry visualization allowed me to examine which frequently used words tend to appear near one another. The Trends visualization divides the text into segments to track how specific words rise and fall in usage across the narrative.
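The ideas behind Cirrus (most frequent words) and Trends (word counts per segment) can be approximated in a few lines of Python. This is only a sketch of the general technique, not Voyant's actual implementation: the stopword list is a tiny illustrative stand-in, the default of 10 segments is an assumption, and all function names are hypothetical.

```python
import re
from collections import Counter

# A tiny illustrative stopword list; a real analysis would use a much longer one.
STOPWORDS = {"the", "and", "of", "a", "to", "in", "was", "he", "his", "that"}


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def top_terms(tokens: list[str], n: int = 10) -> list[tuple[str, int]]:
    """Return the n most frequent non-stopword terms (the idea behind a word cloud)."""
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    return counts.most_common(n)


def trends(tokens: list[str], term: str, segments: int = 10) -> list[int]:
    """Count `term` in each of `segments` roughly equal slices of the text."""
    n = len(tokens)
    # Segment boundaries as token indices: 0, n/segments, 2n/segments, ..., n.
    bounds = [n * i // segments for i in range(segments + 1)]
    return [tokens[a:b].count(term) for a, b in zip(bounds, bounds[1:])]
```

For example, trends(tokens, "jesse") returns one count per segment, which can be plotted as a line to see where in the narrative the name is used most.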
Presentation
The Voyant Tools interface is embedded directly in the webpage using custom HTML code, so readers can interact with the analysis rather than only interpreting a screenshot of the tools. I kept the page layout simple and minimal so that the tools are easy to read and navigate. The embedded tool lets readers explore the different trends that appear throughout the text.
Significance
This analysis shows how repetition shapes the story of the James brothers. The high frequency of terms such as “robbery”, “bank”, and “outlaws” reinforces the brothers’ identity as outlaws. There is also a difference in frequency between the two names: Jesse appears almost one hundred more times than Frank. This suggests an imbalance in narrative emphasis, which may help shape the cultural memory of one brother over the other. That imbalance is consistent with my own experience, as I had only heard of Jesse James before this project. This approach reflects the practice of distant reading, which uses quantitative tools to reveal patterns while recognizing that those tools cannot fully capture the tone or context of a text. The project shows how digital analysis can complement close reading and make literary interpretation interactive and publicly accessible.
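The comparison between the two names can be checked outside Voyant with a short script. This is a sketch under stated assumptions: the function names are hypothetical, and matching is case-sensitive and whole-word so that the adjective "frank" and longer words such as "Franklin" are not counted. A tool that lowercases the text may report somewhat different totals.

```python
import re


def name_counts(text: str, names: tuple[str, ...] = ("Jesse", "Frank")) -> dict[str, int]:
    """Count whole-word, case-sensitive occurrences of each name."""
    return {
        name: len(re.findall(rf"\b{re.escape(name)}\b", text))
        for name in names
    }


def frequency_gap(counts: dict[str, int], a: str = "Jesse", b: str = "Frank") -> int:
    """Return how many more times name `a` appears than name `b`."""
    return counts[a] - counts[b]
```

Running name_counts on the cleaned corpus and passing the result to frequency_gap gives the difference between the two names directly.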