Dillon, the fictional, close-knit rural Texas town portrayed in the television series Friday Night Lights, was presented as a place whose inhabitants obsess over three things: God, family, and football. If the Mapping Texts project, a National Endowment for the Humanities-sponsored collaboration between the University of North Texas and Stanford University's Bill Lane Center for the American West, is to be believed, this portrayal might not be too far from the truth.
This project first came to my notice while I was perusing the domain-b.com website, where I found a post about it. Mapping Texts' mission statement describes it as an "experiment with new methods for finding and analyzing meaningful patterns embedded within massive collections of historical newspapers." Drawing on a corpus of about 232,500 pages of historical newspapers digitized by the Texas Digital Newspaper Program in connection with the Chronicling America project, the project comprises "two interactive visualizations that allow you [sic] to explore both the quality of these digitized newspapers and the major language patterns."
The first interactive visualization, Assessing Newspaper Quality, "plots the quantity and quality of 232,567 pages of historical Texas newspapers, as they spread out over time and space. The graphs plot the overall quantity of information available by year and the quality of the corpus (by comparing the number of words we can recognize to the total number scanned). The map shows the geography of the collection, grouping all newspapers by their publication city, and can show both the quantity and quality of the newspapers from various locations. Clicking on a particular city will provide a detailed view of the individual newspapers, where you [sic] can examine both the quantity and quality of information. A timeline of historical events related to Texas is also available for context."
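The quality measure described in that passage, recognized words compared with total words scanned, can be illustrated with a minimal Python sketch. The function name, the tiny vocabulary, and the sample page are all invented for illustration; the project's actual word list and tokenization are not described here and may well differ.

```python
import string


def recognition_ratio(ocr_text, known_words):
    """Return the share of scanned tokens that appear in a known vocabulary.

    Hypothetical helper: a rough stand-in for "the number of words we can
    recognize" divided by "the total number scanned".
    """
    # Split on whitespace, lowercase, and trim surrounding punctuation.
    tokens = [t.strip(string.punctuation) for t in ocr_text.lower().split()]
    tokens = [t for t in tokens if t]
    if not tokens:
        return 0.0
    recognized = sum(1 for t in tokens if t in known_words)
    return recognized / len(tokens)


# Invented vocabulary and OCR output; "footbal1" mimics a scanning error.
known_words = {"the", "football", "team", "won", "on", "friday"}
page = "The footbal1 team won on Friday."

ratio = recognition_ratio(page, known_words)
print(f"{ratio:.2f}")
```

Here five of the six tokens are found in the vocabulary, so the page scores roughly 0.83. A page full of OCR errors would score much lower, which is the kind of signal the quality graphs appear to surface.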
The second interactive visualization, Assessing Language Patterns, "plots the language patterns embedded in 232,567 pages of historical Texas newspapers, as they evolved over time and space. For any date range and location, you [sic] can browse the most common words (word counts), named entities (people, places, etc), and highly correlated words (topic models)."
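Of the three views mentioned there, word counts are the simplest to picture. The sketch below filters a handful of pages by city and year range and tallies the most common words. The page records, the stopword list, and the `top_words` function are assumptions made up for this example, not the project's data or code.

```python
from collections import Counter

# A deliberately tiny, illustrative stopword list.
STOPWORDS = {"the", "a", "of", "and", "in", "on", "at"}


def top_words(pages, city, start_year, end_year, n=10):
    """Count the most common non-stopwords for one city and date range."""
    counts = Counter()
    for page in pages:
        if page["city"] == city and start_year <= page["year"] <= end_year:
            words = (w.strip(".,;:!?\"'").lower() for w in page["text"].split())
            counts.update(w for w in words if w and w not in STOPWORDS)
    return counts.most_common(n)


# Invented sample pages; "Dillon" is the fictional town mentioned above.
pages = [
    {"city": "Dillon", "year": 1925,
     "text": "Football season opens; football crowd gathers at church"},
    {"city": "Dillon", "year": 1950,
     "text": "Church picnic and football game"},
    {"city": "Elsewhere", "year": 1925,
     "text": "Cotton prices rise"},
]

result = top_words(pages, "Dillon", 1920, 1930, n=3)
print(result)
```

Widening the year range or switching the city changes which pages are counted, which mirrors how the visualization lets a visitor browse by date range and location.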
When a visitor clicks on the icon for "Modern Texas," a map of the state appears with a visualization displaying both the location and quantity of digitized newspapers available for analysis. As the visitor scrolls through eras, a box also lists the 10 most discussed topics in newspapers during that time span.
It should be no surprise that government, business, and politics make frequent appearances, but it is interesting to note how often, especially in more rural locations, the list is dominated by sports, family, and church. As the domain-b.com article notes, "a quick look through this history reveals that coverage of sports elbows its way into the top 10 in the early 20th century."
Mapping Texts is a perfect example of how Digital Humanities can utilize Big Data. As we discussed in our blog post from last week, Big Data's Big Horizons, Digital Humanities is benefiting from the ability to analyze larger data sets than were previously available. This not only shifts the material available for inquiry; it can also fundamentally shift the inquiries themselves.
What does it say about these Texas towns that sports, family, and church feature so prominently in their news coverage? How would this compare to, say, rural Connecticut or San Francisco? What does it say about the collective interests, hopes, fears, and investments of the people in these areas? What does it indicate about the way the press both reflects and contours individual and group thinking?
Mapping Texts is a fantastic use of Big Data, and it also shows the promise Big Data has for creating new avenues of humanist inquiry. Big Data is a powerful new hermeneutic tool, but it must be deployed by skilled thinkers in service of worthwhile questions to be meaningful.