

digital humanities

Creating Collaborative Data Space: a Profile of the New Social Sciences Data Laboratory

Why write about a social sciences data laboratory on a humanities blog?

Looking at lists of Berkeley departments and programs in the Humanities and in the Social Sciences, we might wonder why history is in the social sciences division, while the history of art is in the humanities; why anthropology is a social science department, while Ancient History and Mediterranean Archaeology is a humanities program; why Linguistics is not found in the same division as English, Slavic, Chinese, French, German, Japanese.... From the outside, such distinctions can be invisible. For undergraduates in engineering, for instance, all of those different fields are lumped together into one “breadth” requirement, often referred to simply as H/SS. There are reasons for all of this, I’m sure. But are they good ones? I won’t dare answer that question here.

Nevertheless, we might think of several generations of anti-positivist and postmodern scholars who have questioned the nature of the social sciences, asking, "just how scientific are they?" And more recently, some literature scholars have sought to put their fields on a more empirical foundation using digital methodologies (that is, by quantifying and measuring text, and by testing and verifying conclusions). The Townsend Center has stressed interdisciplinarity through its core programs for years, and interdisciplinary programs have been a prominent feature of undergraduate education for at least two decades. Many of these trends have brought plenty of controversy. Yet, from where I sit, it looks as though the social sciences and the humanities—or at least certain segments among them—are closer together, in methods and in theory, than they have been for a long time, and disciplinary boundaries are perhaps fuzzier than our institutional structure would suggest. As we move forward, it is possible that digital methods and data analysis will have a role to play in all of this.

This is why the new Social Sciences Data Laboratory (D-Lab) is an appropriate topic for a humanities blog, and why, as a digital humanities blogger, I am excited by what is being done there. (In the interest of full disclosure, I should say that in my capacity as Digital History Coordinator, I have been working closely with D-Lab from the beginning.) D-Lab has emerged to meet an important need: that of researchers to keep pace with rapidly changing data and information technologies.

D-Lab was created following a report from the office of Vice Chancellor for Research Graham Fleming, which called for a more dynamic and flexible laboratory-based approach to social science research, amid the university-wide reorganization stimulated by recent losses in state funding. Led by historian of science Cathryn Carson, Associate Dean of the Social Sciences, D-Lab results from a reorganization of several groups, including UC Data, an archive and a provider of large-scale databases formerly part of the Survey Research Center, which closed in 2010, and elements of the Social Science Computing Laboratory, a computing facility which dealt mainly with survey data.

This is more than just an institutional shuffle. In order to meet new technological and research demands that senior faculty do not always understand, D-Lab has taken a start-up approach, heavily involving graduate students at every stage of the process. Dean Carson commented on this approach in a D-Lab blog:

This isn’t always how Berkeley has operated. We are betting we can institutionalize this kind of responsiveness, even in the middle of the university’s long-established structures. We partly look outside for models—for fluid, adaptable organizations in the social and private sectors that are attentive to fitting into a scene that is constantly in flux. We also look inside. One thing crystallized out of the D-Lab design process: Berkeley’s graduate students are vastly talented and incredibly inventive. They are D-Lab’s first target audience, as well as the core of our start-up team.—Cathryn Carson, D-Lab as a Start-up

Only time will tell whether this graduate-student-led, start-up-style research institute is successful, but the early signs are good. In the two months since it opened, the workshops have been frequent and well attended, drawing faculty, staff, and librarians, as well as graduate students. And a core community of graduate student researchers is growing up in the space.

In the context of a certain amount of convergence among the various disciplines of the humanities and social sciences, rather than challenging institutional boundaries, D-Lab takes an agnostic approach. Attending a D-Lab event, you will hear this in their introduction: “D-Lab is about social sciences and data, but it polices the boundaries of neither.” This theoretical and disciplinary agnosticism is intended to keep D-Lab flexible and responsive to the emerging needs of researchers in a computing society, for whom the current disciplinary boundaries, both in and between the humanities and the social sciences, may not be so helpful for engaging with data.

On April 23, D-Lab will participate in "What Can Digital Humanities do for You?" co-sponsored by the Office of the Dean of the Arts and Sciences, The Townsend Center, the Berkeley Center for New Media, and Berkeley IS&T. The event will be followed by D-Lab's end of the semester celebration, "Opening the Black Box."

 

Three Weeks with Dan Cohen: A DH Microcosm

Last year, Professor Tom Laqueur asked me to recommend someone to invite to apply for a Townsend Avenali Resident Fellowship in history who would enhance our efforts in the area of digital humanities and computational social sciences. Dan Cohen was an obvious choice. Currently the Director of the Roy Rosenzweig Center for History and New Media at George Mason University, Professor Cohen is a true leader in the digital humanities and computational social sciences, so we were delighted that he agreed to come.

In anticipation of his residency, my colleague Rochelle Terman wrote about Professor Cohen’s contributions in a post that can be found here. Rather than repeat what has already been said, this post will reflect on Professor Cohen’s visit.

Professor Cohen’s invitation to be the Avenali Resident Fellow in history fits within a larger effort to build a digital humanities and computational social sciences community across disciplines, units, and fields here at UC Berkeley, so that researchers might be able to take better advantage of the rich resources and many talents spread about campus.  To the late Roy Rosenzweig, Professor Cohen’s predecessor and founder of CHNM, community building was necessary to fulfill the promise of digital technology. In his own work at the center, Professor Cohen carries this torch. His ‘big-tent,’ inclusive approach to digital methods, his vast knowledge of resources and projects, and his friendly demeanor all helped to further invigorate us in our efforts to create connections among Berkeley's immense but often disjointed resources and talent.

During his stay, Professor Cohen gave a public lecture on the changing nature of scholarly research and communication for the Computing and the Practice of History series, he led a discussion on Digital Humanities institutions for a new Townsend Brown Bag series (information about future events can be found here), he conferred with our Townsend-sponsored Digital Humanities Working Group, and he met individually with faculty, librarians, and graduate students. In addition, a major portion of his visit was dedicated to leading a special seminar, which was hosted in the new Social Sciences Data Lab (D-Lab).

In six meetings over three weeks, this seminar was a crash-course on digital scholarly methods, covering first-principles and methodological issues in several areas while introducing a panoply of new research tools. As it turns out, in ways both anticipated and surprising, Professor Cohen’s digital methods seminar proved to be a microcosm of the state of the digital humanities and computational social sciences, both here at Berkeley and in the world at large.

The seminar’s participants included junior faculty and graduate students from several fields of humanities and social sciences, from computer science, and from the School of Information, alongside librarians from the Bancroft and the Law Library, and IST staff. Among this group, computational abilities were as varied as fields: some participants were very much still beginners, others had robust technical skill sets, and at least one has recently been offered a job at Google.

Such diversity is no accident. As Professor Cohen has himself emphasized on many occasions, digital scholarly methods require multiple skill sets from several disciplines and fields. Hence, CHNM has computing specialists and designers working alongside its historians, and the Maryland Institute for Technology in the Humanities (MITH) is closely allied with the University of Maryland’s libraries. To wit: digital scholarly methods demand interdisciplinary collaboration.

Aiming to foster that sort of exchange, D-Lab graciously opened its freshly hung doors a bit early to welcome us for Professor Cohen’s class, and the lingering traces of construction in D-Lab suited our seminar well. After all, in the words of Timothy Hitchcock delivering the inaugural Computing and the Practice of History lecture in 2011, “we are halfway through a revolution.” Indeed, there we were, sitting around a cluster of borrowed tables in an unfinished lab, its wires still not hidden away, paint still wet on the walls, Professor Cohen reminding us that much work remains to be done.

It was truly a pleasure to join Professor Cohen for three weeks in this digital humanities microcosm, with all of our diverse interests and backgrounds, exploring the contours of a still very unfinished field.

What Are Patterns For? Big Data and Its Discontents

Around the end of each year, pundits play the parlor game of choosing a “word of the year.” Often, it’s some flash-in-the-pan phrase that seems to capture the zeitgeist. This year’s contenders included “YOLO” and “Eastwooding.” (Visit the Boston Globe for a roundup.) For Geoff Nunberg, Professor at UC Berkeley’s School of Information and pundit in his own right at NPR, that word was “big data.” 

Indeed, in certain circles it’s hard to visit a web page without seeing a reference to this new buzzword. We’ve blogged about big data several times: on the occasion of the DataEDGE conference, on big data’s relationship to civil rights, and with regard to the election. Professor Marti Hearst, also at the School of Information, recently even taught a course called “Analyzing Big Data With Twitter.” While the term is difficult to define, we have previously written that it “describe[s] info sets so large and so complex that available database management tools cannot handle them.”
 
So why does all this number crunching interest people in the humanities? Big data is just the domain of statisticians and the people interested in their analyses, like sociologists and marketers, right?
 
Not so fast. Enter bona fide humanists like Franco Moretti and his team at the Stanford Literature Lab. Moretti argues that “close reading,” the traditional bread and butter of literary study, makes it impossible to see the big picture. How useful is reading, say, 100 Victorian texts closely when there were thousands and thousands published during that era? Better to let a computer read them, he says, and then analyze the results. Moretti is talking about mining the big data of Victorian novels for trends in the same way that marketers look at sales data to figure out how often people who buy peanut butter also buy jelly.
 
And Moretti isn’t the only scholar trading the bound tome for the keyboard. A recent New York Times article outlines this new literary trend. At Harvard, Jean-Baptiste Michel and Erez Lieberman Aiden are poking through Google Books to trace word use over time. Based on how frequently references to Freud appear, Michel and Aiden determine that, as the Harvard Gazette says, the famous psychoanalyst “is more deeply ingrained in our collective subconscious than ‘Galileo,’ ‘Darwin,’ or ‘Einstein.’” I suppose the subconscious is Freud’s turf, after all.
 
Unsurprisingly, literary critics’ use of big data has drawn the ire of close readers, traditionally proponents of, shall we say, “small data.” A close reader might notice, for instance, that the term “big data” is silly, since data is a mass noun. It's like saying “big grass” if you’re talking about a lot of grass. Stanley Fish, a particularly celebrated close reader, takes issue with scholars like Moretti, arguing that the “data mining” approach to criticism takes away the ability of the critic to have a hypothesis.
 
Even in the realm of the “hard sciences,” researchers acknowledge the limitations of the computer when faced with complex literary texts. Padraig Mac Carron and Ralph Kenna, scientists at Coventry University, entered information on books like Beowulf and Les Misérables into computers by hand in order to study the social networks created in the texts. According to Inside Science, they found that entering the data in this way was “more effective” because a computer would have a hard time distinguishing between “friendly” and “unfriendly” relationships. While the machine can find a relationship, it takes a human reader to understand the meaning of it.
 
Fish, Mac Carron, and Kenna are pointing to a major criticism of “big data” methods in general: how can we determine the value of the trends the computer can reveal to us? Is it important that the Gothic writers use the word “the” more often than people from other periods, or is it just a meaningless quirk? Do more frequent references to Freud in the Google Books archives necessarily mean that he has a larger place in our culture than Darwin? As Nunberg concludes on NPR, even with these new analytic tools, people in all domains “will still feel the need to sort out the causes from the correlations—still asking the old question, what are patterns for?”
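To make the question about “the” concrete, here is a minimal sketch of the kind of count involved. The two snippets are invented stand-ins for corpora, and the function name `relative_frequency` is my own; none of this comes from the studies mentioned above.

```python
import re
from collections import Counter


def relative_frequency(text: str, word: str) -> float:
    """Return the share of tokens in `text` equal to `word` (case-insensitive)."""
    # Crude tokenizer: runs of letters and apostrophes. Real projects use better ones.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    return Counter(tokens)[word.lower()] / len(tokens)


# Hypothetical stand-ins for two corpora; actual studies draw on thousands of books.
gothic_sample = "The castle loomed over the moor, and the wind howled through the ruin."
modern_sample = "She checked her phone, sighed, and walked back to the office."

for label, sample in [("gothic", gothic_sample), ("modern", modern_sample)]:
    print(label, round(relative_frequency(sample, "the"), 3))
```

The computer happily reports two numbers. Whether the gap between them means anything is exactly the question that Fish and Nunberg are raising, and the code has nothing to say about it.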
 


The Text Encoding Initiative: Allowing Preservation and Access to our Textual Heritage through Digital Means

By developing and maintaining an encoding standard for the digitization of text, the Text Encoding Initiative (TEI) is helping to deliver on the internet's promise to democratize access to the world’s cultural (in this case, textual) heritage.

Viewed from this author’s lay-person’s perspective, libraries, museums, archives, and other custodians of the world’s textual heritage exist to serve at least two crucial functions when it comes to their rare collections: access and preservation. But there is a tension between the two, because institutions often must limit the access to their special collections in order to preserve them. Each time a researcher visits the Bancroft Library at the University of California, Berkeley, to view the Tebtunis Papyri collection, for example, the millennia-old papyrus pith is susceptible to damage. What’s more, the Folger Shakespeare Library in Washington, D.C., doesn’t want kids on field trips handling a 1604 edition of Hamlet. And no adolescent could match the perniciousness of well-informed thieves such as Farhad Hakimzadeh, the businessman and antiquarian who from 2003 to 2009 used a scalpel to extract pages from some 150 rare books at the British Library and Oxford's Bodleian Library (see this BBC article for more). But even in the face of these dangers, preservation must not occlude access. After all, one of the goals of the American Library Association is “to ensure access to information by all” (ALA “Mission and History”).

Digital technology promises to relieve this tension by granting access to a very wide audience with only a minimum of risk to the materials. So the Bancroft works with the Advanced Papyrological Information System (APIS) project to make their papyri available; the Folger Library makes its collections available and offers virtual tours through its website; and the Fihrist Islamic Manuscripts Catalogue Online offers access to materials (perhaps minus a few pages) from the British Library, the Bodleian, and several other UK institutions. All of these projects use TEI standards.

What is TEI? The TEI is an institution: a consortium of universities, libraries, archives, and others dedicated to the development of an encoding standard for the preservation, description, and publication of digital editions. As such, it holds an annual meeting, maintains discussions and working groups, and, most importantly, publishes and maintains guidelines for the encoding of text, also called TEI. Confused? Put another way, the TEI is both the name of an organization and the name of an encoding language produced by that organization. In the 1980s, incompatible systems for encoding and representing texts were multiplying, a situation which “was inhibiting the development of the full potential of computers to support humanistic inquiry” (TEI “Origins”). In other words, under such circumstances, computers could not increase access in the way that people had hoped. Also, this proliferation of standards aggravated the problem of preservation: what would happen when a standard became obsolete? Who would ensure the long-term preservation of these new documents in the years, decades, and even centuries to come? These important questions were not being answered. So, in 1987, an international group of researchers seeking to remedy these problems created the foundations of the Text Encoding Initiative. A first draft of its guidelines was published in 1990, and the TEI Consortium was formed in 1999 (ibid).

The point of a standardization like the TEI language is to allow for inter-operability among projects, and between projects and users, which should increase access to the encoded texts. And the TEI Consortium exists to make sure that the guidelines continue to be developed and maintained into the future, thus helping to preserve the new, digital editions of these important human creations. This important work will help the custodians of human culture better achieve the dual aims of preservation and access.
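For readers who have never seen the TEI language, a small sketch may help show what inter-operability looks like in practice. The document below is a deliberately tiny, invented example (real editions carry far richer headers), and the helper name `extract_lines` is my own. The element names and the namespace are the standard TEI ones, and the parsing uses only Python's standard library.

```python
import xml.etree.ElementTree as ET

# The TEI namespace; every TEI element lives inside it.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# A hypothetical, minimal TEI document invented for illustration.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Sample Edition</title></titleStmt>
      <publicationStmt><p>Unpublished illustration.</p></publicationStmt>
      <sourceDesc><p>Invented for this example.</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <lg type="couplet">
        <l n="1">First line of verse</l>
        <l n="2">Second line of verse</l>
      </lg>
    </body>
  </text>
</TEI>"""


def extract_lines(tei_xml: str) -> list[tuple[str, str]]:
    """Return (line number, text) pairs for every verse line <l> in the document."""
    root = ET.fromstring(tei_xml)
    return [
        (line.get("n", ""), (line.text or "").strip())
        for line in root.iterfind(".//tei:l", TEI_NS)
    ]


title = ET.fromstring(SAMPLE).find(".//tei:title", TEI_NS).text
print(title, extract_lines(SAMPLE))
```

Because every project marks a verse line with the same `<l>` element, the same few lines of code could, in principle, read verse from any edition that follows the guidelines. That is the practical meaning of a shared standard.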

Nevertheless, I would be remiss if I did not end this piece with a note of caution. As we continue through this transition to the digital, we must never accept the misguided notion that a digital edition is a perfect and adequate substitute for the material artifact it describes. It is not. The human and scholarly value of the physical carriers of our textual heritage cannot be overestimated. This is well known by those who work most closely with them, including the members of the TEI, but might too easily be forgotten (or ignored) by those who control their purses. It is important that those who provide the funding for the conservation of our textual heritage—governments, institutions, and individual donors—not be tricked by seeing Shakespeare on the web into thinking that the folios themselves are no longer necessary, lest these irreplaceable relics of the human past fade into "The Nothing" before our very eyes. The TEI exists to prevent such loss and should never be used as an excuse to be negligent in our responsibilities to preserve the material objects of the human past.

The Year in Digital Humanities: 2012

In spite of apocalyptic predictions, 2013 is upon us. In honor of the new year, I present the (second annual) Year in Digital Humanities: the most notable stories in technology from 2012 and what they mean for scholars in the Humanities.

1. Science captures the public’s imagination, with a little help.
 
Space missions have long been a subject of public consumption and entertainment, ever since the iconic footage of Neil Armstrong’s first steps on the moon. Armstrong sadly left us this year, but exploration of the solar system continues to grab the public imagination, as proven by NASA’s Curiosity rover. The car-sized rover touched down on Mars on August 6 and went on to accomplish a number of firsts on the Red Planet: the first streaming of a human voice from the surface of another planet, the first laser shot on Mars, its first Martian soil and rock samples, and even the discovery of an ancient stream bed. Oh, and it was the first foursquare check-in on another planet.
 
Back on Earth and among scientists, the biggest story of the year (of the decade? in decades?) was the possible discovery of the Higgs boson, that “missing” particle that sparked the massive project over at CERN. Few of us outside the Physics department have the training to really understand why this thing is so important, but even fewer can deny that it is. That’s because in addition to the discovery itself, the marketing campaign that has surrounded this elusive little thing, from artists’ representations to talk-show circuits to award-winning books and miniseries, has grown into an accomplishment all its own.
 
What it means for digital humanists: All this inspires a natural question: if physicists can advocate so well for their research programs while educating the public on the relevance of their vastly complex discipline, why can’t humanists do the same? I’m thinking something like “the God particle” but for Foucault. Take it as a challenge for 2013.
 
2. Social media showcases some unexpected winners and losers.
 
Winners:
 
Pinterest: After being named the Best New Startup of 2011, the site boomed in 2012, becoming the fourth-largest traffic driver in the world, referring more business in January than LinkedIn, YouTube, and Google+.
 
Instagram: In April the photo-sharing app was bought by Facebook for $1 billion. In August, Instagram topped Twitter for daily active users for the first time and reached 80 million users and counting.
 
Anonymous: After its largest attack yet in January against SOPA supporters, the Anonymous Group was named ‘Most Influential Person’ by Time magazine.
 
Barack Obama: President Obama scored two historically viral posts in 2012: His response to Clint Eastwood’s infamous “chair” speech (“This seat’s taken”) was the most RT’d tweet of the RNC, and his victory Facebook post (“Four more years”) was the most liked post ever with over 4 million likes.
 
 
 
Losers:
 
Google+: In February, the Wall Street Journal reported that the average person spent 3.3 minutes on G+ compared to 7.5 hours on Facebook per month. The People’s Republic of China took advantage of lax G+ censorship and began posting off-topic comments on Barack Obama’s official election campaign pages. In April, G+ shut down its photo editing site Picnik, and during the month of June, it was reported that 30% of users who make a public post on Google+ never make a second one. Ouch.
 
Apple: Apple suffered badly from the fiasco surrounding its Maps app, including egregious inaccuracies. Samsung’s Galaxy S3 rivaled the iPhone 5 in both hype and sales.
 
Call It Tied:
 
Facebook: While Facebook gained its 1 billionth member in September, its stock price fell to $21.83 a month later.
 
KONY 2012: When it was released in March, the video received more than 87 million online views. But after a wave of controversy, the follow-up video was released and received only half a million views on YouTube within a week. Neither reached the top spot in the contest for the most watched video in YouTube history: that honor goes to PSY’s “Gangnam Style,” with 1 billion views and counting.
 
What it all means for digital humanists: The already-dubious offline/online dichotomy became veritably irrelevant in 2012. Virtual worlds develop simultaneously with “real world” events, so much so that the analytic distinction between the two has become incoherent. Technology has become public while the public has become virtualized.
 
Also, Google and Apple are not indefatigable.
 
3. UC logo snafu proves that people really, really care about design.
 
Shortly after the University of California showcased its new logo, masses of outraged students and alumni rallied in protest. Nearly 55,000 people signed a petition saying “the logo was overly corporate, resembled, among other things, a fruit label and did not sufficiently reflect the university’s prestige.” The redesign was lambasted in comments and memes that circulated broadly on social media.

What it means for digital humanists: Besides a lesson in bad design, the logo snafu proved that when students mobilize via social media, the University must take notice. Unfortunately such mass mobilizations tend to occur over aesthetics more than politics:
 

"It's good that UC is listening to us," said Connor Landgraf, student body president at UC Berkeley. "Hopefully they'll start listening to students on other issues, as well, such as tuition increases."

 
See a story that I missed? Add it in the comments!

 

All Mimsy were the Borogoves: A Brief Introduction to the Unicode Standard

You can’t spell “digital humanities” without letters, and you can’t make letters appear on a computer screen without character encodings. The ubiquity of character encodings, and the enormity of the challenges involved in creating and standardizing them, are (happily) obscured by the fact that, when done well, they are not seen at all. It is when they break down, when a beloved paradox from an ancient text, 名可名非常名 (Laozi 1), turns into a string of unintelligible jabber (hÑOvqÝŸ/PÿQ¶úðͶŒ ¡éØc:¹að), that the issue demands your attention.
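The breakdown is easy to reproduce. The sketch below encodes the Laozi line with one standard (UTF-8) and then wrongly decodes it with another (Latin-1). These two encodings are my choice for illustration: the jabber they produce will not match the string quoted above, which presumably came from a different mismatch.

```python
original = "名可名非常名"

# Step 1: the text is stored as bytes using UTF-8 (three bytes per character here).
raw_bytes = original.encode("utf-8")

# Step 2: a program wrongly assumes the bytes are Latin-1, one byte per character.
garbled = raw_bytes.decode("latin-1")

# Step 3: the damage is reversible as long as we know which mistake was made.
recovered = garbled.encode("latin-1").decode("utf-8")

print(len(original), "characters became", len(garbled))
print(repr(garbled))  # repr() keeps invisible control characters visible
print(recovered == original)
```

The bytes never changed; only the assumption about how to read them did.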

The intelligibility of an electronic text rests on its encoding. Since Unicode's inception in 1987, the project has endeavored to create an encoding system capable of including all the world’s written languages, past and present, in a single, standardized format. Their mantra: “a unique number for every character, no matter what the platform, no matter what the program, no matter what the language" (http://www.unicode.org/standard/WhatIsUnicode.html).

Before Unicode, the standard was ASCII (The American Standard Code for Information Interchange), which was developed because computers also need standard character sets in order to use the same programs. In 1963, the ASCII character set was limited by hardware capabilities to 128 characters (2^7 or 7 bits), which were based on English (http://edition.cnn.com/TECH/computing/9907/06/1963.idg/index.html).

This standardization allowed for easier communication between computers, at least in English, but ASCII's 128-character set was too limited to encompass even French with its diacritics, not to mention Arabic, Braille, Sanskrit, or mathematical notation. As hardware limitations relaxed, some of these were accommodated by creating alternative sets of 128 additional characters (using an eighth bit to fill out a full byte, that is, 2^8 = 256 possibilities). But this led to a proliferation of separate, mutually unintelligible extended sets. And with a repertoire of more than 50,000 characters, the Chinese, Japanese, and Korean (CJK) group of written languages presented an encoding challenge orders of magnitude larger than the others. The eight bits of extended ASCII would not suffice.
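A short sketch shows both limits at once: what 7 bits can and cannot hold, and how the same eighth-bit byte means different things in different extended sets. The helper names `fits_in_ascii` and `decode_under` are mine, and the three code pages are simply examples that ship with Python.

```python
def fits_in_ascii(text: str) -> bool:
    """True if every character has a code below 2**7 = 128."""
    return all(ord(ch) < 2**7 for ch in text)


def decode_under(byte_value: int, codecs: list[str]) -> dict[str, str]:
    """Decode one byte under several 8-bit code pages and collect the results."""
    return {codec: bytes([byte_value]).decode(codec) for codec in codecs}


print(fits_in_ascii("digital humanities"))  # plain English letters fit
print(fits_in_ascii("café"))                # the accented letter does not

# One byte, three extended sets: Western European, Cyrillic, and the old IBM PC set.
for codec, char in decode_under(0xE9, ["latin-1", "cp1251", "cp437"]).items():
    print(codec, char)
```

A file written under one of these sets and opened under another is still perfectly readable to the machine. It simply says something else.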

Indeed, it was while developing a Japanese Kanji-enabled Macintosh computer in 1985 that Unicode President and co-founder Mark Davis first realized the need for a much larger, comprehensive encoding standard. In 1987, Davis met with researchers from Xerox who were doing work on multilingual character encoding. He joined with two of them, Joe Becker and Lee Collins, and together the three would begin the Unicode project (http://www.unicode.org/history/earlyyears.html). In 1991, the Unicode Consortium was officially incorporated (ibid), and in 1993 the Unicode standard replaced ASCII for the first time in an operating system, Windows NT version 3.1 (http://support.microsoft.com/kb/99884).

As is well known, advances in hardware have meant that the memory constraints that limited ASCII to 7 bits in 1963 are, thankfully, quite moot. Unicode was originally developed as a 16-bit standard, which allows for 65,536 unique code points (without the need for extension into other “planes”); today's UTF-16 keeps the 16-bit unit and reaches the other planes by pairing two units. The standard also includes a variable-length 8-bit encoding (UTF-8) and a fixed-length 32-bit encoding (UTF-32). Today, systems based on Windows NT (e.g. XP, Vista, Windows 7) and Mac OS X use the 16-bit standard (UTF-16), and many UNIX-based systems and a majority of websites use UTF-8 (http://trends.builtwith.com/encoding/UTF-8).
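The difference between the three encoding forms is easiest to see by counting bytes. In this sketch the function name `encoded_lengths` and the four sample characters are my own choices. The little-endian codec variants are used only so that Python does not prepend a byte-order mark to the count.

```python
ENCODINGS = ("utf-8", "utf-16-le", "utf-32-le")


def encoded_lengths(ch: str) -> dict[str, int]:
    """Return how many bytes one character occupies in each encoding form."""
    return {enc: len(ch.encode(enc)) for enc in ENCODINGS}


samples = {
    "Latin letter": "A",
    "accented letter": "é",
    "Chinese character": "名",
    "musical clef (outside the first 65,536)": "\U0001D11E",
}

for label, ch in samples.items():
    print(label, encoded_lengths(ch))
```

UTF-8 grows from one byte to four as characters get more exotic, UTF-16 uses two bytes until a character falls outside the first 65,536 code points, and UTF-32 spends four bytes on everything.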

Now, with hardware limitations no longer an issue, Unicode offers a practical and comprehensive character encoding standard.

“The majority of common-use characters fit into the first 64K code points, an area of the codespace that is called the basic multilingual plane, or BMP for short. There are sixteen other supplementary planes available for encoding other characters, with currently over 860,000 unused code points. More characters are under consideration for addition to future versions of the standard” (http://www.unicode.org/standard/principles.html).

In other words, there is plenty of space in the Unicode standard to handle all of the world’s written languages.
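The arithmetic behind the planes can be checked in a few lines. This is only a sketch: the names `PLANE_SIZE`, `TOTAL_PLANES`, and `plane_of` are mine, while `sys.maxunicode` is Python's own record of the highest code point it supports.

```python
import sys

PLANE_SIZE = 0x10000  # 65,536 code points per plane ("64K")

# Python exposes the highest valid code point; adding one gives the size of the codespace.
TOTAL_CODE_POINTS = sys.maxunicode + 1
TOTAL_PLANES = TOTAL_CODE_POINTS // PLANE_SIZE  # the BMP plus sixteen supplementary planes


def plane_of(ch: str) -> int:
    """Return the plane number of a character (0 is the basic multilingual plane)."""
    return ord(ch) // PLANE_SIZE


print(TOTAL_CODE_POINTS, "code points in", TOTAL_PLANES, "planes")
print(plane_of("名"), plane_of("\U0001D11E"))
```

The Laozi character sits in plane 0 with most everyday writing, while the musical clef lives in the first supplementary plane.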

The Unicode Consortium has brought all the major languages written today, and many less-common and ancient ones, into a single standard, thus allowing humanities researchers in many fields and areas of study to read electronic texts as though they weren't just strings of ones and zeros, blissfully unaware of the jabberwocky behind the screen.

 

Black Studies and Digital Humanities: Perils and Promise

“When you hear the terms ‘new technology’, ‘digital humanities’, and ‘black studies,’” says Duke Professor Mark Anthony Neal on his web series Left of Black, “it’s almost as if it’s an oxymoron.”

Neal is Professor of Black Popular Culture in the Department of African and African-American Studies at Duke University and the host of a weekly webcast, Left of Black. On the premiere of its third season, Left of Black tackled the uncertain relationship between Black Studies and digital humanities – a relationship that is at once precarious and filled with potential. The conversation was enriched by the contributions of two young scholars who are charting new terrain in the field of Black Studies and digital humanities: Howard Rambsy II, Associate Professor of English Language and Literature and Director of the Black Studies Program at Southern Illinois University at Edwardsville, and Jessica Marie Johnson, a Post-Doctoral Fellow in the Richards Civil War Era Center and Africana Research Center at Penn State University.

According to these experts, digital humanities present Black Studies scholars – as well as other scholars studying the experiences of people of color – with a double bind. On the one hand, some Black Studies departments have been reluctant to embrace the possibilities of emerging technologies, new media, and digital platforms in their work. University departments everywhere tend towards a kind of institutional inertia that, ironically, encourages fierce competition for the cutting edge in research while maintaining and defending “traditional” structures like tenure, disciplinary boundaries, and the traditional academic publication system. As Neal puts it, “on the one hand, if we look at Black Studies proper, the fact is it’s a field that continues to be driven by older scholars [who are] still very much tied to a 1960s style Black Studies model.” In a demonstrative moment of generational critique 2.0, Neal diagnoses the problem: “These folks are still on the listserve.”

On the other hand, universities are often slow to recognize the ways in which “race” factors in the digital humanities, and they ignore or sequester the distinctive digital humanities projects created by Black Studies scholars.

“When all these deans and provosts are looking around for folks who are going to be doing the cutting edge work [in digital humanities], the last folks they look for are black folks. Black folks don’t do technology, right?” Neal explains.

We have to move “beyond normative ideas of who is a digital humanities scholar,” says Johnson, “which has been imagined as a white, male academic. And it’s really not. There are ways that we as people of color and people of all kinds of identities are very engaged with what’s happening in technology right now, but are not necessarily having that conversation in the digital humanities space.”

Not only must Black Studies scholars working in digital humanities confront the stereotype that African-Americans are uninterested or unskilled in technology, they must also continue the struggle that Black Studies scholars have felt for decades: educating the public on the relevance, importance, and centrality of Black Studies in the larger liberal arts paradigm. (And if there are still doubts that this struggle continues, this recent Chronicle of Higher Education controversy should lay those doubts to rest.)

“Even when you have folks who do [digital humanities], they’re never included in those conversations, because – let me just be frank – it’s just ‘black shit,’” Neal says. “It’s never integrated into what the folks do in the mainstream University, even as the mainstream university is going forward with this idea of digital humanities.”

Part of the solution, according to Johnson, is to highlight, promote, and celebrate the contributions to digital humanities research by Black Studies scholars and people of color. Projects such as the Black Gotham Archive and Diaspora Hypertext are fine examples. In the realm of pedagogy, Black Girls Code confronts the dearth of African-American women in science, technology, and math while transforming how we think about “access” to new technologies. In the realm of cultural production, The Mis-Adventures of Awkward Black Girl is transforming not only “Black television” but the entire universe of television in a digital age.

According to Neal, digital humanists in Black Studies are “educating these institutions that, not only are we using this technology in very interesting and productive ways, but… we’re actually using it in cutting edge ways that has a great deal to teach the university about how to do digital humanities.”

That being said, it is not the sole responsibility of Black Studies scholars to demonstrate the relevance of race to digital humanities. We must all do our part. Indeed, to the extent that digital humanities research tells us something about how society interfaces with technology, an appreciation of racial dynamics in digital spaces is fundamental to our understanding of digital humanities. 

As Johnson explains, the same structures we find in the print world get replicated in the digital realm. Even though most people recognize the potential to create Internet spaces that are more egalitarian, open, and diverse, the reality is that structures of race, gender, sexuality and class operate there, too.  

“This is where black studies can enter the digital humanities realm in a critical way,” says Johnson. Black Studies is particularly qualified to make scholarly interventions into this area by virtue of its historically strong critical apparatus vis-à-vis these structures. But all of us interested in digital humanities should be charged with recognizing and appreciating the race, gender, class, and other power structures that shape these tools and spaces. Just as the “archive” is now recognized to be problematic with regard to race and gender, so should we be critical in our usage of technology, especially as the Internet becomes more and more commercialized.

This is not to say that digital tools and spaces are useless to Black Studies scholars and those working to interrogate power structures in our society. Indeed, all the contributors to the discussion on Left of Black acknowledge the potential of the Internet and new media for new knowledge production and critique.  

The challenge Black Studies scholars face, according to Rambsy, is the same one facing so many in the digital world: how to produce quality content that is centralized enough to provide a cumulative critical apparatus, as opposed to a flurry of unorganized memes. As Neal puts it: “The irony is that there was a real critical apparatus 50 years ago when folks didn’t have access to publishing houses the way we do, no internet. Now we have access, we can publish anything we want at any time, but yet we find ourselves hamstrung to actually speak back critically at what we’re producing.”

Considering the impressive history of prolific publishing and content creation by their forebears in the analog world, digital humanists in Black Studies have big shoes to fill. As they continue to innovate in digital humanities, scholars outside of Black Studies will have much to learn.

 

Dan Cohen Brings Tech Edge to Berkeley Humanities

We’re now entering the third decade of the Web. And yet, according to historian and digital humanist Dan Cohen, many scholars still don’t grasp the full potential of digital tools. Unfortunately, humanists and interpretive social scientists are the furthest behind. These scholars may see digital technologies as a way to crunch massive amounts of text in a form of “distant reading,” as a distribution warehouse for electronic articles, or as a new-age mailbox. But too few are engaging the Web and information technologies as a platform that fosters new ways of thinking, teaching, and building knowledge.

 
Perhaps no other scholar in the humanities or social sciences is doing more to rectify this situation than Professor Dan Cohen. An internationally recognized leader in digital humanities, he has done a tremendous amount to build the infrastructure of new technology for the humanities and in history in particular. And luckily for us, he is bringing his digital crusade here to Berkeley as an Avenali Residential Fellow in the Department of History during the spring 2013 semester.
 
A bridge between the two worlds of academia and tech, Mr. Cohen is a testament to the belief that programmers and humanists can talk to each other, can reinforce one another’s projects, and can work together to build a better university. Academically, Mr. Cohen is an expert in nineteenth-century religious and scientific thought. But he is perhaps best known for his feats in digital humanities as a leader in software development, in digital pedagogy, and in thinking programmatically about the impact of new technology on research in the humanities and social sciences.
 
Heralded as one of the 12 tech innovators who are transforming campuses, Cohen started developing skills in digital humanities while working with digital-history pioneer Roy Rosenzweig on a project aimed at historically documenting digital-born objects like giant databases. After 9/11, Mr. Cohen and his colleagues were the first to undertake a major documentation project of the attacks, leading to the September 11 Digital Archive, a topic we have covered here at the THL Lab Blog. In an unprecedented step, the Library of Congress now archives the collection as its first major digital acquisition.
 
Mr. Cohen deserves most of the credit for transforming George Mason’s Center for History and New Media from a skeletal program to an internationally respected leader in the digital humanities field, one that manages over 100 Web projects and reaches 16 million people. It is there that some of Mr. Cohen’s most influential projects came to the fore.
 
Funded by a major Mellon Foundation grant, Mr. Cohen developed Zotero, the principal open-access bibliographic and source management system and the prized tool of students everywhere preparing for preliminary examinations. Mr. Cohen has also founded The Humanities and Technology (THAT) Camp, an innovative gathering that stands at the forefront of digital humanities innovation. His books on digital history and on the academic use of blogs – approaches to the vernacular web as he calls it – are regarded as leading manifestos for a new era of research communication and controversial challenges to traditional scholarly communication.
 
Mr. Cohen himself is no stranger to tradition. Trained at Princeton, Harvard, and Yale, he is in a good position to speak on the inertia that keeps traditional forms of scholarly communication dominant long after their usefulness and potency have waned. And he bucks the “traditional” scholarly life by keeping a prolific online presence, with a fascinating digital humanities blog and more than 7,000 Twitter followers.
 
Mr. Cohen’s visit to UC Berkeley will build on several smaller-scale but successful projects that our University has undertaken in digital humanities. In the Department of History, where Mr. Cohen will be hosted, graduate students have organized a series of lectures and workshops on digital tools and materials that strengthened the research and teaching skills of participants. The Townsend Humanities Lab and the Center for New Media have also been leading the development of digital humanities on campus. Mr. Cohen’s residency, which will involve several lectures, seminars, and workshops across campus, is an exciting contribution to these projects.
 
To learn more about the Avenali Residential Fellowship, see the Townsend Center’s website.  

In Defense of Browsing: Digital Humanities and the Upshot of Screwing Around

A study from the Pew Internet & American Life Project on Friday reported that 53% of those 18-29 years old go online "for no particular reason except to have fun or to pass the time." Furthermore, the study highlighted a significant generation gap. Just 12% of those 65 or older said they go online for no particular reason.

The randomness that guides the activity of browsing is not confined to the web; it also seems to be shaping digital humanities. In a recent column Stanley Fish portrays browsing as the dominant method (or anti-method) in digital humanities: randomly traversing huge amounts of data with the hope of stumbling upon some surprising statistical pattern of sameness or difference undetectable by the eye of the human reader. In contrast to the “conventional” humanities project in which the scholar approaches a close reading of a text with some hypothesis in mind, the computer-aided process of “text mining” is neither interpretively directed nor theoretically guided.

The process might look something like this: Let’s see how many times the word "orange" is used versus "clementine" in this 10-million-book corpus. If a significant pattern arises, a hypothesis follows. In other words, the practice is dictated by the tool; millions of texts are analyzed together in a panoramic-like practice of “distant reading.”
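For readers curious what such a count looks like in practice, here is a minimal Python sketch. It is only an illustration under stated assumptions: the function name `count_terms` and the two-sentence `corpus` are hypothetical stand-ins (nothing here comes from Fish's column or from any real text-mining tool), and a real 10-million-book corpus would need far more careful tokenizing and storage.

```python
import re
from collections import Counter


def count_terms(texts, terms):
    """Count case-insensitive whole-word occurrences of each term across texts.

    Hypothetical helper for illustration only.
    """
    wanted = {t.lower() for t in terms}
    counts = Counter({t: 0 for t in wanted})
    for text in texts:
        # Naive tokenizer: lowercase, then keep runs of letters and apostrophes.
        for token in re.findall(r"[a-z']+", text.lower()):
            if token in wanted:
                counts[token] += 1
    return counts


# A toy stand-in for the corpus (assumption: made-up sentences).
corpus = [
    "An orange is not a clementine, though a clementine is an orange of sorts.",
    "She peeled the orange slowly.",
]

counts = count_terms(corpus, ["orange", "clementine"])
print(counts["orange"], counts["clementine"])  # prints: 3 2
```

Note that nothing in the code proposes a hypothesis; it only tallies. Any interpretation of the gap between the two numbers comes afterward, which is exactly the order of operations Fish objects to.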

It should be clear by now that if digital humanities are coterminous with this “text-mining” or “distant reading” orgy of randomness and technological fetishism, then neither Stanley Fish nor many other established humanists look upon it as a particularly positive development in their discipline. Fish concludes, “whatever vision of the digital humanities is proclaimed, it will have little place for the likes of me and for the kind of criticism I practice… a criticism that insists on the distinction between the true and the false, between what is relevant and what is noise, between what is serious and what is mere play."

But the phrase “mere play” begs the question. There exists a significant group of people for whom play is very serious indeed, namely experts in child development and psychology who contribute to the growing consensus that “idle, creative, unstructured free play… is a central part of neurological growth and development.”

Play has also enjoyed a value boost in the world of adults in recent decades, and humanist scholars in particular have touted its productive potential. From Jacques Derrida’s “free play” and Harold Bloom’s “productive misreading” to Barthes’s discussion of “the pleasure of the text”, play is not just a pathway for distraction and procrastination; it is vital for creativity and critique.

Play can take a variety of productive forms, but perhaps the most important – at least for the current debate around digital humanities – is the play involved in the process of “screwing around”. Stephen Ramsay describes this using a library analogy: One can enter a library in order to conduct a specific search about a particular topic. Alternatively, one can go into a library and “wander around in a state of insouciant boredom”. This is called browsing, and it’s a completely different activity. Unlike searching, when I am browsing I don’t know what’s here, and I don’t know what I’m looking for.

So perhaps the appropriate dichotomy in the debate over digital humanities is not between close reading and far reading, or interpretation and text-mining, but rather between searching and browsing. Searching is approaching the text “armed with a hypothesis”. Browsing uses “a machine that is ready to recognize the text in a thousand different ways instantly”. As Fish puts it:

Each reorganization (sometimes called a “deformation”) creates a new text that can be reorganized in turn and each new text raises new questions that can be pursued to the point where still newer questions emerge. The point is not to get to a place you had in mind and then stop; the point is to keep on going, as, aided by the data-generating machine, you notice this and then notice that which suggests something else and so on, ad infinitum.

Are we ready, Ramsay asks, “to accept surfing and stumbling — screwing around, broadly understood — as a research methodology?”

Criticisms of the browsing method are easy to find. For instance, some denounce this approach by arguing that important treasures cannot be found without a map of sorts; randomness will lead to bunk. But this seems fundamentally counterintuitive to anybody who has practiced the art of browsing on the Web, or even in a book or record store. I dare say I have discovered many of my favorite musical artists, my most useful skills, and my most passionate interests in the course of “screwing around”. I have also consumed a wealth of fascinating, if not entirely “practical” knowledge; this afternoon included the wondrous features of the cuttlefish. The beauty of “the accidental finding” is something that humanist scholars should embrace as they would the value of interdisciplinarity, participant observation, and the openness that so often results in creative thinking.

Others argue that browsing – and particularly web browsing – is mired in anarchical content structures that prevent the development of community or cumulative knowledge. This criticism, however, seems to ignore the remarkable order that exists within the so-called anarchy of the web and digital technologies.

Yes, “screwing around” on the internet does not follow as narrow a guide as a peer-reviewed journal or the MLA bibliography, but chaos it is not. Indeed there is a sophisticated and complex order to web interactions that produces trends, vocabularies, and communities, even though these may be unpredictable. It is worth remembering that today, the dominant format of the Web is not a random assortment of millions of static “Pages” but a system of aggregate, browse-amenable forums: Reddit, Tumblr, Facebook, YouTube, and others. As Ramsay puts it: “These sites are at once the product of screwing around and the social network that invariably results when people screw with each other.”

Finally, the most pervasive criticism does not concern the form of textual interaction that browsing represents but the “texts” themselves that browsing tends to encounter and encourage. Columnist Mitch Albom, in a somewhat curmudgeonly Sunday column reacting to the Pew poll, writes that the kind of aimless activity that young people perform when online has led to the proliferation of mindless content: "When you're not looking for anything special, the un-special will do just fine," he writes. But I wonder exactly what content Albom is referring to here: is it the straw men of banal haranguing or scatological humor that have populated every media form since stone carvings? Or is it the specifically internet-esque system of content creation, interactions and iterations – think the recent “Shit Girls Say” video and its derivatives, or the pepper spray meme – that seems particularly “mindless” to Albom? Is “mindless” without thought or consideration, or is it simply without restraint?

Indeed, it is difficult as a young person to read these now-cliché denouncements of web browsing without feeling put down or alienated at the thinly veiled ageism that underlies such critiques. Take these lines for instance:

I have long since believed that going into cyberspace is a mission young people take not to actually land on a planet, but to cruise around the stars until the ship runs out of gas.

They could, of course, discover a previously-unknown galaxy of wonder that could change our world forever.

Unfortunately, the rise of the semantic and integrated web may signal the decline of browsing as we know it. A vital part of “screwing around” is the very possibility of stumbling upon a piece of content that is outside one’s informational “normal”. In other words, a crucial element of web browsing is the potential for exposure to weird content: content you did not search for, content you could not have imagined existed, content that is perhaps five or six degrees of “link” separation away from a first-order Facebook or Google Reader browse. But with the rise of user-specific content display – for instance Google’s new plan to track users across services – we may soon find ourselves surrounded by the ordinary. Content will now be tailored to us (or more specifically to who the machine thinks we are), eliminating the random potential that has made the web scary, and amazing.

Scholars such as Stanley Fish would do well to appreciate how much browsing, randomness, and “screwing around” have benefited their own work and that of others. After all, how did I stumble upon his column – which then became inspiration for this blog post? Well, I was screwing around on the internet.

The Year in Digital Humanities

2011 was a year of enormous change in computing, technology, and life online. In honor of the New Year, I present the most notable stories in technology from 2011 (and what they mean for scholars in the Humanities). See you in 2012!
 
1. Berkeley Students, Staff, and Faculty Get Free Stuff
 
In 2011, Operational Excellence sponsored a Productivity Suite project that included campus-wide distributions of Adobe software, Microsoft Office, and (soon) Google Email and Calendar Solutions.
 
What This Means for Digital Humanists: If you’re part of the Berkeley community, you just received thousands of dollars worth of software for free. The lesson here is to never, ever ignore emails from Operational Excellence.
 
2. Political Discourse Gets Meme-d
 
Memes are ideas on the internet that go viral. The editors of Know Your Meme, a Web site dedicated to tracking this sort of thing, singled out two of their favorites: Rebecca Black’s so-bad-it’s-kinda-good YouTube video “Friday,” and the remixed images of a police officer spraying UC protestors. Never to be outshone, Facebook has its own catalogue of what people were talking about in 2011. The Death of Osama bin Laden was number one, followed by Packers win the Super Bowl. An eclectic mix, indeed.
 
What It Means for Digital Humanists: Memes are one of the most important facets of internet archiving. Plus our very own director Celeste Langan starred in one particularly important meme – demonstrating (very close to home) what happens when videos go viral.
 
3. Intellectual Property Trumps Civil Liberties
 
In terms of online freedoms, SOPA made headlines, but 2011 saw legislators, Democrats and Republicans alike, turn a blind eye to important civil liberties issues, including Patriot Act reform, and instead pay heed to the content industry’s desires to stop piracy.
 
What It Means for Digital Humanists: To the extent that digital humanists care about privacy and access to information, this is a pretty big deal.
 
4. Secret Tech Suddenly Not So Secret Anymore
 
We all know we wouldn’t have the internet if it wasn’t for the military (and the pornography industry), but 2011 demonstrated quite viscerally the link between technology and warfare.
 
Israel used a computer worm to sabotage Iran’s nuclear development. Drones were used extensively in various combat missions (including the death of Osama bin Laden). The hacker group Anonymous continued its online mayhem, targeting everyone from the Tunisian government to BART personnel.
 
Even the US government is not safe from cyber warfare. In the past two months alone we've seen attacks against the US Chamber of Commerce, the Landsat-7 and Terra-AM1 satellites, various chemical companies, and more. A report called Foreign Spies Stealing US Economic Secrets in Cyberspace published by the Office of the National Counterintelligence Executive this October accused Chinese hackers of being "the world's most active and persistent perpetrators of economic espionage."
 
What It Means for Digital Humanists: Expect more delays on BART.
 
5. The Rise of the Tablet
 
The trickle of tablets began in 2010 with the iPad and the Galaxy Tab, but in 2011 the floodgates opened. Get yours for as low as $99.
 
What It Means for Digital Humanists: A book I downloaded on Amazon says the rise of tablets signals one more nail in the coffin for real live book stores. By the way, what’s a book store?
 
6. It's All Mobile
 
The mobile web is growing faster than the desktop one, and becoming the focus of technology innovation. Taken together, 2010 and 2011 have been the years of fastest growth ever in the smartphone market, at 72% and 50% respectively.
 
What It Means for Digital Humanists: The growth of the mobile market goes hand-in-hand with more and better apps, higher quality cell phone cameras, and internet connectivity for millions of people around the world. If anything, fieldwork is not what it used to be.
 
7. The Death of Flash and the Rise of HTML5
 
If you have an iPad, you’re probably wondering what Flash is. Apple doesn’t support Flash, nor does BlackBerry or Windows Phone 7. Flash was necessary once, but that was then, and HTML5 is now – or at least will be in 2012. Even Adobe has given in, saying, "HTML5 is now universally supported on major mobile devices, in some cases exclusively. This makes HTML5 the best solution for creating and deploying content in the browser across mobile platforms. We are excited about this." Can't you just hear the joy in their voice?
 
What It Means for Digital Humanists: Pretty soon, you will no longer have to deal with that annoying, broken lego icon. In the long run, the rise of HTML5 may have profound implications for how we interact with online spaces, including the development of Web 3.0.
 
8. The Death of Steve Jobs
 
Love him or hate him, Steve Jobs had an enormous impact on our lives. He will be missed. Before he left us, Jobs made one last appearance on the Apple keynote stage to introduce iCloud, which he called the culmination of a ten-year journey to "get rid of the file system."
 
What It Means for Digital Humanists: Someday, you will have a clean desktop (both wood and digital).