Tuesday, January 01, 2008

The Search Comes First

In December, I had a chance to visit humanists at a couple of universities in the Boston area and talk about digital history. One kind of question that came up repeatedly was foundational: What do humanists really need to know in order to be more effective online researchers? What should they learn first? What constitutes a baseline literacy? How can digital humanists be introduced into existing departments, or the techniques of digital humanities be added to existing curricula?

Since I started this blog two years ago and began teaching digital history classes, I've had the chance to revisit these questions a number of times. My original answer was that it all begins with search, and I think that still holds. For me, the essence of digital history is the shift to what Roy Rosenzweig called a "culture of abundance." The internet is unimaginably large and growing exponentially. Individual researchers, on the other hand, have a sharply bounded capacity to absorb or make sense of new material.

I think that a lot of historians are resistant to the idea of processing documents computationally, because they think of it as a challenge to, or substitute for, reading. Instead, computation should be seen as a way to augment human abilities. We still need human beings to read and interpret sources, and we must still train our students in traditional philological techniques. There's no getting around the fact, however, that the way that we find sources has drastically changed in the last ten or fifteen years.

According to Search Engine Watch, as of this past summer search engines worldwide were handling about 61 billion searches per month. More than half of these were handled by Google, making its ranking algorithms the most pervasive source of bias in the history of research. It's clear that humanists need to understand how search engines work, and need to be able to parameterize their searches to get the best results. Your ability to do a virtuoso close reading is irrelevant if you can't find the sources to read in the first place. Humanists who wish to place their own material online also need to understand search engine technology, because it is the deciding factor in whether a work can be found, read and cited.

In my conversations last month, the follow-up question was usually whether or not historians and other humanists will need to be able to program computers. I'm not sure about the general answer to that, but I'm certain that some of them will. The discipline of history is in for some interesting times, as interpretations backed by intensive research in a few archives will be confronted with those backed by machine learning or text mining of massive datasets. My hope is that we'll find a rapprochement... but then I'm an optimist.
