Monday, July 24, 2006

Working in the Browser

Working with digital sources means that historians will spend more time working with computers. Most already use e-mail and word processing, search library catalogs and archival finding aids, and read online journal articles and book reviews. Digital history puts primary and secondary sources onto the computer, too, and provides new opportunities for historical research and writing: blogging, digitization, web and wiki authoring, programming, and so on. With the shift toward Web 2.0, more and more of these activities can be done in the browser. You can work anywhere that you have a computer and an internet connection, and all of your notes and files can be stored on a server somewhere else. (For an exploration of some of the possibilities, see James Fallows' article "Homo Conexus" in the new issue of Tech Review.)

Until relatively recently, however, writing and citing have been done from within standalone applications like Microsoft Word and EndNote. In order to work on your book or article, you've had to have the applications installed on the computer that you're using. If you want to work on another machine, you need the programs installed on that machine, too. Depending on the software licenses involved, you might need to buy new copies of application software for every machine you want to work on. And, of course, you have to make sure that your files are synchronized across all of your machines, or, at the very least, that you are editing the latest version of each.

Researchers at the Center for History and New Media at George Mason University are working to change this paradigm by creating a tool they call Firefox Scholar. Built on top of the popular browser Firefox, the Scholar extensions "will allow researchers to recognize and capture metadata from online objects; collect documents, images, and citations from the web; and allow those materials to be sorted, annotated, and searched--all directly within their web browser window." That's pretty sweet. Best of all, it will be possible for scholar-hackers to extend the system themselves, and build new functionality into it. As Dan Cohen noted a few days ago in his blog, the beta version will be coming soon, and CHNM is still looking for people to test it.

Now that I have a working server set up, I've decided to take a page from Firefox Scholar and try building some data mining and visualization tools into the browser as extensions. My hope is that some of these may eventually find their way into Scholar, but in the meantime I'll learn some new techniques and, hopefully, create some tools I can use in my own research.

Putting work into the browser makes things easier for the user at the cost of increasing complexity for the programmer(s). Here's an overview of some of the components we will need to work with:
  • Firefox, an extensible browser whose extensions are written in JavaScript and XUL
  • Making things look good on the client side will require CSS and AJAX
  • Making things work on the server side will require a database (MySQL), a scripting language (PHP), a search engine toolkit (Plucene) and maybe some other goodies (like Cake)
  • We'll also need custom-made or off-the-shelf mining and visualization components

Monday, July 17, 2006

Frobisher: Search for Early Canadian History

My Digital History at Western server is now up, so I can host research tools. The first is an alpha version of Frobisher, a prototype that mines information from the online Dictionary of Canadian Biography to refine Google searches. The user starts by entering the name of a person. If there are multiple possibilities (e.g., MacDonald), he or she is asked to clarify which person is of interest. Since the coverage of the DCB is limited to people who died before 1930, the system performs best with pre-20th-century queries. This is very much still research software: it does quite well on some things (e.g., distinguishing Sir Alexander Mackenzie from Alexander Mackenzie) and not so well on others (e.g., turning up useful material about William Connolly). There is more information about the design of Frobisher in a series of posts on this blog. I welcome any and all feedback at wturkel@uwo.ca.
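
As a rough sketch of the disambiguation step described above (not Frobisher's actual code), here is how a name lookup might either ask the user to clarify or go straight to a refined search string. Everything in it is an assumption made for illustration: the Entry class, the find_candidates and refined_query functions, and the small hand-typed ENTRIES list standing in for information that the real system mines from the DCB.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One biography record (hypothetical structure, not the DCB's own)."""
    name: str
    years: str
    keywords: tuple


# Hand-typed stand-in for mined biography data (illustrative only).
ENTRIES = [
    Entry("Sir Alexander Mackenzie", "1764-1820", ("explorer", "fur trade")),
    Entry("Alexander Mackenzie", "1822-1892", ("prime minister",)),
]


def find_candidates(query, entries=ENTRIES):
    """Return every entry whose name contains all of the words in the query."""
    words = query.lower().split()
    return [
        e for e in entries
        if all(w in e.name.lower().split() for w in words)
    ]


def refined_query(entry):
    """Build a search string from the quoted name plus distinguishing keywords."""
    terms = ['"%s"' % entry.name]
    for keyword in entry.keywords:
        # Quote multi-word keywords so they are searched as phrases.
        terms.append('"%s"' % keyword if " " in keyword else keyword)
    return " ".join(terms)


def search_for(query):
    """Return a refined query if the name is unambiguous, else the candidates."""
    candidates = find_candidates(query)
    if len(candidates) == 1:
        return refined_query(candidates[0])
    # Zero or several matches: hand the list back so the user can choose.
    return candidates
```

With this toy data, search_for("Alexander Mackenzie") returns both entries, so the user would be asked which one is meant, while search_for("Sir Alexander Mackenzie") matches a single entry and returns a search string with the extra terms attached.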


Sunday, July 16, 2006

A Graduate Course in Digital History

One of my projects this summer is to put together a new graduate course on digital history (Hist513F), primarily for our public history students, but open to others. My goal is to provide a course that reflects the kind of work that I've been doing in this blog, while recognizing the fact that many of the students in the course may have never programmed before. Following the O'Reilly hacks model, I've come up with three categories of lab exercises: easy (no programming), medium (a bit of programming) and hard (a lot of programming). My hope is that students will be able to find something fun and challenging to do, whatever their level of expertise.

About a third of the seminar readings are drawn from multiple versions of GMU's "Clio Wired" course. The rest come from the blogosphere, online journals, and other sources. So far, I haven't made use of any gated resources, so the course should be accessible to anyone who wants to make use of it. If you have any suggestions or ideas, I'd be happy to hear from you at wturkel@uwo.ca.
