A couple of times in this blog, I've mentioned big history, an ambitious attempt to narrate history from the Big Bang to the present. Like microhistory, the Annales school, environmental history, and a few other thematic approaches to the discipline, big history teaches us that we can learn something different by judiciously manipulating the scale of our inquiry.
By providing us with access to completely new kinds of sources, digital history opens up some additional possibilities for manipulating scale. Consider, for example, the cached data provided by Google and other search engines. When you do a search, you have the option of following the provided link or of seeing what the page looked like when Google's spiders last visited. The date and time that the copy was cached are also provided, and it is straightforward to write a program that retrieves the current page and the cached copy and compares them to see what has changed. As a test, I did a Google search for "digital history" on 2 Jan 2008 at 15:05 GMT and recorded when the cached copy of each page on the first page of hits had been created. Sorted by the age of the cached copy at the time of the search, the results were: 3 days 8 hours 14 minutes, 3d 10h 24m, 3d 15h 14m, 3d 15h 36m, 4d 14h 50m, 5d 3h 12m, 5d 5h 20m, 5d 8h 24m, and 257d 15h 52m.
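To make the comparison concrete, here is a minimal sketch in Python of the two calculations involved: the age of a cached copy, and the lines that differ between the cached and current versions of a page. It is an illustration under stated assumptions, not the program I used. The function names cache_age and changed_lines are made up for this example, the cached timestamp is back-computed from the first age listed above, the two page texts are placeholders, and the step of actually fetching the pages is left out entirely.

```python
from datetime import datetime, timezone
import difflib


def cache_age(cached_at, searched_at):
    """Return the age of a cached copy as (days, hours, minutes)."""
    delta = searched_at - cached_at
    total_minutes = int(delta.total_seconds() // 60)
    days, remainder = divmod(total_minutes, 24 * 60)
    hours, minutes = divmod(remainder, 60)
    return days, hours, minutes


def changed_lines(cached_text, current_text):
    """Return lines removed from ("- ") or added to ("+ ") the cached copy."""
    diff = difflib.ndiff(cached_text.splitlines(), current_text.splitlines())
    return [line for line in diff if line.startswith(("- ", "+ "))]


# Time of the search described above.
searched_at = datetime(2008, 1, 2, 15, 5, tzinfo=timezone.utc)

# Assumption: a cache timestamp back-computed from the first reported age.
cached_at = datetime(2007, 12, 30, 6, 51, tzinfo=timezone.utc)

print(cache_age(cached_at, searched_at))  # (3, 8, 14)

# Placeholder page texts; a real program would retrieve both versions.
cached_page = "Welcome\nLast updated in December\nContact us"
current_page = "Welcome\nHappy new year from all of us\nContact us"

for line in changed_lines(cached_page, current_page):
    print(line)
```

Working with timezone-aware timestamps keeps the subtraction honest when the cache time and the search time are reported in different zones, and a line-level diff is only the crudest way of seeing "what has changed"; a real comparison would probably want to strip markup first.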
Now suppose you wanted to write the history of a very brief interval, say a few hours, minutes or even seconds. In the past, this kind of history--I'm not sure what to call it--would only have been possible for an event like 9/11, the JFK assassination or D-Day. But with access to Google's cache data and some sophisticated data mining tools, it becomes possible to imagine creating rich snapshots of web activity over very short intervals. And to the extent that web activity tracks real world activity and can be used to make inferences about it, it becomes possible to imagine writing the history of one second on earth, or one millisecond, or one microsecond.