I’ve been reading a series of posts by David Rosenthal on his blog analyzing the issue of format obsolescence. Traditionally, and at least in my education, format obsolescence has been treated as one of the great bugaboos of digital preservation. In response, a number of tools and resources have been developed focusing on format identification and validation (DROID, JHOVE, FITS, PRONOM and the upcoming UDFR, to name a few prominent ones). Looking at the preservation landscape, it’s clear that format sustainability has been at the forefront of the collective effort.
Rosenthal, however, makes a convincing argument that this emphasis is misguided and is not providing the best ROI for the digital preservation community. I won’t repeat his arguments, except to say that he places the format obsolescence issue in historical context, suggesting that much has changed in computing since it first became a concern, and points to other, much-overlooked areas (bit fixity, storage costs and hardware quality) that are shaping up to be real problems. Here’s a starter list of his posts:
- Format Obsolescence: Scenarios – April 27, 2007
- Format Obsolescence: The Prostate Cancer of Preservation – May 7, 2007
- Format Obsolescence: Right Here Right Now? – January 3, 2008
- Are Format Specifications Important for Preservation? – January 4, 2009
- Postel’s Law – January 15, 2009
That should get you started, although there are many, many more posts on the subject. Given those dates, I’m pretty late to the party, but I feel this is required reading for digital preservationists, whether or not you agree.
After a few reads you may be running for the nearest self-healing, mirrored ZFS volume, waking up in cold sweats and mumbling on about silent data corruption. Scary.