“Toward Environmentally Sustainable Digital Preservation”

I’ve been working on a paper with Keith Pendergrass, Tim Walsh, and Laura Alagna that centers on the environmental impact of digital preservation, and it is now out in the Spring issue of The American Archivist.

This effort began at the BitCurator Users Forum 2017, where I heard Keith initially present on this subject. I’m very happy we’ve put together a longer work and I look forward to getting it in front of a larger audience. While other criteria for appraisal and retention of digital content have received significant consideration, environmental concerns have not yet factored into much of the discussion.

I will be discussing this and similar topics with fellow panelists at SAA 2019 – swing by Archivists Facing a Changing Climate if you’re in Austin this year!

Toward Environmentally Sustainable Digital Preservation (American Archivist) (CU Scholar) (Harvard)


Book Out: The No-nonsense Guide to Born-Digital Content

Book cover image: ISBN 9781783301959

I have a new book out with my colleague Heather Ryan, The No-nonsense Guide to Born-Digital Content.

I started drafting chapters for this book in late 2016 when Heather, then the head of the Archives here and now director of the department, approached me about coauthoring the title. I had never written in chapter form before, nor for a more general audience. Approaching my usual stomping ground of born-digital collection material from this vantage was really intriguing, so I jumped at the chance.

To back up a little, our subject here is collecting, receiving, processing, describing, and otherwise taking care of born-digital content for cultural heritage institutions. With that scope, we have oriented this book to students and instructors, as well as current practitioners aiming to start a born-digital strategy or improve an existing one. We’ve included lots of real-world examples to demonstrate points, and the book as a whole is designed to cover all aspects of managing born-digital content. We really do discuss everything from collecting policy and forensic acquisition to grabbing social media content and designing workflows. In other words, I’m hoping this provides a fantastic overview of the current field of practice.

Our title is part of Facet Publishing’s No-nonsense series, which provides an ongoing run of books on topics in information science. Facet in general is a great publisher in this space (if you haven’t checked out Adrian Brown’s Archiving Websites, I recommend it), and I’m happy to be a part of it. I thank them for their interest in the book and their immense help in getting it published!

Update: The book is now available stateside in the ALA store.

“Aggregating Temporal Forensic Data Across Archival Digital Media”

Last year I attended the Digital Heritage 2015 conference and presented a paper on digital forensics in the archive. The paper centers on collecting file timestamps across floppy disks into a single timeline to increase intellectual control over the material and to explore the utility of such a timeline for a researcher using the collection.

As I state in the paper, temporal forensic data likely constitutes the majority of forensic information acquired in archival settings, and in most cases this information is gathered inherently through the generation of a disk image. While we may expect further use of this data as disk images make their way to researchers as archival objects (and as the community’s software, institutional policies, and user expectations grow to support it), it is not too soon to explore how temporal forensic data can be used to support discovery and description, particularly in the case of collections with a significant number of digital media.
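
To make the idea concrete, here is a minimal sketch (my own, not the tooling described in the paper) of how per-disk timestamps might be pulled into one timeline. It assumes each imaged floppy has already been run through a DFXML-producing tool such as fiwalk, and that the resulting fileobject entries carry filename and mtime elements:

```python
# A rough sketch, not the code behind the paper: merge file modification
# times from several DFXML files (one per imaged floppy) into one timeline.
# Assumes DFXML as produced by a tool like fiwalk, where each <fileobject>
# carries <filename> and <mtime> (ISO 8601) children.
import sys
import xml.etree.ElementTree as ET

def local(tag):
    # Strip any XML namespace so element names can be matched generically.
    return tag.rsplit('}', 1)[-1]

def timeline(dfxml_paths):
    events = []
    for path in dfxml_paths:
        for _, elem in ET.iterparse(path):
            if local(elem.tag) != "fileobject":
                continue
            filename = mtime = None
            for child in elem:
                if local(child.tag) == "filename":
                    filename = child.text
                elif local(child.tag) == "mtime":
                    mtime = child.text
            if filename and mtime:
                events.append((mtime, path, filename))
            elem.clear()  # keep memory use modest on larger images
    return sorted(events)  # ISO 8601 strings sort chronologically

if __name__ == "__main__":
    for when, disk, name in timeline(sys.argv[1:]):
        print(when, disk, name, sep="\t")
```

Sorting the combined events chronologically is all it takes to see activity across an entire set of disks at once, which is the kind of aggregate view the paper argues can support description and discovery.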

Many thanks to the organizers of Digital Heritage 2015 for the support and feedback; it was a wonderful and very wide-reaching conference.

Aggregating Temporal Forensic Data Across Archival Digital Media (IEEE Xplore) (CU Scholar)

Repercussions of Amassed Data

I had the pleasure of meeting Mél Hogan while she was doing her postdoctoral work at CU Boulder. I think her research area is vital, though it’s difficult to summarize. But that won’t stop me, so here goes: investigating how one can “account for the ways in which the perceived immateriality and weightlessness of our data is in fact with immense humanistic, environmental, political, and ethical repercussions” (The Archive as Dumpster).

Data flows and water woes: The Utah Data Center is a good entry point for this line of inquiry. The article explores the above quoted concerns (humanistic, environmental, political, and ethical) at the NSA’s Utah Data Center, near Bluffdale. It has suffered outages and other operational setbacks since construction. These initial failures are themselves illuminating, but even assuming such disruptions are minimized in the future, the following excerpt clarifies a few of the material constraints of the effort:

Once restored, the expected yearly maintenance bill, including water, is to be $20 million (Berkes, 2013). According to The Salt Lake Tribune, Bluffdale struck a deal with the NSA, which remains in effect until 2021; the city sold water at rates below the state average in exchange for the promise of economic growth that the new waterlines paid for by the NSA would purportedly bring to the area (Carlisle, 2014; McMillan, 2014). The volume of water required to propel the surveillance machine also invariably points to the center’s infrastructural precarity. Not only is this kind of water consumption unsustainable, but the NSA’s dependence on it renders its facilities vulnerable at a juncture at which the digital, ephemeral, and cloud-like qualities are literally brought back down to earth. Because the Utah Data Center plans to draw on water provided by the Jordan Valley River Conservancy District, activists hope that a state law can be passed banning this partnership (Wolverton, 2014), thus disabling the center’s activities.

As hinted at in a previous post on Lanier, I often encounter a sort of breathlessness when cloud-based reserves of data and computational prowess are described. Reflecting on the material conditions of these operations, as well as on their inevitable failures and inefficiencies (e.g. the apparently beleaguered Twitter archive at the Library of Congress, though I would be more interested in learning about the constraints and stratagems of private operations), is a wise counterbalance that can help refocus discussion on the humanistic repercussions of such work. And to be sure, I would not exclude archives from that scrutiny.

Disk Imaging Workflow at BitCurator.net

Early in January I attended the first-ever BitCurator Users Forum in Chapel Hill. This was a fantastic day with a group of folks interested in the BitCurator project and digital forensics in an archive setting — definitely one of the most information-packed and directly applicable conferences or forums I’ve attended. I’m very much looking forward to next year’s.

I have a post on the BitCurator site on the disk imaging workflow I’m using with students presently, and there’s a great wrap-up of the day as well.

Checksumming till the cows come home

Jon Ippolito, from an interview with Trevor Owens at The Signal:

Two files with different passages of 1s and 0s automatically have different checksums but may still offer the same experience; for example, two copies of a digitized film may differ by a few frames but look identical to the human eye. The point of digitizing a Stanley Kubrick film isn’t to create a new mathematical artifact with its own unchanging properties, but to capture for future generations the experience us old timers had of watching his cinematic genius in celluloid. As a custodian of culture, my job isn’t to ensure my DVD of A Clockwork Orange is faithful to some technician’s choices when digitizing the film; it’s to ensure it’s faithful to Kubrick’s choices as a filmmaker.

Further:

As in nearly all storage-based solutions, fixity does little to help capture context. We can run checksums on the Riverside “King Lear” till the cows come home, and it still won’t tell us that boys played women’s parts, or that Elizabethan actors spoke with rounded vowels that sound more like a contemporary American accent than the King’s English, or how each generation of performers has drawn on the previous for inspiration. Even on a manuscript level, a checksum will only validate one of many variations of a text that was in reality constantly mutating and evolving.

In my own preoccupation with disk imaging, generating checksums, and storing them on servers, I forget that at best this is the very beginning of preservation, not an incontestable “ground truth” of the artifact.
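
A toy illustration of Ippolito’s point (my own example, not his): two byte streams that a viewer might experience as identical still yield entirely different digests, and neither digest says anything about context.

```python
# A toy illustration, not drawn from the interview: two payloads that differ
# by a single byte produce completely different checksums, while the checksum
# itself says nothing about provenance, performance, or experience.
import hashlib

copy_a = b"\x00" * 1024             # stand-in for a stretch of digitized film
copy_b = b"\x00" * 1023 + b"\x01"   # "identical to the human eye," different bits

for label, data in (("copy A", copy_a), ("copy B", copy_b)):
    print(label, hashlib.sha256(data).hexdigest())
```

Fixity checks like this remain essential for detecting change in the bitstream; the caution is only that they are the floor of preservation work, not its ceiling.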

Simulation Fever

From Persuasive Games by Ian Bogost:

Previously, I have argued that videogames represent in the gap between procedural representation and individual subjectivity. The disparity between the simulation and the player’s understanding of the source system it models creates a crisis in the player; I named this crisis simulation fever, a madness through which an interrogation of the rules that drive both systems begins. The vertigo of this fever — one gets simsick as he might get seasick — motivates criticism.

Procedural rhetoric also produces simulation fever. It motivates a player to address the logic of a situation in general, and the point at which it breaks and gives way to a new situation in particular.