Posted on Thursday, December 11th, 02008 by Kevin Kelly
Digital continuity is a real problem. Digital information is very easy to copy within short periods of time, but very difficult to copy over long periods of time. That is, it is very easy to make lots of copies now, but very difficult to get the data to copy over a century. For two reasons:

1) Formats change. Because of rapid technological evolution, the “language” that a storage medium speaks can become obsolete (incomprehensible) in only a few years. Or the hardware that speaks that language becomes so rare that the data cannot be accessed. Who can read the data on ten-year-old floppy disks?

2) The storage medium itself can decay. Turns out that paper is much more stable over the long term than most digital media. Magnetic surfaces flake, peel, shatter. And the supposedly durable CDs and DVDs aren’t very stable either.

As an example of the latter, here’s New York Times tech guru David Pogue lamenting the unadvertised short lifespans of homemade DVDs:

I’ve got all of the original iMovie projects backed up on DVD, in clear cases, neatly arrayed in a drawer next to my desk.

Guess what? On the Mac I use for video editing, most of the DVD’s were unreadable. They’re less than four years old!

I know, of course, that home-burned DVD’s, which rely on organic dye that deteriorates with time, are nowhere near as long-lived as commercially pressed discs. But man. Four years? Scared the bejeezus out of me.

OK, listen up people!

The only way to archive digital information is to keep it moving. I call this movage instead of storage. Proper movage means transferring the material to current platforms on a regular basis, that is, before the old platform completely dies and the transfer becomes hard to do. This movic rhythm of refreshing content should be as smooth as a respiratory cycle: in, out, in, out. Copy, move, copy, move.
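To make the copy-move rhythm concrete, here is a minimal sketch of one "breath" of movage in Python: copy every file from the old medium to the new one, then check that each copy is bit-for-bit identical by comparing SHA-256 checksums. The names `move_archive` and `sha256_of`, and the idea of verifying with checksums, are my illustrative assumptions rather than anything prescribed above.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def move_archive(old_root: Path, new_root: Path) -> list[Path]:
    """Copy every file under old_root into new_root (hypothetical helper).

    Returns the source files whose copies did not verify, so an empty
    list means the whole archive arrived intact.
    """
    failed = []
    for source in sorted(old_root.rglob("*")):
        if not source.is_file():
            continue
        target = new_root / source.relative_to(old_root)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, target)  # copies contents plus timestamps
        if sha256_of(source) != sha256_of(target):
            failed.append(source)
    return failed
```

Calling `move_archive(Path("/old/disc"), Path("/new/drive"))` (both paths are placeholders) and getting back an empty list is the signal that the old medium can be retired. Anything in the list should be recopied while the old platform still works.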

In other words, anything you want moved to the future has to be given attention to keep it moving forward.

We don’t know what the natural movage respiration cycle is for digital media yet since it is still very new, but I suspect the cycle is much shorter than we think. I would guess it is 5 years. No matter what digital format you have your precious data stored on, you should expect to move it onto new media in five years, and again five years after that, forever!
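If you want to hold yourself to that schedule, the arithmetic is easy to automate. This is a small sketch, assuming the five-year guess above as the default cycle; `next_move_due` and `is_due` are made-up names for illustration, and the cycle length is a guess, not a measured figure.

```python
from datetime import date

CYCLE_YEARS = 5  # the guess from the post, not a measured lifespan


def next_move_due(last_moved: date, cycle_years: int = CYCLE_YEARS) -> date:
    """Return the date by which the archive should be moved again."""
    try:
        return last_moved.replace(year=last_moved.year + cycle_years)
    except ValueError:
        # Feb 29 has no counterpart in a non-leap year; fall back to Feb 28.
        return last_moved.replace(year=last_moved.year + cycle_years, day=28)


def is_due(last_moved: date, today: date, cycle_years: int = CYCLE_YEARS) -> bool:
    """True once the movage cycle has elapsed since the last move."""
    return today >= next_move_due(last_moved, cycle_years)
```

An archive last moved on the day of this post, `date(2008, 12, 11)`, would come due on `date(2013, 12, 11)`, then again five years later, and so on.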

Move it, move it, move it.

  • Zane Selvans

    This is absolutely, desperately true, and it’s why the data-silo businesses are an unmitigated information preservation disaster. How do you get your precious data out of iPhoto or Flickr, or your legacy proprietary blogging software that you started using before WordPress or TypePad took off? A lot of “movage” is impossible without standardized data interchange formats, but a lot of information-based businesses are built around the immovable silo model… not the least of which is academic publishing! This isn’t even necessarily about the format of the data itself, which is often standardized text, video, images, etc., but about all of the “glue” that holds the data together into some coherent, meaningful whole. The metadata – places, times, annotations, authorship, identity, and the associations between them all. There are a few beacons of hope, but it’s not clear they’ll prevail (e.g. OpenID, OAuth, IntenseDebate, XFN, and a profligate mess of other XML based data structures), because the organizations that currently contain the data have no incentive to support them.

    Digital Dark Age indeed.

  • Sid

    This problem is well known and solutions are being sought; two of the more promising options are molecular storage and crystal storage.
    in molecular storage you take a molecule and make subtle changes, for example a long carbon nano tube….. this tube can be read along its length. use a normal C12 atom for say a zero and a C13 atom for 1.
    the second is crystal storage. here you grow a crystal and the patterns of deformities of the crystal make up your data.
    (think of the first as andromeda strain and the second as superman!)
    these methods will create storage that can last for eons.

  • Jonathan

    It’s nice to hear that there are some new data formats that are on the horizon, Sid. The problem remains, however, with formatting. Can you open a Word document from ten years ago? What about an image file from a thirty year old tape drive?

    Do you think anyone will know what a JPEG is in 10,000 years? .doc? .tiff? .psd? Etc.

    I’m more concerned about the data formatting problem than I am about the storage medium problem!

  • george

    @Jonathan: it is very likely that if a JPG image reaches anyone in 10,000 years, the JPG (and other file format) specs would be preserved alongside. the thing with digital is that data does not deteriorate as it’s copied, so effectively you can preserve a file indefinitely without requiring special conditions (esp. given movage). the reality, however, is that we’re expecting archival quality from a medium produced without such intention (the dvd).

    re data formatting, with virtual machines being everywhere nowadays i can’t see a problem reproducing a custom environment of any age and platform. the other thing is, there are recommendations for archival recordings and formats we should archive in.

    take a ubiquitous format such as PDF, for instance: we’re now able to open files created using a technology that’s more than 17 years old.

