This is an idea I've been kicking around for a week or so and finally found the time to write down. I'm circulating this for comments and to see if anybody knows of any existing service like this or project underway of this nature. I'm not seriously thinking of doing this myself--I can think of endless administrative and legal unpleasantnesses, and I've just about used up my life's quota for those--but more to see if there's something I've overlooked which would make it impractical.
The Foresight Institute's CritSuite software goes a long way to graft one of the missing pieces of open hypertext onto the Web. While scanning my site for broken links to other sites and fixing them where possible, muttering about the ephemerality of information on the Web, I wondered whether it might be possible to provide another aspect of Xanadu as well, with a retrofit of an organisational, not technological, nature.
It's irritating having to fix broken links, but what really bothers me is reading a paper in Science or Nature in which the essential data (for example, the nucleotide sequence of the mutation responsible for a particular defective protein) have been banished to a footnote which gives a URL. How likely do you think it is that somebody reading that paper 20 years from now will be able to follow that link? Yet the whole purpose of the paper was to document the sequence.
Suppose somebody set up a Data Immortality Foundation or, viewed from the flip side, a "Data Cemetery"? This would be a non-profit foundation chartered in an innocuous and stable jurisdiction (such as the Channel Islands), which operated a number of well-connected and redundant Web servers on every continent; the original endowment of the foundation would provide the funds for establishing these machines. Geographical redundancy provides optimal global accessibility and guards data against the occasional "bad asteroid day" and other natural and anthropogenic disasters. In order to immortalise a particular set of data (any arbitrary string of bits whatsoever), you submit it to a Foundation server along with a payment based on the length of the data, equivalent to the page charges of a printed journal. Oh yes, you get to supply a file name along with the data--anything you like--it's just a string of bits as well.
After validating the payment transaction, the Foundation assigns a unique key to the submitted data, mirrors it to all servers on all continents, backs it up to an archival write-once medium at each site, and after that medium is dismounted from the drive and stored offsite, returns a URL ticket to the submitter, which might look something like:
The payment to immortalise a set of data would be calculated so as to provide, in the sense of the price of a cemetery plot, "perpetual care"--it would be a one-time payment adequate to amortise the Foundation's cost to maintain the data at the assigned URL forever, regardless of what changes in technology may occur over the centuries. (Of course, the Foundation cannot guarantee the syntax of URLs, etc. will remain constant, but it can provide a mapping from old names to new names based on the assigned ticket number and submitted file name.)
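The ticket-and-mapping idea can be sketched in a few lines. This is only an illustration under assumptions of my own: the class name `TicketRegistry`, the sequential ticket numbers, the SHA-256 integrity check, and the `base/ticket/filename` URL layout are all hypothetical and are not a description of any actual ticket format.

```python
import hashlib
from urllib.parse import quote, unquote


class TicketRegistry:
    """Toy registry mapping (ticket number, file name) to immortalised data."""

    def __init__(self, base_url):
        self.base_url = base_url   # hypothetical; expected to change over time
        self._next_ticket = 1      # assumption: tickets are simply sequential
        self._records = {}         # (ticket, filename) -> (sha256 hex digest, data)

    def immortalise(self, filename, data):
        """Store the bits and return the newly assigned ticket number."""
        ticket = self._next_ticket
        self._next_ticket += 1
        digest = hashlib.sha256(data).hexdigest()
        self._records[(ticket, filename)] = (digest, data)
        return ticket

    def current_url(self, ticket, filename):
        """Build the URL for a record under whatever naming scheme is current."""
        if (ticket, filename) not in self._records:
            raise KeyError(f"no record for ticket {ticket}, file {filename!r}")
        # The file name is an arbitrary string, so percent-encode all of it.
        return f"{self.base_url}/{ticket}/{quote(filename, safe='')}"

    def resolve_old_url(self, old_url):
        """Map a URL issued under an earlier base to the current one."""
        # Assumes old URLs also ended in /<ticket>/<percent-encoded name>.
        ticket_text, quoted_name = old_url.rsplit("/", 2)[-2:]
        return self.current_url(int(ticket_text), unquote(quoted_name))

    def fetch(self, ticket, filename):
        """Return the stored bits, refusing to serve them if they have changed."""
        digest, data = self._records[(ticket, filename)]
        if hashlib.sha256(data).hexdigest() != digest:
            raise ValueError("stored data no longer matches its digest")
        return data
```

The point of the sketch is that only the ticket number and the submitted file name need to be permanent; everything else in the URL can be rewritten when the naming conventions of the day change.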
Since storage, processing, and communications costs will surely fall over time, one can be confident that a finite present-day payment can be amortised over an indefinite future: if each year's cost of keeping the data is a fixed fraction smaller than the last, the total over all years converges to a finite sum. The price the Foundation charges to immortalise data would itself be expected to decline over the years, perhaps discontinuously in the face of, say, the advent of diamond storage cubes.
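To make the "perpetual care" arithmetic concrete, here is a rough sketch of how a one-time price might be computed. It assumes, purely for illustration, that the yearly cost of keeping a byte falls by a constant fraction each year and that the endowment earns a constant real return; the function name and all parameters are hypothetical, and real costs would certainly not decline so smoothly.

```python
def perpetual_care_price(size_bytes, first_year_cost_per_byte,
                         annual_cost_decline, real_return=0.0):
    """One-time payment covering storage forever, as a geometric series.

    Year n costs  size * c * (1 - decline)**n  and is discounted by
    (1 + real_return)**n, so the total is  size * c / (1 - ratio)  with
    ratio = (1 - decline) / (1 + real_return).
    """
    if not 0 <= annual_cost_decline < 1:
        raise ValueError("annual_cost_decline must be in [0, 1)")
    if real_return < 0:
        raise ValueError("real_return must not be negative")
    ratio = (1 - annual_cost_decline) / (1 + real_return)
    if ratio >= 1:
        # Costs never fall and the endowment earns nothing: no finite price.
        raise ValueError("series does not converge; no finite price exists")
    return size_bytes * first_year_cost_per_byte / (1 - ratio)
```

Under these assumptions, a 20% yearly decline in cost with no investment return gives a price of about five times the first-year cost; a slower decline or a faster one moves the multiple up or down accordingly.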
The Foundation's servers would explicitly not attempt to play the role of a high-performance ISP; there would be some form of explicit limitation on the number of downloads of a file, or bytes per day, or maximum bandwidth delivered, or something, which one might be able to increase by paying a premium. This would have to be carefully thought out to prevent runaway communications costs or unacceptably poor response. The Foundation would be free to loosen the restrictions as it saw fit based on the evolution of communication technology and cost.
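One of the simplest forms such a limitation could take is a per-file cap on bytes served per day. The sketch below is only one possible policy among those listed above; the class name `DailyQuota`, the idea of a purchasable per-ticket allowance, and the choice to refuse a request outright rather than slow it down are assumptions made for illustration.

```python
class DailyQuota:
    """Caps the bytes served per ticket per day (hypothetical policy)."""

    def __init__(self, default_bytes_per_day):
        self.default_bytes_per_day = default_bytes_per_day
        self._premium = {}   # ticket -> larger allowance bought by the submitter
        self._used = {}      # (ticket, day) -> bytes already served that day

    def buy_premium(self, ticket, bytes_per_day):
        """Record a paid-for increase in one ticket's daily allowance."""
        self._premium[ticket] = bytes_per_day

    def try_serve(self, ticket, size, day):
        """Return True and count the bytes if the download fits today's quota."""
        limit = self._premium.get(ticket, self.default_bytes_per_day)
        used = self._used.get((ticket, day), 0)
        if used + size > limit:
            return False
        self._used[(ticket, day)] = used + size
        return True
```

Here `day` can be any hashable label for the current day, such as a `datetime.date`. Loosening the restrictions later, as communication costs fall, amounts to raising `default_bytes_per_day`.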
The Foundation could also engage in silly but high-profile stunts like buying piggyback space on GEO comsats for ultra-high-density, radiation-hard backups of the database every five or ten years, just in case the Earth happened to be destroyed. The lunar and Martian mirror sites will have to await technological progress in several areas.
In addition to preserving archival scientific data, I expect the Foundation would appeal to clients of vanity publishing houses, to anybody who's had their ISP go out of business or merge, and, in time, as the Foundation's reputation grew and appropriate security measures were put in place (encryption on the customer side, not by the Foundation), to companies seeking an Iron Mountain-like repository for corporate records.
Too crazy? Crazy enough to work?