--On Saturday, December 2, 2017 14:20 -0500 Phillip Hallam-Baker
<phill@xxxxxxxxxxxxxxx> wrote:

> I think we could use a small piece of technology to make this
> sort of thing easier to manage.
>
> The thing about formats is that if one person uses the format,
> the means of processing can be lost. But if you can get any
> sizable number of people to use a format, maybe as few as
> 10,000, and sufficient information encoded in it, then there
> will be sufficient incentive to decode it.

I think it depends on the time frame. Let me give two examples
within the life spans of at least many of the people reading
this list.

At one time, there was a widely used set of word processing and
calculation products made by Wang Corporation and its imitators.
The earliest generations that used demountable storage media
used 8 inch floppy disks; IIR, Wang led the charge for what
became the nominal 5.25 inch format because the eight inch ones
were just too big for the devices. Far more than 10,000 people
used them. I would guess that it would be rather hard to find a
way to read and interpret those disks today and that it isn't
getting any easier.

A perhaps more interesting example was a somewhat more recent
device called, more or less, the IBM Magnetic Card Selectric
Typewriter. The medium used had the same form factor as paper
punched cards, but had magnetic material on one side. It was
almost exactly what the name sounds like: a typewriter with a
memory device attached that permitted storing (and, more
important, correcting) a document on those magnetic cards, and
then playing them back on the typewriter mechanism, a mechanism
that was also used in a few popular computer terminals of the
period. The device could be seen as evolutionary from ASR
teletypes, but the advantages of magnetic media over punched
paper tape for text that needed to be corrected and revised
were obvious.
Even ignoring the question of bits dropping out, care to
speculate on devices to read those cards and decode the formats
(I don't recall if the latter were ever made public) today?

If you want to look at formats, rather than storage media,
consider WordStar or first-generation runoff and its non-U**x
immediate successors. Ever hear of "Compose"? Usage of each of
those things almost certainly far exceeded your 10K. If one can
recover the bits (want to borrow a spare 6250 BPI 9 track tape
drive or the earlier and lower density 7 and 9 track versions?),
those formats are recoverable because they are basically known
character encodings of the text (many, but not all, ISO 646
variations) with a relatively small amount of format markup that
is easily parsed out.

What all of the above share is that the most stable and
efficient way of recovering and using the documents they
produced is to find paper copies of their output and scan them
rather than trying to fuss with the magnetic media and native
formats. It is not a coincidence that, when the community set
out to recover the earlier RFCs and make them machine-readable
-- documents of which many had been prepared using a variety of
those early systems -- the mechanisms used were "scanning" and
"typing in and proofreading", not attempts at recovery of the
digital forms. And all of that happened within about 50 years,
or less.

I do believe we are getting better at this sort of thing, as
John and others have pointed out. However, I think the issues
need to be taken seriously and that neither use by a lot of
people nor procedures to periodically recopy or widely
distribute copies of the bits is sufficient, even though they
may be important steps or tools. As to standard and widely-used
formats, I continue to believe in them too, but the success
record so far is not grounds for long-term optimism.

best,
   john