--On Monday, 20 February, 2006 17:25 +0100 Peter Dambier <peter@xxxxxxxxxxxxxxxx> wrote:

> Once upon a time there used to be computers speaking ASCII or
> EBCDIC. The ASCII computers were unix mostly.
>...

Actually, Peter, at the time FTP was designed, the "ASCII computers" were mostly PDP-10s running Tenex or ITS, plus less than a handful of Multics machines. Unix wasn't really a major presence on the network yet. And that is important because the PDP-10 and Multics machines were both 36-bit environments, neither of which stored 7-bit ASCII in octets. The PDP-10 normally used five characters per word, with the ASCII in 7 bits (a very hard environment for UTF-8 or even "Latin-1"), and Multics normally stored ASCII right-justified in a nine-bit field with the two leading bits set to zero. So "convert to network ASCII" was a non-trivial operation for almost everyone until ASCII-native 32-bit machines started to become prominent on the network -- among other things, we needed it to get back and forth between Multics, ITS, and Tenex systems, all of which were more or less ASCII-based.

You are correct about the EBCDIC character conversions. I've lost my memory of the state of "virtual card decks" at the time, but my vague recollection is that text files of the type that were likely to be transmitted over the network were at least as likely to be stored in files with variable-length records (lengths determined by counts, rather than character delimiters) as in fixed-length 80 (or 72) character records.

> Or you could print it directly on an ASR-33 terminal.

My (also vague) recollection is that the ASR-33 terminal was upper-case-only and hence could not fully support ASCII. I do remember (fondly, except for the racket and speed) some KSR-38s and maybe ASR-38s that were ASCII devices.

> As the ASR-33 terminal did not know UTF-8 it is not a good
> idea to use ASCII mode for UTF-8. But you can send it only to
> systems that understand UTF-8.

Sandeep's question raises another interesting issue. I just went back and reread RFC 2640. It does not seem to address the "TYPE A" issue at all. It does say (Section 2, paragraph 1) "Clients and servers are, however, under no obligation to perform any conversion on the contents of a file for operations such as STOR or RETR", which I would take to imply that it anticipates I18N FTP operations to be entirely binary ("TYPE I"), although that is not explicit. Whether the characters in use are UTF-8 or not, we've still got that issue with line-endings.

Based on the many things we have learned about internationalization in the last half-dozen years -- demonstrated by the changes from Unicode 2.0 through 5.x, the introduction of IDNA, the "LTRU" work on language tagging, and a growing feeling in some quarters that RFC 2277 may have been a bit naive in some ways -- it is probably time to revisit 2640. Not only may some of its requirements not be quite right, but it may be time to invent a "TYPE U" that would process and transmit UTF-8 on the same basis that RFC 959's "TYPE A" does for ASCII: mandatory conversion to that form if needed and CRLF line-endings.

    john
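
For concreteness, here is a minimal sketch -- Python, purely illustrative, not drawn from any historical code, and with function names invented for the example -- of the PDP-10 convention described above: five 7-bit characters packed left-justified into a 36-bit word with the low-order bit unused, which had to be unpacked into one character per octet before anything like "network ASCII" could go on the wire.

# Illustrative only: the usual PDP-10 text convention packed five 7-bit
# ASCII characters left-justified in a 36-bit word, leaving the low-order
# bit unused.  "Network ASCII" wants one character per octet, so every
# word had to be unpacked (and repacked on receipt).

def pack_pdp10_word(five_chars: bytes) -> int:
    """Pack exactly five 7-bit characters into one 36-bit word."""
    assert len(five_chars) == 5
    word = 0
    for c in five_chars:
        word = (word << 7) | (c & 0x7F)
    return word << 1          # low-order bit left as zero

def unpack_pdp10_word(word: int) -> bytes:
    """Recover the five 7-bit characters as one-per-octet bytes."""
    word >>= 1                # discard the unused low-order bit
    return bytes((word >> (7 * i)) & 0x7F for i in reversed(range(5)))

assert unpack_pdp10_word(pack_pdp10_word(b"HELLO")) == b"HELLO"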
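
And to make the closing suggestion concrete: a hypothetical "TYPE U" -- a sketch of the idea only, since no such type exists in RFC 959 or RFC 2640, and these function names are likewise invented -- would do for UTF-8 what TYPE A does for ASCII: convert local line-endings to CRLF and the text to UTF-8 on the way out, and reverse both on the way in.

# Sketch only: wire form is UTF-8 with CRLF line-endings, parallel to the
# NVT-ASCII form mandated by RFC 959's TYPE A.

def type_u_to_wire(text: str) -> bytes:
    # Normalize whatever the local convention is (LF, CR, or CRLF) to LF,
    # then emit CRLF line-endings and UTF-8 on the wire.
    local = text.replace("\r\n", "\n").replace("\r", "\n")
    return local.replace("\n", "\r\n").encode("utf-8")

def type_u_from_wire(data: bytes) -> str:
    # Decode UTF-8 and map wire CRLF back to the local convention
    # (a Unix-style LF is assumed here).
    return data.decode("utf-8").replace("\r\n", "\n")

assert type_u_from_wire(type_u_to_wire("eins\nzwei\n")) == "eins\nzwei\n"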