Pete Cordell scripsit:

> Do you mean that the presence of a UTF-8 BOM sequence doesn't prove
> that it's not Windows cp-1252 or do you mean you can tell apart a
> UTF-8 and cp-1252 file without BOMs?

I meant the latter, but the former is true, too. A plain text document
beginning "ï»¿" in Windows-1252 will appear to begin with a UTF-8 BOM
in the absence of out-of-band information.

> If the latter, do the relevant tools take the time to distinguish
> the 2 without BOMs?

Some tools do, some don't. The IRC client I use, XChat, attempts to
convert input as UTF-8, and if that fails, converts it as Latin-1.
I have not yet seen it produce mojibake.

-- 
John Cowan cowan@xxxxxxxx http://www.ccil.org/~cowan
Most languages are dramatically underdescribed, and at least one is
dramatically overdescribed. Still other languages are simultaneously
overdescribed and underdescribed. Welsh pertains to the third category.
        --Alan King
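
For illustration, the "try UTF-8, and if that fails, fall back to Latin-1" approach mentioned above can be sketched in a few lines of Python. This is only a sketch, not XChat's actual code: the function name decode_with_fallback is hypothetical. The last two lines show the BOM ambiguity: the three bytes of the UTF-8 BOM also decode cleanly as Windows-1252 text.

```python
import codecs
from typing import Tuple


def decode_with_fallback(data: bytes) -> Tuple[str, str]:
    """Hypothetical helper: decode as strict UTF-8, else as Latin-1.

    Returns the decoded text and the name of the encoding that was used.
    Latin-1 maps every byte to a code point, so the fallback cannot fail.
    """
    try:
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        return data.decode("latin-1"), "latin-1"


# The UTF-8 BOM bytes are also three valid Windows-1252 characters.
bom_bytes = codecs.BOM_UTF8               # b"\xef\xbb\xbf"
as_cp1252 = bom_bytes.decode("cp1252")    # the string "ï»¿"
```

The fallback works well in practice because non-ASCII Latin-1 text rarely happens to form valid UTF-8 byte sequences, so a successful strict UTF-8 decode is strong (though not conclusive) evidence that the input really is UTF-8.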