Re: [Json] BOMs (Was: Re: JSON: remove gap between Ecma-404 and IETF draft)

Pete Cordell scripsit:

> Do you mean that the presence of a UTF-8 BOF sequence doesn't prove
> that it's not Windows cp-1252 or do you mean you can tell apart a
> UTF-8 and cp-1252 file without BOMs?

I meant the latter, but the former is true, too.  A plain text document
beginning "ï»¿" in Windows-1252 will appear to begin with a UTF-8 BOM
in the absence of out-of-band information.
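
The ambiguity can be checked directly. The following is a minimal Python
sketch, using only the standard codecs; the variable names are
illustrative. It decodes the same three bytes (EF BB BF) both ways, and
both decodings succeed, so the bytes alone do not settle the encoding.

```python
# The three bytes that encode U+FEFF in UTF-8.
BOM_BYTES = b"\xef\xbb\xbf"

# Read as Windows-1252, they are three ordinary printable characters.
as_cp1252 = BOM_BYTES.decode("cp1252")

# Read as UTF-8, they are the single code point U+FEFF (the BOM).
# Note: the plain "utf-8" codec keeps the BOM; "utf-8-sig" would strip it.
as_utf8 = BOM_BYTES.decode("utf-8")

print(repr(as_cp1252))
print(repr(as_utf8))
```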

> If the latter, do the relevant tools take the time to distinguish
> the 2 without BOMs?

Some tools do, some don't.  The IRC client I use, XChat, attempts to
convert input as UTF-8, and if that fails, converts it as Latin-1.
I have not yet seen it produce mojibake.
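
The try-UTF-8-then-fall-back approach can be sketched as below. This is
not XChat's actual code, only an illustration of the strategy in Python;
the function name decode_with_fallback is hypothetical.

```python
def decode_with_fallback(raw: bytes) -> str:
    """Decode as UTF-8 if the bytes are valid UTF-8, otherwise as Latin-1."""
    try:
        # Strict decoding: any malformed sequence raises UnicodeDecodeError.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a code point, so this cannot fail.
        return raw.decode("latin-1")


print(decode_with_fallback("café".encode("utf-8")))    # valid UTF-8 input
print(decode_with_fallback("café".encode("latin-1")))  # falls back to Latin-1
```

The approach tends to work because non-ASCII Latin-1 text is rarely also
well-formed UTF-8: a lone high byte such as E9 must be followed by
continuation bytes to be valid UTF-8.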

-- 
John Cowan   cowan@xxxxxxxx  http://www.ccil.org/~cowan
Most languages are dramatically underdescribed, and at least one is
dramatically overdescribed.  Still other languages are simultaneously
overdescribed and underdescribed.  Welsh pertains to the third category.
        --Alan King



