Re: [Json] BOMs

On Tue, Nov 19, 2013 at 4:31 AM, Bjoern Hoehrmann <derhoermi@xxxxxxx> wrote:
* Tatu Saloranta wrote:
>Dominant Java implementations support UTF-16 with BOM; either directly or
>through Java's Reader implementations that handle BOMs.
>String concatenation case seems irrelevant, since BOMs are not included in
>in-memory representation anyway, as opposed to byte stream serialization.

HTTP implementations cannot correctly determine whether an entity body
is text in a single character encoding and if so what that encoding is,
accordingly the dominant API deals in byte[] arrays, not text Strings;
furthermore, many programming languages default to byte[] arrays for
string literals. That often combines into forms of

  byte[] json = sprintf('{"x": %s, "y": %s}', GET(...), GET(...));

which works fine if all three byte[] arrays are UTF-8 encoded and use
no Unicode signature, which is the case 99% of the time.
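
For the remaining cases, a minimal Java sketch of what goes wrong. The class BomConcatDemo and its helpers are hypothetical names, and the two GET(...) results are stood in for by in-memory arrays; this only illustrates the byte-level splicing described above, not any particular HTTP API.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows a UTF-8 signature ending up mid-document
// when byte[] fragments are spliced together without inspection.
final class BomConcatDemo {
    // The UTF-8 encoding of U+FEFF, i.e. the "Unicode signature".
    static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    private BomConcatDemo() {}

    // Naive splice: appends the parts byte for byte, as sprintf on byte[] would.
    static byte[] concat(byte[]... parts) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] p : parts) {
            out.write(p, 0, p.length);
        }
        return out.toByteArray();
    }

    // Drops a leading UTF-8 signature if present; otherwise returns the input unchanged.
    static byte[] stripUtf8Bom(byte[] in) {
        if (in.length >= 3 && in[0] == UTF8_BOM[0] && in[1] == UTF8_BOM[1] && in[2] == UTF8_BOM[2]) {
            byte[] out = new byte[in.length - 3];
            System.arraycopy(in, 3, out, 0, out.length);
            return out;
        }
        return in;
    }

    // Assumed stand-in for an HTTP entity body that carries a signature.
    static byte[] bodyWithBom(String text) {
        return concat(UTF8_BOM, text.getBytes(StandardCharsets.UTF_8));
    }
}
```

Splicing bodyWithBom("1") between the bytes of `{"x": ` and `}` leaves U+FEFF in front of the value, which a JSON parser should reject; passing the body through stripUtf8Bom first yields `{"x": 1}`.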

My point was just that although many scripting languages may not handle BOMs properly, the same is not true on all platforms. Proper JSON APIs on the JVM accept both String and byte[] input; byte[] is preferred since it is more efficient, and the encoding can be auto-detected reliably, assuming that -- as per the JSON specification -- the only single-byte encoding used is UTF-8.
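
As a rough illustration of that auto-detection, here is a minimal Java sketch. It is not the code of any particular JSON library; the class JsonEncodingSniffer and its method names are hypothetical. It checks for a BOM first and otherwise falls back to the zero-byte pattern described in RFC 4627 section 3, which assumes the first two characters of the text are ASCII. Decoding UTF-32 additionally assumes the running JDK ships those charsets.

```java
import java.nio.charset.Charset;

// Hypothetical helper: guesses the Unicode encoding of a JSON byte[] input.
final class JsonEncodingSniffer {
    private JsonEncodingSniffer() {}

    // True if b begins with the given unsigned byte values.
    private static boolean startsWith(byte[] b, int... prefix) {
        if (b.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((b[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    // Length of a leading BOM in bytes, or 0 if there is none.
    static int bomLength(byte[] b) {
        // 4-byte BOMs first: the UTF-32LE BOM starts with the UTF-16LE BOM.
        if (startsWith(b, 0x00, 0x00, 0xFE, 0xFF) || startsWith(b, 0xFF, 0xFE, 0x00, 0x00)) return 4;
        if (startsWith(b, 0xEF, 0xBB, 0xBF)) return 3;
        if (startsWith(b, 0xFE, 0xFF) || startsWith(b, 0xFF, 0xFE)) return 2;
        return 0;
    }

    // Returns a charset name: BOM if present, else the RFC 4627 zero-byte pattern.
    static String detect(byte[] b) {
        if (startsWith(b, 0x00, 0x00, 0xFE, 0xFF)) return "UTF-32BE";
        if (startsWith(b, 0xFF, 0xFE, 0x00, 0x00)) return "UTF-32LE";
        if (startsWith(b, 0xEF, 0xBB, 0xBF)) return "UTF-8";
        if (startsWith(b, 0xFE, 0xFF)) return "UTF-16BE";
        if (startsWith(b, 0xFF, 0xFE)) return "UTF-16LE";
        if (b.length >= 4) {
            if (b[0] == 0 && b[1] == 0 && b[2] == 0) return "UTF-32BE"; // 00 00 00 xx
            if (b[1] == 0 && b[2] == 0 && b[3] == 0) return "UTF-32LE"; // xx 00 00 00
        }
        if (b.length >= 2) {
            if (b[0] == 0 && b[1] != 0) return "UTF-16BE"; // 00 xx
            if (b[0] != 0 && b[1] == 0) return "UTF-16LE"; // xx 00
        }
        return "UTF-8"; // no zero bytes up front: the only remaining candidate
    }

    // Decodes the input to a String, skipping any BOM.
    static String decode(byte[] in) {
        int skip = bomLength(in);
        return new String(in, skip, in.length - skip, Charset.forName(detect(in)));
    }
}
```

With this, JsonEncodingSniffer.decode(bytes) gives the same String whether the bytes arrived as UTF-8 or UTF-16, with or without a BOM, so the BOM never reaches the in-memory representation.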

-+ Tatu +-
