[Please skip using Reply-To and instead use Mail-Followup-To so that
responses also go to the list.]

On Thu, Feb 07, 2019 at 10:59:35PM +0100, Kevin Daudt wrote:
> I'm trying to get the git test suite passing on Alpine Linux, which is
> based on musl libc.
>
> All tests in t0028-working-tree-encoding.sh are currently failing,
> because musl iconv does not support stateful output of UTF-16/32 (e.g.,
> it does not output a BOM), while git is expecting that to be present:
>
> > hint: The file 'test.utf16' is missing a byte order mark (BOM). Please
> > use UTF-16BE or UTF-16LE (depending on the byte order) as
> > working-tree-encoding.
> > fatal: BOM is required in 'test.utf16' if encoded as utf-16
>
> Because adding the file to git fails, all the other tests fail as well
> as they expect the file to be present in the repository.
>
> Any idea how to get around this?

I think musl needs to patch their libc. RFC 2781 says that if there's no
BOM in UTF-16, then "the text SHOULD be interpreted as being
big-endian." Unfortunately for all of us, many Windows-based programs
have chosen to ignore that advice (technically, it's only a SHOULD) and
interpret it as little-endian instead.

Git can't safely assume anything about the endianness of a UTF-16 stream
that doesn't contain a BOM. Technically, since the RFC doesn't specify a
MUST requirement, musl can't, either.

Even if Git were to produce a BOM to work around this issue, we'd still
have the problem that any program using musl will write data in UTF-16
without a BOM. Moreover, because musl, in violation of the RFC, doesn't
read and process BOMs, someone using little-endian UTF-16 (with a proper
BOM) with musl and Git will have their data corrupted, according to my
reading of the musl website.

In other words, I believe this test is failing legitimately.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204
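
To illustrate why a BOM-less UTF-16 stream is ambiguous, here is a
minimal Python sketch. It is not Git's or musl's code; the helper name
utf16_byte_order is made up for this example, and it only looks at the
two-byte UTF-16 BOMs.

```python
import codecs


def utf16_byte_order(data: bytes):
    """Return "le" or "be" based on a leading BOM, or None if there is none.

    Hypothetical helper for illustration only; it does not try to tell
    UTF-16 apart from UTF-32.
    """
    if data.startswith(codecs.BOM_UTF16_LE):
        return "le"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "be"
    return None


# Big-endian bytes without a BOM: nothing in the stream states the order.
no_bom = "hi".encode("utf-16-be")

# Little-endian bytes with an explicit BOM in front.
with_bom = codecs.BOM_UTF16_LE + "hi".encode("utf-16-le")

print(utf16_byte_order(no_bom))
print(utf16_byte_order(with_bom))

# The same BOM-less bytes decode without error under either assumption,
# but only one of the two results is the text that was written.
print(no_bom.decode("utf-16-be") == "hi")
print(no_bom.decode("utf-16-le") == "hi")
```

Both decodes of the BOM-less bytes succeed, so a reader that guesses the
wrong byte order gets different text rather than an error, which is why
guessing isn't safe.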