On Sat, Feb 9, 2019 at 3:08 PM brian m. carlson <sandals@xxxxxxxxxxxxxxxxxxxx> wrote: > [...] > Add a Makefile and #define knob, ICONV_NEEDS_BOM, that can be set if the > iconv implementation has this behavior. When set, Git will write a BOM > manually for UTF-16 and UTF-32 and then force the data to be written in > UTF-16BE or UTF-32BE. We choose big-endian behavior here because the > tests use the raw "UTF-16" encoding, which will be big-endian when the > implementation requires this knob to be set. The name ICONV_NEEDS_BOM makes it sound as if we must feed a BOM _into_ 'iconv', which is quite confusing since the actual intention is that 'iconv' doesn't emit a BOM and we need to make up for the deficiency. Using a name such as ICONV_OMITS_BOM or ICONV_NEGLECTS_BOM makes it somewhat clearer that there is some deficiency with which we need to deal. > Signed-off-by: brian m. carlson <sandals@xxxxxxxxxxxxxxxxxxxx> > --- > diff --git a/Makefile b/Makefile > @@ -259,6 +259,9 @@ all:: > +# Define ICONV_NEEDS_BOM if your iconv implementation does not write a > +# byte-order mark (BOM) when writing UTF-16 or UTF-32. Not a big deal, but I wonder if it would be helpful to tack on "..., in which case it outputs big-endian unconditionally." or something. > diff --git a/t/t0028-working-tree-encoding.sh b/t/t0028-working-tree-encoding.sh > @@ -6,6 +6,25 @@ test_description='working-tree-encoding conversion via gitattributes' > +test_lazy_prereq NO_UTF16_BOM ' > + test $(printf abc | iconv -f UTF-8 -t UTF-16 | wc -c) = 6 > +' > + > +test_lazy_prereq NO_UTF32_BOM ' > + test $(printf abc | iconv -f UTF-8 -t UTF-32 | wc -c) = 12 > +' > + > +write_utf16 () { > + test_have_prereq NO_UTF16_BOM && printf '\xfe\xff' > + iconv -f UTF-8 -t UTF-16 > + > +} Stray blank line before the closing brace. > + > +write_utf32 () { > + test_have_prereq NO_UTF32_BOM && printf '\x00\x00\xfe\xff' > + iconv -f UTF-8 -t UTF-32 > +} It's probably doesn't matter much with these two tiny functions, but I was wondering if it would make sense to maintain the &&-chain, perhaps like this: if test test_have_prereq NO_UTF32_BOM then printf '\x00\x00\xfe\xff' fi && iconv -f UTF-8 -t UTF-32