Re: [PATCH 1/3] update_unicode.sh: update the uniset repo if it exists

Junio C Hamano <gitster@xxxxxxxxx> · Mon, 12 Dec 2016 10:33:00 -0800

Torsten Bögershausen <tboegi@xxxxxx> writes:

> If I run ./update_unicode.sh on the latest master of
> https://github.com/depp/uniset.git , commit
> a5fac4a091857dd5429cc2d, I get a diff in unicode_width.h like
> this:
>
> -{ 0x0300, 0x036F },
>
> +{ 768, 879 },
>
> IOW, all hex values are printed as decimal values.
> Not a problem for the compiler, but for the human
> to check the unicode tables.
>
> So I think we should "pin" the version of uniset.

Sure, and I'd rather see the update_unicode.sh script moved
somewhere in contrib/ while we are at it.  Those who are interested
in keeping up with the Unicode standard are a tiny minority of the
developer population, and most of us would treat the built width
table as the source (after all, that is what we ship).

To be blunt, I'd rather not see update_unicode.sh download and
build uniset at all.  It's as if the po/ hierarchy shipped with its
own script to download and build msgmerge--that's madness.
Needless to say, shipping the sources for uniset embedded in our
project tree (either as a snapshot-fork or as a submodule) is even
worse.  Those who want to muck with po/ are expected to have
msgmerge and friends.  Why not expect the same of those who want to
update the Unicode width table?

I'd rather see written instructions in a README file, sitting next
to the update_unicode.sh script in the contrib/uniwidth/ directory,
for those who are interested in building the width table "from the
source".  The README would tell them which snapshot of uniset to
get, where to get it from, and to build it and place it on their
$PATH.  The update_unicode.sh script can then simply assume that
uniset is available.
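
Something along these lines near the top of the script would be
enough.  This is only a rough sketch: the require_uniset name and
the wording of the message are made up for illustration, and it
assumes the tool is installed on $PATH under the name "uniset".

```shell
#!/bin/sh
# Hypothetical helper (name and message are illustrative only):
# succeed if an executable called "uniset" is found on $PATH,
# otherwise complain on stderr and return non-zero.
require_uniset () {
	if ! command -v uniset >/dev/null 2>&1
	then
		echo >&2 "uniset not found on \$PATH; see the README for how to get and build it"
		return 1
	fi
}
```

The script would call "require_uniset || exit 1" before it starts
generating the table, and would never download or build anything
itself.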