Re: [PATCH RFC 03/13] charsets: utf8: Add unicode character database files

Gabriel Krisman Bertazi <krisman@xxxxxxxxxxxxxxx> · Sat, 13 Jan 2018 02:28:05 -0200

Theodore Ts'o <tytso@xxxxxxx> writes:

> On Fri, Jan 12, 2018 at 05:12:24AM -0200, Gabriel Krisman Bertazi wrote:
>> From: Olaf Weber <olaf@xxxxxxx>
>> 
>> Add files from the Unicode Character Database, version 7.0.0, to the source.
>> A helper program that generates a trie used for normalization from these
>> files is part of a separate commit.
>
> It looks like the latest version of Unicode is 10.0.0.  Once we pick a
> Unicode version, changing will be painful; but in the absence of
> interop requirements, is there a reason to stick with Unicode 7?  Why
> not take the latest version of Unicode and then freeze on it?
>

Hi Ted,

No, there isn't a specific reason for Unicode 7, and I forgot to mention
this in my cover letter.  I have successfully generated the data file
for 10.0.0 with the mkutf8data script, but I haven't been able to fully
validate it yet.  I walked through the changelogs to make sure all
relevant changes were there, but I'm not done yet.  You can definitely
expect new versions of the patchset to support 10.0.0.

Thanks,

-- 
Gabriel Krisman Bertazi