Re: [PATCH v3 08/12] ext2fs: nls: Support UTF-8 11.0 with NFKD normalization

"Theodore Y. Ts'o" <tytso@xxxxxxx> · Fri, 30 Nov 2018 11:53:17 -0500

On Mon, Nov 26, 2018 at 05:19:45PM -0500, Gabriel Krisman Bertazi wrote:
> From: Gabriel Krisman Bertazi <krisman@xxxxxxxxxxxxxxx>
> 
> We need this such that we can do normalization and casefolding
> compatible with the kernel, in order to properly support fsck
> verification and rehashing.
> 
> The UTF-8 11.0 implementation is copied and adapted from the kernel code
> to ensure maximum compatibility.  The decode trie in utf8data.h is
> generated using a script and the UCD sources in the kernel code.
> 
> Signed-off-by: Gabriel Krisman Bertazi <krisman@xxxxxxxxxxxxxxx>

One more thought.  Are there any test cases we can add here?  I assume
the SGI folks must have had some test code that they used when they
were developing their trie code.  Was any of that released?

Maybe there are some Unicode normalization and case folding test
vectors we can grab?
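
For instance, the Unicode Character Database ships a NormalizationTest.txt
file whose data lines carry five semicolon-separated columns of hex code
points (source, NFC, NFD, NFKC, NFKD).  A rough sketch of turning one column
of such a line into a UTF-8 string that a test harness could feed to the
normalization code might look like the following.  The helper names
encode_utf8() and column_to_utf8() are hypothetical; they are not part of
e2fsprogs or the kernel, and this is only an illustration under those
assumptions:

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: encode one code point as UTF-8.
 * Returns the number of bytes written (1-4), or 0 if cp is a
 * surrogate or lies outside the Unicode range.
 */
static size_t encode_utf8(unsigned long cp, unsigned char *out)
{
	if (cp >= 0xD800 && cp <= 0xDFFF)
		return 0;
	if (cp < 0x80) {
		out[0] = (unsigned char)cp;
		return 1;
	}
	if (cp < 0x800) {
		out[0] = (unsigned char)(0xC0 | (cp >> 6));
		out[1] = (unsigned char)(0x80 | (cp & 0x3F));
		return 2;
	}
	if (cp < 0x10000) {
		out[0] = (unsigned char)(0xE0 | (cp >> 12));
		out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
		out[2] = (unsigned char)(0x80 | (cp & 0x3F));
		return 3;
	}
	if (cp <= 0x10FFFF) {
		out[0] = (unsigned char)(0xF0 | (cp >> 18));
		out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
		out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
		out[3] = (unsigned char)(0x80 | (cp & 0x3F));
		return 4;
	}
	return 0;
}

/*
 * Hypothetical helper: convert column `col` (0-based) of one
 * NormalizationTest.txt data line into a NUL-terminated UTF-8 string.
 * Returns 0 on success, -1 on a malformed line (including comment and
 * "@Part" lines) or if buf is too small.
 */
static int column_to_utf8(const char *line, int col, char *buf, size_t buflen)
{
	const char *p = line;
	size_t len = 0;

	if (buflen == 0)
		return -1;

	/* Skip the columns that precede the requested one. */
	while (col-- > 0) {
		p = strchr(p, ';');
		if (!p)
			return -1;
		p++;
	}

	/* Parse space-separated hex code points up to the closing ';'. */
	for (;;) {
		unsigned char tmp[4];
		unsigned long cp;
		char *end;
		size_t n;

		while (*p == ' ')
			p++;
		if (*p == ';')
			break;
		cp = strtoul(p, &end, 16);
		if (end == p)
			return -1;
		n = encode_utf8(cp, tmp);
		if (n == 0 || len + n >= buflen)
			return -1;
		memcpy(buf + len, tmp, n);
		len += n;
		p = end;
	}
	buf[len] = '\0';
	return 0;
}
```

A test for NFKD could then convert column 0 (the source) and column 4 (the
expected NFKD form) of each data line, run the source string through the
normalization code from this patch, and compare the result byte-for-byte
with the expected string.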

Thanks,

     		      	   	      	  - Ted