On Thu, Sep 18, 2014 at 03:15:19PM -0500, Ben Myers wrote:
> From: Olaf Weber <olaf@xxxxxxx>
>
> mkutf8data.c is the source for a program that generates utf8data.h, which
> contains the trie that utf8norm.c uses. The trie is generated from the
> Unicode 7.0.0 data files. The format of the utf8data[] table is described
> in utf8norm.c.
>
> Supporting functions for UTF-8 normalization are in utf8norm.c with the
> header utf8norm.h. Two normalization forms are supported: nfkdi and nfkdicf.
>
> nfkdi:
> - Apply unicode normalization form NFKD.
> - Remove any Default_Ignorable_Code_Point.
>
> nfkdicf:
> - Apply unicode normalization form NFKD.
> - Remove any Default_Ignorable_Code_Point.
> - Apply a full casefold (C + F).
>
> For the purposes of the code, a string is valid UTF-8 if:
>
> - The values encoded are 0x1..0x10FFFF.
> - The surrogate codepoints 0xD800..0xDFFF are not encoded.
> - The shortest possible encoding is used for all values.
>
> The supporting functions work on null-terminated strings (utf8 prefix) and
> on length-limited strings (utf8n prefix).
>
> Signed-off-by: Olaf Weber <olaf@xxxxxxx>
>
> ---
> [v2: the trie is now separated into utf8norm.ko;
>      utf8version is now a function and exported;
>      introduced CONFIG_XFS_UTF8. -bpm]
> ---
>  fs/xfs/Kconfig               |    8 +
>  fs/xfs/Makefile              |    2 +-
>  fs/xfs/utf8norm/Makefile     |   37 +
>  fs/xfs/utf8norm/mkutf8data.c | 3239 ++++++++++++++++++++++++++++++++++++++++++
>  fs/xfs/utf8norm/utf8norm.c   |  649 +++++++++
>  fs/xfs/utf8norm/utf8norm.h   |  116 ++

Again, nothing XFS specific here. It's being built as a separate
module and the only things XFS uses are the exported functions, so it
really should be generic library code....

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
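
For reference, the three validity rules quoted from the patch description can be sketched as a standalone check. This is only an illustration of those rules, not code from utf8norm.c: the function name utf8n_valid_sketch is hypothetical, and the real implementation walks a generated trie rather than decoding each sequence like this. It follows the "utf8n" convention of taking an explicit length.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of utf8norm.c): returns 1 if the len
 * bytes at s satisfy the three validity rules from the patch
 * description, 0 otherwise.
 */
static int utf8n_valid_sketch(const unsigned char *s, size_t len)
{
	size_t i = 0;

	while (i < len) {
		unsigned char b = s[i];
		uint32_t cp, min;
		size_t n, k;

		/* Determine sequence length and smallest value it may encode. */
		if (b < 0x80) {
			cp = b;
			n = 1;
			min = 0x1;	/* 0x0 is outside 0x1..0x10FFFF */
		} else if ((b & 0xE0) == 0xC0) {
			cp = b & 0x1F;
			n = 2;
			min = 0x80;
		} else if ((b & 0xF0) == 0xE0) {
			cp = b & 0x0F;
			n = 3;
			min = 0x800;
		} else if ((b & 0xF8) == 0xF0) {
			cp = b & 0x07;
			n = 4;
			min = 0x10000;
		} else {
			return 0;	/* stray continuation or invalid lead byte */
		}

		if (n > len - i)
			return 0;	/* sequence truncated by the length limit */

		/* Fold in the continuation bytes (each must be 10xxxxxx). */
		for (k = 1; k < n; k++) {
			if ((s[i + k] & 0xC0) != 0x80)
				return 0;
			cp = (cp << 6) | (s[i + k] & 0x3F);
		}

		if (cp < min)
			return 0;	/* not the shortest encoding (or 0x0) */
		if (cp > 0x10FFFF)
			return 0;	/* beyond the Unicode range */
		if (cp >= 0xD800 && cp <= 0xDFFF)
			return 0;	/* surrogate codepoint */

		i += n;
	}
	return 1;
}
```

Under these rules the two-byte sequence C0 80 is rejected as an overlong encoding, ED A0 80 is rejected as the surrogate 0xD800, and F4 8F BF BF is accepted as 0x10FFFF.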