On Fri, 2014-09-19 at 11:06 -0500, Ben Myers wrote:
> +#define AGE_NAME "DerivedAge.txt"
> +#define CCC_NAME "DerivedCombiningClass.txt"
> +#define PROP_NAME "DerivedCoreProperties.txt"
> +#define DATA_NAME "UnicodeData.txt"
> +#define FOLD_NAME "CaseFolding.txt"
> +#define NORM_NAME "NormalizationCorrections.txt"
> +#define TEST_NAME "NormalizationTest.txt"

Is there a reason why you're using multiple text-based data files (and
hand-parsing them) when there's an XML-formatted flat file available?

http://www.unicode.org/Public/UCD/latest/ucdxml/

And a second question: why does the trie need to encode "the the unicode
version in which the codepoint was assigned an interpretation"?

--
Roger Willcocks <roger@xxxxxxxxxxxxxxxx>