On Tue, Jul 02, 2024 at 07:22:25AM +0200, Christoph Hellwig wrote:
> On Mon, Jul 01, 2024 at 05:58:10PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@xxxxxxxxxx>
> >
> > I missed a few non-rendering code points in the "zero width"
> > classification code.  Add them now, and sort the list.
> >
> > $ wget https://www.unicode.org/Public/UCD/latest/ucd/UnicodeData.txt
> > $ grep -E '(zero width|invisible|joiner|application)' -i UnicodeData.txt
>
> Should this be automated?

That will require a bit more thought -- many distro build systems these
days operate in a sealed box with no network access, so you can't really
automate this.  libicu (the last time I looked) didn't have a predicate
to tell you if a particular code point was one of the invisible ones.

Here's what I get from running the commands right now:

009F;<control>;Cc;0;BN;;;;;N;APPLICATION PROGRAM COMMAND;;;;
034F;COMBINING GRAPHEME JOINER;Mn;0;NSM;;;;;N;;;;;
200B;ZERO WIDTH SPACE;Cf;0;BN;;;;;N;;;;;
200C;ZERO WIDTH NON-JOINER;Cf;0;BN;;;;;N;;;;;
200D;ZERO WIDTH JOINER;Cf;0;BN;;;;;N;;;;;
2060;WORD JOINER;Cf;0;BN;;;;;N;;;;;
2061;FUNCTION APPLICATION;Cf;0;BN;;;;;N;;;;;
2062;INVISIBLE TIMES;Cf;0;BN;;;;;N;;;;;
2063;INVISIBLE SEPARATOR;Cf;0;BN;;;;;N;;;;;
2064;INVISIBLE PLUS;Cf;0;BN;;;;;N;;;;;
2D7F;TIFINAGH CONSONANT JOINER;Mn;9;NSM;;;;;N;;;;;
FEFF;ZERO WIDTH NO-BREAK SPACE;Cf;0;BN;;;;;N;BYTE ORDER MARK;;;;
1107F;BRAHMI NUMBER JOINER;Mn;9;NSM;;;;;N;;;;;
11A47;ZANABAZAR SQUARE SUBJOINER;Mn;9;NSM;;;;;N;;;;;
11A99;SOYOMBO SUBJOINER;Mn;9;NSM;;;;;N;;;;;
11F42;KAWI CONJOINER;Mn;9;NSM;;;;;N;;;;;
13430;EGYPTIAN HIEROGLYPH VERTICAL JOINER;Cf;0;L;;;;;N;;;;;
13431;EGYPTIAN HIEROGLYPH HORIZONTAL JOINER;Cf;0;L;;;;;N;;;;;

The five-digit ones are new to me; I'll go have a look at how the
noto(fu) fonts render those.

> Otherwise looks good:
>
> Reviewed-by: Christoph Hellwig <hch@xxxxxx>

Thanks!

--D
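
For anyone who wants to repeat the search in a sealed build box, here is
a rough Python sketch of the same filter as the grep above, run against
lines that are already on disk instead of fetched with wget.  The name
find_candidates and the SAMPLE list are made up for illustration (the
LATIN CAPITAL LETTER A record is only there as a non-match); none of
this is part of the patch, and the result is still only a list of
candidates that someone has to review by hand.

```python
import re

# Same keywords as the grep in the message above, matched case-insensitively
# against the whole record, so legacy names in a later field (e.g.
# "APPLICATION PROGRAM COMMAND" for U+009F) match as well.
PATTERN = re.compile(r"(zero width|invisible|joiner|application)", re.IGNORECASE)


def find_candidates(lines):
    """Return (code_point, name, general_category) for each matching record.

    `lines` is any iterable of UnicodeData.txt records (semicolon-separated).
    This is a hypothetical helper for illustration only.
    """
    hits = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or not PATTERN.search(line):
            continue
        fields = line.split(";")
        # Field 0 is the hex code point, field 1 the name, field 2 the
        # general category (Cf, Mn, Cc, ...).
        hits.append((int(fields[0], 16), fields[1], fields[2]))
    return hits


# A few records for demonstration; three are copied from the output above.
SAMPLE = [
    "009F;<control>;Cc;0;BN;;;;;N;APPLICATION PROGRAM COMMAND;;;;",
    "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;",
    "200B;ZERO WIDTH SPACE;Cf;0;BN;;;;;N;;;;;",
    "13430;EGYPTIAN HIEROGLYPH VERTICAL JOINER;Cf;0;L;;;;;N;;;;;",
]

if __name__ == "__main__":
    for cp, name, cat in find_candidates(SAMPLE):
        print(f"U+{cp:04X} {cat} {name}")
```

To run it over the full database, pass an open file object for a local
copy of UnicodeData.txt in place of SAMPLE.  Like the grep, this matches
on names only, so it says nothing about how a font actually renders the
code point.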