RE: [PATCH RFC v6 04/11] unicode: reduce the size of utf8data[]

From: Theodore Ts'o
> On Mon, Mar 18, 2019 at 04:27:38PM -0400, Gabriel Krisman Bertazi wrote:
> > From: Olaf Weber <olaf@xxxxxxx>
> >
> > Remove the Hangul decompositions from the utf8data trie, and do
> > algorithmic decomposition to calculate them on the fly. To store
> > the decomposition the caller of utf8lookup()/utf8nlookup() must
> > provide a 12-byte buffer, which is used to synthesize a leaf with
> > the decomposition. Trie size is reduced from 245kB to 90kB.
> 
> I'm seeing sizes much smaller; the actual utf8data[] array is 63,584.
> And size utf8-norm.o reports:
> 
>    text	   data	    bss	    dec	    hex	filename
>   68752	     96	      0	  68848	  10cf0	fs/unicode/utf8-norm.o
> 
> Were you measuring the size of the utf8-norm.o file?  That will vary
> in size depending on whether debugging symbols are enabled, etc.
> 
>    		     	     	       	       - Ted

These numbers came from the size of the utf8data[] array as reported in
utf8data.h, and were correct for the NFKDI + NFKDICF normalizations with
Unicode 9. The later switch to NFDI + NFDICF reduced the size, and it
looks like the commit message was not updated to account for that.
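
For anyone following along, the algorithmic decomposition mentioned in the
quoted commit message is the standard Hangul arithmetic from the Unicode
Standard (section 3.12). The sketch below illustrates only that arithmetic;
the function name, constants' spelling, and the small test driver are made
up for illustration and are not the utf8lookup()/utf8nlookup() interface
used by the patch.

/*
 * Standalone sketch of algorithmic Hangul syllable decomposition
 * (The Unicode Standard, section 3.12).  Constants come from the
 * standard; names and driver are illustrative only.
 */
#include <stdio.h>

#define SBASE	0xAC00
#define LBASE	0x1100
#define VBASE	0x1161
#define TBASE	0x11A7
#define LCOUNT	19
#define VCOUNT	21
#define TCOUNT	28
#define NCOUNT	(VCOUNT * TCOUNT)	/* 588 */
#define SCOUNT	(LCOUNT * NCOUNT)	/* 11172 */

/*
 * Decompose a precomposed Hangul syllable into two or three jamo.
 * Returns the number of jamo written, or 0 if cp is not a Hangul
 * syllable and must be looked up in the trie instead.
 */
static int hangul_decompose(unsigned int cp, unsigned int jamo[3])
{
	unsigned int si;

	if (cp < SBASE || cp >= SBASE + SCOUNT)
		return 0;

	si = cp - SBASE;
	jamo[0] = LBASE + si / NCOUNT;			/* leading consonant */
	jamo[1] = VBASE + (si % NCOUNT) / TCOUNT;	/* vowel */
	if (si % TCOUNT == 0)
		return 2;				/* LV syllable */
	jamo[2] = TBASE + si % TCOUNT;			/* trailing consonant */
	return 3;					/* LVT syllable */
}

int main(void)
{
	unsigned int jamo[3];
	int i, n = hangul_decompose(0xD55C, jamo);	/* U+D55C HANGUL SYLLABLE HAN */

	for (i = 0; i < n; i++)
		printf("U+%04X ", jamo[i]);
	printf("\n");		/* prints: U+1112 U+1161 U+11AB */
	return 0;
}

Each jamo in the U+1100..U+11FF block encodes to three bytes of UTF-8, so a
full LVT decomposition is nine bytes, which is presumably why the 12-byte
caller-provided buffer mentioned in the commit message is enough to
synthesize a leaf for it.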

Olaf



