Re: Allocation strategy - dynamic zone for small files

On Mon, 2006-11-13 at 16:57 -0700, Andreas Dilger wrote:
> On Nov 14, 2006  00:32 +0100, Ihar `Philips` Filipau wrote:
> > As the person who threw in the idea, I feel a bit responsible. So here are
> > my results from a primitive script (bear with my bashisms) on my plain
> > Debian/unstable with 123k files on a 10GB partition with ext3, default
> > 8K block.
> > 
> > Script to count small files:
> > -+-
> > #!/bin/bash
> > find / -xdev 2>/dev/null | wc -l
> > find / -xdev -\( $(seq -f '-size %gc -o' 1 63) -false -\) 2>/dev/null | wc -l
> > find / -xdev -\( $(seq -f '-size %gc -o' 64 128) -false -\) 2>/dev/null | wc -l
> > -+-
> > The first line counts all files on the root fs, the second counts files
> > of 1-63 bytes, and the third files of 64-128 bytes. (The '-xdev' option
> > tells find to stay on the same fs, excluding proc, sys, tmp and so on.)
> > 
> > And on my system counts are:
> > -+-
> > 107313
> > 8302
> > 2618
> > -+-
> > 
> > So about 10.2% of all files are small files of 128 bytes or less (7.7%
> > are 63 bytes or less).
> > 
> > [ Results for /etc: 1712, 666, 143 (plus 221 files in the 129-512 byte
> > range), so small files make up more than half of /etc. ]
> 
> Note that using the root filesystem is a skewed result (esp. on GTK systems
> where lots of single-valued files are used by gconf).  Many root filesystems
> using ext3 are formatted with 1kB blocks for this reason.  Also gather stats
> for other filesystems.
> 
> At the filesystem summit we DID find a surprising number of small files
> even when the whole system was examined.  We discussed storing small
> files directly in the inode along with other EAs (this would require
> larger inodes).  This improves data locality and performance (i.e. stat
> of the file loads the small file data into cache), though the assumption
> is that there will be an increasing number of EAs on files in the future.
> It also avoids the issues w.r.t. packing data from different files into
> the same block when those files have different lifespans, etc.

I would agree that if the focus is on files that are 128 bytes or
smaller, storing the data in the inode makes the most sense.  I don't
think any kind of tail merging is worth the complexity unless you
expect a large number of small files that are too big to practically
fit in the inode, but small enough that it is worth doing something to
store them efficiently.  Symbolic links have been stored in the inode
for a long time.
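
For anyone gathering the same statistics on other filesystems, a single
pass may be quicker than walking the tree three times.  The following is
only a sketch, assuming GNU find (for -printf) and awk; the function name
count_small_files is made up for illustration.  Unlike the script quoted
above, it counts only regular files (-type f), so directories and symbolic
links are left out of every bucket:

```bash
#!/bin/bash
# Hypothetical helper: count regular files under a directory in one pass.
# Prints three numbers: total files, files of 1-63 bytes, files of 64-128 bytes.
count_small_files() {
    # -xdev keeps find on one filesystem, -type f selects regular files only,
    # and -printf '%s\n' (a GNU find extension) prints each size in bytes.
    find "$1" -xdev -type f -printf '%s\n' 2>/dev/null |
        awk '{ total++ }
             $1 >= 1  && $1 <= 63  { tiny++ }
             $1 >= 64 && $1 <= 128 { small++ }
             END { printf "%d %d %d\n", total, tiny, small }'
}
```

Calling it as "count_small_files /etc" prints the three counts on one
line.  Because symlinks are skipped, the small-file numbers will likely
come out lower than with the quoted script, which matched them by size
as well.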

-- 
David Kleikamp
IBM Linux Technology Center
