Re: inconsistent file placement

On Mon, Jul 05, 2010 at 06:49:34PM -0700, Daniel Taylor wrote:
> I realize that it is generally not a good idea to tune
> an operating system, or subsystem, for benchmarking, but
> there's something that I don't understand about ext[234]
> that is badly affecting our product.  File placement on
> newly-created file systems is inconsistent.  I can't,
> yet, call it a bug, but I really need to understand what
> is happening, and I cannot find, in the source code, the
> source of the randomization (related to "goal"???).

In ext3, it really is random.  The randomness you're looking for can
be found in fs/ext3/ialloc.c:find_group_orlov(), when it calls
get_random_bytes().  This is responsible for "spreading" directories
across the block groups, to try to prevent fragmented files.  Yes, if
all you care about is benchmarks which only use 10% of the entire file
system, and which don't adequately simulate file system aging, the
algorithms in ext3 will cause a lot of variability.
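
As a rough illustration of why a freshly created file system gives a
different layout on every run, here is a toy model in Python.  It is
not the kernel code: pick_group_random_start() and the free_inodes
list are made up for this sketch, and the real find_group_orlov()
weighs more factors than a single free-inode average.

```python
import random


def pick_group_random_start(free_inodes, rng):
    """Toy model: begin scanning at a random block group and return the
    first group whose free-inode count is at least the average.

    free_inodes is a hypothetical per-group counter list, and rng is a
    random.Random instance standing in for get_random_bytes().
    """
    ngroups = len(free_inodes)
    avg = sum(free_inodes) / ngroups
    start = rng.randrange(ngroups)
    for i in range(ngroups):
        group = (start + i) % ngroups
        if free_inodes[group] >= avg:
            return group
    # Not reached in practice: the fullest group is always >= average.
    return start
```

On an empty file system every group passes the test, so the group
chosen is simply the random starting point, which is where the
run-to-run variability comes from.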

Yes, if you use FAT-style algorithms which try to use the first free
inode, and first free block which is available, for the purposes of
competitive benchmarking (especially if the benchmarks are crap), you
can probably win against the competition.  Unfortunately, long-term
your product will be far more likely to suffer from file system
aging as the blocks at the beginning of the file system become badly
fragmented.  Please don't do that, though (or, if you must, please
have a switch so that users can switch it from "competitive
benchmarking mode" to "friendly to real life users" mode).
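
To make the aging problem concrete, here is a small sketch of a
first-free block allocator in Python.  It is a toy model only: the
bitmap is a plain list of booleans and both function names are
invented for this example.

```python
def first_fit_allocate(bitmap, nblocks):
    """Take the lowest-numbered free blocks, wherever they happen to be.

    bitmap is a list of booleans (True = in use) and is updated in place.
    Returns the list of block numbers handed out.
    """
    got = []
    for i, used in enumerate(bitmap):
        if not used:
            bitmap[i] = True
            got.append(i)
            if len(got) == nblocks:
                break
    return got


def count_extents(blocks):
    """Count contiguous runs in an ascending list of block numbers."""
    return sum(
        1 for i, b in enumerate(blocks) if i == 0 or b != blocks[i - 1] + 1
    )
```

Take a 12-block disk, write four 2-block files, then delete the first
and the third.  A 6-block file allocated afterwards lands in blocks
0, 1, 4, 5, 8 and 9, which is three separate extents instead of one.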

Ext4 uses very different algorithms, and it's not strictly speaking
random since it uses a cut-down md4 hash of the directory name to
decide where to place the directory inode (and the location of the
directory inode affects both the files created in that directory and
the blocks allocated to those files, as in ext3).  So as long as
the directory hash seed in the superblock stays constant, and the
directory and file names created stay constant, the inode and block
layout will also be consistent.
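
The reproducibility argument can be sketched in a few lines of
Python.  This is an illustration under loud assumptions, not the ext4
code: SHA-256 stands in for the cut-down md4 hash, and
pick_group_hashed(), its seed argument and the modulo mapping to a
group number are all invented for the example.

```python
import hashlib


def pick_group_hashed(name, seed, ngroups):
    """Map a directory name to a block group via a seeded hash.

    Assumption: SHA-256 is used here purely as a stand-in hash; the
    point is only that the result depends on nothing but the seed,
    the name and the number of groups.
    """
    digest = hashlib.sha256(seed + name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "little") % ngroups
```

Calling pick_group_hashed("results", b"fixed-seed", 128) twice gives
the same group both times, whereas a different seed or a different
name will generally give a different one.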

All of this having been said, it may very well be possible to improve
on the anti-fragmentation algorithms while still trying to allocate
block groups closer to the beginning of the disk to take advantage of
the inner-diameter/outer-diameter placement effect.  There's probably
room for some research work here.  But please do be careful before
twiddling too much with the allocator algorithms; they are somewhat
subtle....

						- Ted