On Tue, Sep 07, 2010 at 08:20:07AM +0200, Michael Monnerie wrote:
> On Tuesday, 7. September 2010 Dave Chinner wrote:
> > # mkfs.xfs -n size=64k
> > (-n = naming = directories. -d = data != directories)
>
> Thank you, Dave. Do I interpret that parameter right:
>
> When a new directory is created, per default it would occupy only 4KB,
> with -n size=64k would be reserved.

No, it allocates 64k blocks for the directory instead of 4k blocks.

> As the directory fills, space within
> that block will be used, so in the default case after 4KB (how many
> inodes would that be roughly? 256 Bytes/Inode, so 16 entries?) XFS would
> reserve the next block, but in your case 256 entries would fit.

Inodes are not stored in the directory structure, only the directory
entry name and the inode number. Hence the amount of space used by a
directory entry is determined by the length of the name.

> That would keep dir fragmentation lower, and with today's disks, take
> minimally more space, so it sounds very good to use that option.
> Especially with RAIDs, where stripes usually are 64KB or bigger. Or
> would the waste of space be so big that it could hurt?

Well, there is extra overhead to allocate large directory blocks (16
pages instead of one, to begin with, then there's the vmap overhead,
etc), so for small directories smaller block sizes are faster for
create and unlink operations.

For empty directories, operations on 4k block sized directories consume
roughly 50% less CPU than 64k block size directories. The 4k block size
directories consume less CPU out to roughly 1.5 million entries, where
the two are roughly equal. At directory sizes of 10 million entries,
64k directory block operations consume about 15% of the CPU that 4k
directory block operations consume.

In terms of lookups, the 64k block directory will take less IO but
consume more CPU for a given lookup. Hence it depends on your IO
latency and whether directory readahead can hide that latency as to
which will be faster. e.g. for SSDs, CPU usage might be the limiting
factor, not the IO. Right now I don't have any numbers on what the
difference might be - I'm getting 1B inode population issues worked out
first before I start on measuring cold cache lookup times on 1B
files....

> Last question: Is there a way to set that option on a given XFS?

No, it is a mkfs time parameter, though we have been discussing the
possibility of being able to set it per-directory (at mkdir time when
no blocks have been allocated).

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
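
As an illustration of the mkfs-time option discussed above, here is a
minimal sketch of creating a filesystem with 64k directory blocks and
checking the result. The device /dev/sdb1 and mount point /mnt/scratch
are placeholders, and the exact xfs_info output layout may vary between
xfsprogs versions:

# mkfs.xfs -f -n size=64k /dev/sdb1
# mount /dev/sdb1 /mnt/scratch
# xfs_info /mnt/scratch | grep naming
naming   =version 2              bsize=65536

The bsize value on the "naming" line is the directory block size, so
65536 confirms 64k directory blocks; a default filesystem would instead
report a directory bsize equal to the filesystem block size (typically
4096).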