On Fri, Feb 03, 2012 at 09:17:41PM +0000, Brian Candler wrote:
> On Fri, Feb 03, 2012 at 09:01:14PM +0000, Brian Candler wrote:
> > I created a fresh filesystem (/dev/sdh), default parameters, but
> > mounted it with inode64. Then I tar'd across my corpus of 100K files.
> > Result: files are located close to the directories they belong to,
> > and read performance zooms.
>
> Although perversely, keeping all the inodes at one end of the disk does
> increase throughput with random reads, and also under high concurrency
> loads (for this corpus of ~65GB anyway, maybe not true for a full disk).
>
> -- original results: defaults without inode64 --
>
> #p   files/sec   dd_args
>  1     43.57     bs=1024k
>  1     43.29     bs=1024k [random]
>  2     51.27     bs=1024k
>  2     48.17     bs=1024k [random]
>  5     69.06     bs=1024k
>  5     63.41     bs=1024k [random]
> 10     83.77     bs=1024k
> 10     77.28     bs=1024k [random]
>
> -- defaults with inode64 --
>
> #p   files/sec   dd_args
>  1    138.20     bs=1024k
>  1     30.32     bs=1024k [random]
>  2     70.48     bs=1024k
>  2     27.25     bs=1024k [random]
>  5     61.21     bs=1024k
>  5     35.42     bs=1024k [random]
> 10     80.39     bs=1024k
> 10     45.17     bs=1024k [random]
>
> Additionally, I see a noticeable boost in random read performance when
> using -i size=1024 in conjunction with inode64, which I'd also like to
> understand:
>
> -- inode64 *and* -i size=1024 --
>
> #p   files/sec   dd_args
>  1    141.52     bs=1024k
>  1     38.95     bs=1024k [random]
>  2     67.28     bs=1024k
>  2     42.15     bs=1024k [random]
>  5     79.83     bs=1024k
>  5     57.76     bs=1024k [random]
> 10     86.85     bs=1024k
> 10     72.45     bs=1024k [random]

Directories probably take less IO to read because they remain in
short/extent form rather than moving to leaf/node (btree) format: with
larger inodes you can fit more extent records inline in the inode. That
probably saves one IO per random read.

However, larger inodes have other downsides, such as requiring 4x as much
IO to read and write the same number of inodes when under memory pressure
(e.g. when your app is using 98% of RAM).

Basically, you are discovering how to tune your system for optimal
performance with a given set of bonnie++ parameters. Keep in mind that's
exactly what we suggest you -don't- do when tuning a filesystem:

http://xfs.org/index.php/XFS_FAQ#Q:_I_want_to_tune_my_XFS_filesystems_for_.3Csomething.3E

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
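
[For anyone reproducing this, a minimal sketch of the mkfs/mount
combination under discussion, plus an xfs_db check of a directory inode's
format. The device path /dev/sdh comes from the thread; the mount point
and the inode number below are placeholders.]

    # Larger inodes leave more room for inline (shortform) directory data
    # and extent records; inode64 lets inodes be allocated near their
    # parent directories across the whole device.
    mkfs.xfs -f -i size=1024 /dev/sdh
    mount -o inode64 /dev/sdh /mnt/test

    # ... populate the filesystem, note a directory's inode number with
    # 'ls -di /mnt/test/somedir', then unmount before using xfs_db ...
    umount /mnt/test

    # Inspect the directory inode's format read-only with xfs_db.
    # "local" (shortform) or "extents" means the directory is still cheap
    # to read; "btree" means its extent list no longer fits in the inode.
    xfs_db -r -c "inode 12345" -c "print core.format" /dev/sdh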