On Wed, Sep 01, 2010 at 01:30:41AM +0200, Michael Monnerie wrote:
> I'm just trying the delaylog mount option on a filesystem (LVM over
> 2x 2TB 4K sector drives), and I see this while running 8 processes
> of "rm -r * 2>/dev/null &":
> 
> Device:  rrqm/s  wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz  await  svctm  %util
> sdc        2,80   33,40  125,00   64,60   720,00   939,30    17,50     0,55   2,91   1,71  32,40
> sdd        0,00   25,60  122,80   63,40   662,40   874,40    16,51     0,52   2,77   1,96  36,54
> dm-0       0,00    0,00  250,60  123,00  1382,40  1941,70    17,79     1,64   4,39   1,74  65,08
> 
> Then I issue "sync", and utilisation increases:
> 
> Device:  rrqm/s  wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz  await  svctm  %util
> sdc        0,00    0,20   15,80  175,40    84,00  2093,30    22,78     0,62   3,26   2,93  55,94
> sdd        0,00    1,00   13,40  177,60    79,20  2114,10    22,97     0,69   3,63   3,34  63,80
> dm-0       0,00    0,00   29,20  101,20   163,20  4207,40    67,03     1,11   8,51   7,56  98,60
> 
> This is reproducible.

You're probably getting RMW cycles on inode writeback. I've been
noticing this lately with my benchmarking - the VM is being _very
aggressive_ about reclaiming page cache pages versus inode caches,
and as a result the inode buffers used for IO are being reclaimed
between the time the inodes are created and the time they are
written back. Hence you get lots of reads occurring during inode
writeback.

By issuing a sync, you flush out all the pending inode writeback,
and all the RMW cycles go away. As a result, there is more disk
throughput available for the unlink processes. There is a good
chance this is what is happening, as the number of reads after the
sync drops by an order of magnitude...

> Now it can be that the sync just causes more writes and stalls
> reads, so overall it's slower, but I'm wondering why none of the
> devices shows "100% util", which should be the case on deletes. Or
> is this again the "mistake" in the utilisation calculation where
> writes do not really show up?

You're probably CPU bound, not IO bound.

> I know I should have benchmarked and tested; I just wanted to get
> some eyes on this as it's possible there's something to optimize.
> 
> Another strange thing: after the 8 "rm -r" finished, there were
> some subdirs left over that hadn't been deleted - running one
> "rm -r" cleaned them out. Could that be a problem with "delaylog"?

Unlikely - files not being deleted is not a function of the way
transactions are written to disk. It's a function of whether the
operation was performed or not.

> Or can that happen when several "rm" compete in the same dirs?

Most likely.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
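
P.S. Both theories are easy to check from userspace while the test
is running. A rough sketch using the standard sysstat tools (the
directory names are just placeholders for however your test dirs are
laid out):

    # terminal 1: the 8 parallel unlinkers, as in your test
    for i in 1 2 3 4 5 6 7 8; do rm -r "dir$i" 2>/dev/null & done

    # terminal 2: extended per-device stats every 5s - the r/s
    # column shows the reads caused by inode buffer RMW
    iostat -x -k 5

    # terminal 3: per-CPU utilisation - rm processes pegging their
    # CPUs (mostly in %sys) means CPU bound, not IO bound
    mpstat -P ALL 5

    # flush the dirty inodes; if RMW is the cause, r/s in iostat
    # should drop by an order of magnitude afterwards
    sync

If the reads vanish after the sync while %util stays well short of
100%, that points at inode buffer RMW plus a CPU-bound workload
rather than a disk bottleneck.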