On Wed, 2010-09-22 at 16:44 +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@xxxxxxxxxx>
>
> Under heavy multi-way parallel create workloads, the VFS struggles
> to write back all the inodes that have been changed in age order.
> The bdi flusher thread becomes CPU bound, spending 85% of it's time
> in the VFS code, mostly traversing the superblock dirty inode list
> to separate dirty inodes old enough to flush.
>
> We already keep an index of all metadata changes in age order - in
> the AIL - and continued log pressure will do age ordered writeback
> without any extra overhead at all. If there is no pressure on the
> log, the xfssyncd will periodically write back metadata in ascending
> disk address offset order so will be very efficient.
>
> Hence we can stop marking VFS inodes dirty during transaction commit
> or when changing timestamps during transactions. This will keep the
> inodes in the superblock dirty list to those containing data or
> unlogged metadata changes.

This looks good. There is a minor typo I'll highlight
below in case you want to fix it.

> However, the timstamp changes are slightly more complex than this -
> there are a couple of places that do unlogged updates of the
> timestamps, and the VFS need to be informed of these. Hence add a
> new function xfs_trans_inode_chgtime() for transactional changes,

You actually used the name "xfs_trans_ichgtime".

> and leave xfs_ichgtime() for the non-transactional changes.

I haven't updated my cscope database, but it looks to me
like this leaves just one spot where xfs_ichgtime() is
still used: xfs_setattr(), when truncating a zero-length
file.

>
> Signed-off-by: Dave Chinner <dchinner@xxxxxxxxxx>

Reviewed-by: Alex Elder <aelder@xxxxxxx>

. . .

> diff --git a/fs/xfs/linux-2.6/xfs_iops.c b/fs/xfs/linux-2.6/xfs_iops.c
> index b1fc2a6..37918f4 100644
> --- a/fs/xfs/linux-2.6/xfs_iops.c
> +++ b/fs/xfs/linux-2.6/xfs_iops.c
> @@ -96,40 +96,63 @@ xfs_mark_inode_dirty(
>
>  /*
. . .
> +     if (xfs_ichgtime_int(ip, flags))
>               xfs_mark_inode_dirty_sync(ip);
>  }
>
>  /*
> + * Transactional inode timestamp update. requires inod to be locked and joined

s/inod /inode /

> + * to the transaction supplied. Relies on the transaction subsystem to track
. . .