On Fri, Aug 10, 2007 at 02:34:14PM +0800, Fengguang Wu wrote: > Properly manage the 3 queues of sb->s_dirty/s_io/s_more_io so that > - time-ordering of dirtied_when can be easily maintained > - writeback can continue from where previous run left out > > The majority work has been done by Andrew Morton and Ken Chen, > this patch just clarifies the roles of the 3 queues: > - s_dirty for io delay(up to dirty_expire_interval) > - s_io for io run(a full scan of s_io may involve multiple runs) > - s_more_io for io continuation > > The following paradigm shows the data flow. > > requeue on new scan(empty s_io) > +-----------------------------+ > | | > dirty old | | > inodes enough V | > ======> s_dirty ======> s_io | > ^ | requeue io | > | +---------------------> s_more_io > | hold back | > +----------------+----------> disk write requests > > sb->s_dirty: a FIFO queue > - s_dirty hosts not-yet-expired(recently dirtied) dirty inodes > - once expired, inodes will be moved out of s_dirty and *never put back* > (unless for some reason we have to hold on the inode for some time) > > sb->s_io and sb->s_more_io: a cyclic queue scanned for io > - on each run of generic_sync_sb_inodes(), some more s_dirty inodes may be > appended to s_io > - on each full scan of s_io, all s_more_io inodes will be moved back to s_io > - large files that cannot be synced in one run will be moved to s_more_io for > retry on next full scan In fact s_more_io is no longer necessary. We end up with a priority io-delaying queue s_dirty and a cyclic io-syncing queue s_io. They are properly decoupled. More flexible data structure can be used for s_dirty, if we want to redirty an inode with arbitrary delays. Also more priority queues can be introduced in addition to s_dirty. For example, we can designate a queue s_dirty_atime for inodes dirtied only by `atime', and sync them lazily. > inode->dirtied_when > - inode->dirtied_when is updated to the *current* jiffies on pushing into > s_dirty, and is never changed in other cases. > - time-ordering thus can be simply ensured while moving inodes between lists, > since (time order == enqueue order) > > Cc: Ken Chen <kenchen@xxxxxxxxxx> > Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> > Signed-off-by: Fengguang Wu <wfg@xxxxxxxxxxxxxxxx> > --- > fs/fs-writeback.c | 106 +++++++++++++++++++++----------------------- > 1 file changed, 52 insertions(+), 54 deletions(-) - To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html