On Thu, Oct 13, 2011 at 04:57:22AM +0800, Jan Kara wrote:
> Writeback of an inode can be stalled by things like internal fs locks being
> held. So in case we didn't write anything during a pass through b_io list,
> just wait for a moment and try again. When retrying is fruitless for a long
> time, or we have some other work to do, we just stop current work to avoid
> blocking flusher thread.
>
> CC: Christoph Hellwig <hch@xxxxxxxxxxxxx>
> Reviewed-by: Wu Fengguang <fengguang.wu@xxxxxxxxx>
> Signed-off-by: Jan Kara <jack@xxxxxxx>
> ---
>  fs/fs-writeback.c |   39 +++++++++++++++++++++++++++------------
>  1 files changed, 27 insertions(+), 12 deletions(-)
>
> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index 04cf3b9..b619f3a 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -699,8 +699,11 @@ static long wb_writeback(struct bdi_writeback *wb,
>  	unsigned long wb_start = jiffies;
>  	long nr_pages = work->nr_pages;
>  	unsigned long oldest_jif;
> -	struct inode *inode;
>  	long progress;
> +	long pause = 1;
> +	long max_pause = dirty_writeback_interval ?
> +			msecs_to_jiffies(dirty_writeback_interval * 10) :
> +			HZ;

It's better not to put the flusher to sleep for more than 10ms, so that
when the condition changes, we don't risk leaving the storage idle for
too long.

So let's distinguish between accumulated and one-shot max pause time
in the code below?

The other changes look fine to me.

Thanks,
Fengguang

>  	oldest_jif = jiffies;
>  	work->older_than_this = &oldest_jif;
> @@ -755,25 +758,37 @@ static long wb_writeback(struct bdi_writeback *wb,
>  		 * mean the overall work is done. So we keep looping as long
>  		 * as made some progress on cleaning pages or inodes.
>  		 */
> -		if (progress)
> +		if (progress) {
> +			pause = 1;
>  			continue;
> +		}
>  		/*
>  		 * No more inodes for IO, bail
>  		 */
>  		if (list_empty(&wb->b_more_io))
>  			break;
>  		/*
> -		 * Nothing written. Wait for some inode to
> -		 * become available for writeback. Otherwise
> -		 * we'll just busyloop.
> +		 * Nothing written (some internal fs locks were unavailable or
> +		 * inode was under writeback from balance_dirty_pages() or
> +		 * similar conditions).
>  		 */
> -		if (!list_empty(&wb->b_more_io))  {
> -			trace_writeback_wait(wb->bdi, work);
> -			inode = wb_inode(wb->b_more_io.prev);
> -			spin_lock(&inode->i_lock);
> -			inode_wait_for_writeback(inode, wb);
> -			spin_unlock(&inode->i_lock);
> -		}
> +		/* If there's some other work to do, proceed with it... */
> +		if (!list_empty(&wb->bdi->work_list) ||
> +		    (!work->for_background && over_bground_thresh()))
> +			break;
> +		/*
> +		 * Wait for a while to avoid busylooping unless we waited for
> +		 * so long it does not make sense to retry anymore.
> +		 */
> +		if (pause > max_pause)
> +			break;
> +		trace_writeback_wait(wb->bdi, work);
> +		spin_unlock(&wb->list_lock);
> +		__set_current_state(TASK_INTERRUPTIBLE);
> +		schedule_timeout(pause);
> +		if (pause < max_pause)
> +			pause <<= 1;
> +		spin_lock(&wb->list_lock);
>  	}
>  	spin_unlock(&wb->list_lock);
>
> --
> 1.7.1
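
To make the review comment concrete: one possible reading of "accumulated
vs. one-shot max pause" is to cap each individual sleep at a small value
while separately bounding the total time slept since the last progress.
The following is only a user-space sketch of that idea, not the patch and
not kernel code. struct backoff and the backoff_*() helpers are
hypothetical names, and pauses are counted in abstract ticks (think
jiffies).

```c
#include <assert.h>

/*
 * Hypothetical user-space sketch; none of these names exist in
 * fs/fs-writeback.c. Pauses are in abstract ticks.
 */
struct backoff {
	long pause;        /* length of the next single sleep */
	long waited;       /* total ticks slept since the last progress */
	long one_shot_max; /* cap on any single sleep */
	long total_max;    /* cap on accumulated sleep before giving up */
};

static void backoff_init(struct backoff *b, long one_shot_max, long total_max)
{
	assert(one_shot_max >= 1 && total_max >= 1);
	b->pause = 1;
	b->waited = 0;
	b->one_shot_max = one_shot_max;
	b->total_max = total_max;
}

/* Call when a pass made progress: start over from the shortest pause. */
static void backoff_reset(struct backoff *b)
{
	b->pause = 1;
	b->waited = 0;
}

/*
 * Returns the number of ticks to sleep now, or 0 once the accumulated
 * budget is used up and the caller should stop retrying.
 */
static long backoff_next(struct backoff *b)
{
	long sleep;

	if (b->waited >= b->total_max)
		return 0;

	/* Never sleep past the remaining accumulated budget. */
	sleep = b->pause;
	if (sleep > b->total_max - b->waited)
		sleep = b->total_max - b->waited;
	b->waited += sleep;

	/* Double the next pause, but never beyond the one-shot cap. */
	if (b->pause < b->one_shot_max) {
		b->pause <<= 1;
		if (b->pause > b->one_shot_max)
			b->pause = b->one_shot_max;
	}
	return sleep;
}
```

In a retry loop, the caller would sleep for backoff_next() ticks after a
pass that wrote nothing, break out when it returns 0, and call
backoff_reset() after any pass that made progress. How the two caps should
be chosen (for example a one-shot cap near 10ms and an accumulated cap
derived from dirty_writeback_interval) is left open here.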