Re: [PATCH] mm: Avoid livelocking of WB_SYNC_ALL writeback

On Sat, Nov 06, 2010 at 10:55:48AM +0800, Wu Fengguang wrote:
> [add CC to linux-mm list]
> 
> On Sat, Nov 06, 2010 at 06:30:38AM +0800, Christoph Hellwig wrote:
> > > +	/*
> > > +	 * In WB_SYNC_ALL mode, we just want to ignore nr_to_write as
> > > +	 * we need to write everything and livelock avoidance is implemented
> > > +	 * differently.
> > > +	 */
> > > +       if (wbc.sync_mode == WB_SYNC_NONE)
> > > +               write_chunk = MAX_WRITEBACK_PAGES;
> > > +       else
> > > +               write_chunk = LONG_MAX;
> 
> Good catch!
> 
> > 
> > I think it would be useful to elaborate here on how livelock avoidance
> > is supposed to work.
> 
> It's supposed to sync files in a big loop
> 
>         for each dirty inode
>             write_cache_pages()
>                 (quickly) tag currently dirty pages
>                 (maybe slowly) sync all tagged pages
> 
> Ideally the loop should call write_cache_pages() _once_ for each inode.
> At least this is the assumption made by commit f446daaea (mm:
> implement writeback livelock avoidance using page tagging).

The above scheme relies on filesystems not skipping pages in
WB_SYNC_ALL mode. It seems necessary to add an explicit check, at
least in the -mm tree.
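
To make the tag-then-sync idea concrete, here is a rough userspace
sketch, not kernel code. Everything in it (struct toy_inode,
toy_sync_inode(), the "dirtier" that dirties a neighbouring page after
each write, the skip_idx knob) is hypothetical and exists only for
illustration. It shows that the work per inode is bounded by the pages
tagged up front, and that a tagged page the "filesystem" skips is still
dirty when the sync returns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 8

/* Hypothetical toy model of one inode's page cache (not a kernel type). */
struct toy_inode {
	bool dirty[NPAGES];	/* page currently holds unwritten data */
	bool towrite[NPAGES];	/* page was tagged for this sync pass */
};

/* Step 1: (quickly) tag the pages that are dirty right now. */
size_t toy_tag_dirty_pages(struct toy_inode *inode)
{
	size_t tagged = 0;

	for (size_t i = 0; i < NPAGES; i++) {
		inode->towrite[i] = inode->dirty[i];
		if (inode->towrite[i])
			tagged++;
	}
	return tagged;
}

/*
 * Step 2: (maybe slowly) write only the tagged pages.
 *
 * dirtier_active: after each write, a simulated concurrent writer
 *                 dirties the following page.
 * skip_idx:       index of a tagged page the "filesystem" skips
 *                 (e.g. on lock contention), or -1 for none.
 *
 * Returns the number of pages actually written.
 */
size_t toy_write_tagged_pages(struct toy_inode *inode, bool dirtier_active,
			      int skip_idx)
{
	size_t written = 0;

	for (size_t i = 0; i < NPAGES; i++) {
		if (!inode->towrite[i])
			continue;	/* dirtied after tagging: not our job */
		inode->towrite[i] = false;
		if ((int)i == skip_idx)
			continue;	/* skipped: page stays dirty */
		inode->dirty[i] = false;
		written++;
		if (dirtier_active)
			inode->dirty[(i + 1) % NPAGES] = true;
	}
	return written;
}

/* One sync pass over a single inode: tag once, then write what was tagged. */
size_t toy_sync_inode(struct toy_inode *inode, bool dirtier_active,
		      int skip_idx)
{
	size_t tagged, written;

	assert(skip_idx < NPAGES);
	tagged = toy_tag_dirty_pages(inode);
	written = toy_write_tagged_pages(inode, dirtier_active, skip_idx);
	assert(written <= tagged);	/* bounded work: no livelock */
	return written;
}
```

With pages 0-2 dirty and the dirtier active, toy_sync_inode() writes
three pages and returns, leaving the newly dirtied page 3 for a later
pass. With skip_idx set to a tagged page, that page is still dirty on
return, which is the case the check below is meant to catch.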

Thanks,
Fengguang
---
writeback: check skipped pages on WB_SYNC_ALL 

In WB_SYNC_ALL mode, filesystems are not expected to skip dirty pages on
temporary lock contention or non-fatal errors; otherwise sync() will
return without actually syncing the skipped pages. Add a check to
catch possible redirty_page_for_writepage() callers that violate this
expectation.

Signed-off-by: Wu Fengguang <fengguang.wu@xxxxxxxxx>
---
 fs/fs-writeback.c |    1 +
 1 file changed, 1 insertion(+)

--- linux-next.orig/fs/fs-writeback.c	2010-11-07 00:20:43.000000000 +0800
+++ linux-next/fs/fs-writeback.c	2010-11-07 00:29:29.000000000 +0800
@@ -527,6 +527,7 @@ static int writeback_sb_inodes(struct su
 			 * buffers.  Skip this inode for now.
 			 */
 			redirty_tail(inode);
+			WARN_ON_ONCE(wbc->sync_mode == WB_SYNC_ALL);
 		}
 		spin_unlock(&inode_lock);
 		iput(inode);

