Re: [PATCH, RFC] vm: Add an tuning knob for vm.max_writeback_pages

On Mon, Aug 31 2009, Theodore Tso wrote:
> On Mon, Aug 31, 2009 at 12:47:49PM +0200, Jens Axboe wrote:
> > It's because ext4 writepages sets ->range_start and wb_writeback() is
> > range cyclic, so the next iteration will have the previous end point
> > as the starting point. Looks like we need to clear ->range_start in
> > wb_writeback(); the better place to do that is probably
> > fs/fs-writeback.c:generic_sync_wb_inodes(), right after the
> > writeback_single_inode() call. This, btw, should be no different than
> > the current code, weird/correct or not :-)
> 
> Thanks for pointing it out.  After staring at the code, I now believe
> this is the best fix for now.  What do other folks think?
> 
>      	    	     	       	       - Ted
> 
> commit 39cac8147479b48cd45b768d184aa6a80f23a2f7
> Author: Theodore Ts'o <tytso@xxxxxxx>
> Date:   Mon Aug 31 17:00:59 2009 -0400
> 
>     ext4: Restore wbc->range_start in ext4_da_writepages()
>     
>     To solve a lock inversion problem, we implement part of the
>     range_cyclic algorithm in ext4_da_writepages().  (See commit 2acf2c26
>     for more details.)
>     
>     As part of that change wbc->range_start was modified by ext4's
>     writepages function, which causes its callers to get confused since
>     they aren't expecting the filesystem to modify it.  The simplest fix
>     is to save and restore wbc->range_start in ext4_da_writepages.
>     
>     Signed-off-by: "Theodore Ts'o" <tytso@xxxxxxx>
> 
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index d61fb52..ff659e7 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -2749,6 +2749,7 @@ static int ext4_da_writepages(struct address_space *mapping,
>  	long pages_skipped;
>  	int range_cyclic, cycled = 1, io_done = 0;
>  	int needed_blocks, ret = 0, nr_to_writebump = 0;
> +	loff_t range_start = wbc->range_start;
>  	struct ext4_sb_info *sbi = EXT4_SB(mapping->host->i_sb);
>  
>  	trace_ext4_da_writepages(inode, wbc);
> @@ -2917,6 +2918,7 @@ out_writepages:
>  	if (!no_nrwrite_index_update)
>  		wbc->no_nrwrite_index_update = 0;
>  	wbc->nr_to_write -= nr_to_writebump;
> +	wbc->range_start = range_start;
>  	trace_ext4_da_writepages_result(inode, wbc, ret, pages_written);
>  	return ret;
>  }

I was going to suggest working on a local copy of range_start and not
touching wbc->range_start at all, but I see you pass the wbc further
down. So this looks fine to me.
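
For anyone reading along, the save-and-restore pattern from the patch can
be shown outside the kernel. This is only a minimal userspace sketch, not
the ext4 code: struct wbc_sketch, lower_layer_write() and
writepages_sketch() are hypothetical names made up for illustration, and
the 4096-byte page size is an assumption. It shows why a purely local
copy is not enough when the control structure itself is handed to lower
layers that advance the field.

```c
/* Hypothetical stand-in for struct writeback_control; only the two
 * fields this illustration needs. Not the kernel definition. */
struct wbc_sketch {
	long long range_start;	/* byte offset where writeback should begin */
	long nr_to_write;	/* pages the caller still wants written */
};

/* Hypothetical lower layer: it is handed the control structure and
 * advances range_start as it consumes pages (4096 bytes assumed). */
static void lower_layer_write(struct wbc_sketch *wbc, long pages)
{
	wbc->range_start += pages * 4096;
	wbc->nr_to_write -= pages;
}

/* Sketch of the pattern in the patch: remember the caller's
 * range_start on entry and put it back before returning, so the
 * caller never observes the intermediate values. */
static int writepages_sketch(struct wbc_sketch *wbc)
{
	long long range_start = wbc->range_start;	/* save */

	lower_layer_write(wbc, 4);
	lower_layer_write(wbc, 4);

	wbc->range_start = range_start;			/* restore */
	return 0;
}
```

After writepages_sketch() returns, the caller sees nr_to_write reduced by
the pages written while range_start holds the value it passed in, which
is the behaviour the callers in the thread expect.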

-- 
Jens Axboe
