Re: [PATCH] ext4: Fix delalloc sync hang with journal lock inversion

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu 05-06-08 19:24:13, Aneesh Kumar K.V wrote:
> >   Hmm, I've just got an idea that it may be better to introduce a new flag
> > for wbc like range_cont and it would mean that we start scan at
> > writeback_index (we use range_start if writeback_index is not set) and
> > end with range_end. That way we don't have to be afraid of interference
> > with other range_cyclic users and in principle, range_cyclic is originally
> > meant for other uses...
> > 
> 
> something like below ?. With this ext4_da_writepages have
> 
> pgoff_t writeback_index = 0;
> .....
> if (!wbc->range_cyclic) {
> 	/*
> 	 * If range_cyclic is not set force range_cont
> 	 * and save  the old  writeback_index
> 	 */
> 	wbc->range_cont = 1;
> 	writeback_index = mapping->writeback_index;
> 	mapping->writeback_index = 0;
> }
> ...
> mpage_da_writepages(..)
> ..
> if (writeback_index)
>            mapping->writeback_index = writeback_index;
> return ret;
> 
> 
> 
> mm: Add range_cont mode for writeback.
> 
> From: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxxxxxxx>
> 
> Filesystems like ext4 needs to start a new transaction in
> the writepages for block allocation. This happens with delayed
> allocation and there is limit to how many credits we can request
> from the journal layer. So we call write_cache_pages multiple
> times with wbc->nr_to_write set to the maximum possible value
> limitted by the max journal credits available.
> 
> Add a new mode to writeback that enables us to handle this
> behaviour. If mapping->writeback_index is not set we use
> wbc->range_start to find the start index and then at the end
> of write_cache_pages we store the index in writeback_index. Next
> call to write_cache_pages will start writeout from writeback_index.
> Also we limit writing to the specified wbc->range_end.
> 
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxxxxxxx>
> ---
> 
>  include/linux/writeback.h |    1 +
>  mm/page-writeback.c       |   10 +++++++++-
>  2 files changed, 10 insertions(+), 1 deletions(-)
  I like it. I'm only not sure whether there cannot be two users of
write_cache_pages() operating on the same mapping at the same time. Because
then they could alter writeback_index under each other and that would
probably result in unpleasant behavior. I think there can be two parallel
calls for example from sync_single_inode() and sync_page_range().
  In that case we'd need something like writeback_index inside wbc (or
maybe just alter range_start automatically when range_cont is set?) so that
parallel callers do no influence each other.

								Honza

> diff --git a/include/linux/writeback.h b/include/linux/writeback.h
> index f462439..0d8573e 100644
> --- a/include/linux/writeback.h
> +++ b/include/linux/writeback.h
> @@ -63,6 +63,7 @@ struct writeback_control {
>  	unsigned for_writepages:1;	/* This is a writepages() call */
>  	unsigned range_cyclic:1;	/* range_start is cyclic */
>  	unsigned more_io:1;		/* more io to be dispatched */
> +	unsigned range_cont:1;
>  };
>  
>  /*
> diff --git a/mm/page-writeback.c b/mm/page-writeback.c
> index 789b6ad..014a9f2 100644
> --- a/mm/page-writeback.c
> +++ b/mm/page-writeback.c
> @@ -882,6 +882,12 @@ int write_cache_pages(struct address_space *mapping,
>  	if (wbc->range_cyclic) {
>  		index = mapping->writeback_index; /* Start from prev offset */
>  		end = -1;
> +	} else if (wbc->range_cont) {
> +		if (!mapping->writeback_index)
> +			index = wbc->range_start >> PAGE_CACHE_SHIFT;
> +		else
> +			index = mapping->writeback_index;
> +		end = wbc->range_end >> PAGE_CACHE_SHIFT;
>  	} else {
>  		index = wbc->range_start >> PAGE_CACHE_SHIFT;
>  		end = wbc->range_end >> PAGE_CACHE_SHIFT;
> @@ -954,7 +960,9 @@ int write_cache_pages(struct address_space *mapping,
>  		index = 0;
>  		goto retry;
>  	}
> -	if (wbc->range_cyclic || (range_whole && wbc->nr_to_write > 0))
> +	if (wbc->range_cyclic ||
> +			(range_whole && wbc->nr_to_write > 0) ||
> +			wbc->range_cont)
>  		mapping->writeback_index = index;
>  	return ret;
>  }
-- 
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR
--
To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Reiser Filesystem Development]     [Ceph FS]     [Kernel Newbies]     [Security]     [Netfilter]     [Bugtraq]     [Linux FS]     [Yosemite National Park]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Samba]     [Device Mapper]     [Linux Media]

  Powered by Linux