On Thu 05-06-08 19:24:13, Aneesh Kumar K.V wrote: > > Hmm, I've just got an idea that it may be better to introduce a new flag > > for wbc like range_cont and it would mean that we start scan at > > writeback_index (we use range_start if writeback_index is not set) and > > end with range_end. That way we don't have to be afraid of interference > > with other range_cyclic users and in principle, range_cyclic is originally > > meant for other uses... > > > > something like below ?. With this ext4_da_writepages have > > pgoff_t writeback_index = 0; > ..... > if (!wbc->range_cyclic) { > /* > * If range_cyclic is not set force range_cont > * and save the old writeback_index > */ > wbc->range_cont = 1; > writeback_index = mapping->writeback_index; > mapping->writeback_index = 0; > } > ... > mpage_da_writepages(..) > .. > if (writeback_index) > mapping->writeback_index = writeback_index; > return ret; > > > > mm: Add range_cont mode for writeback. > > From: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxxxxxxx> > > Filesystems like ext4 needs to start a new transaction in > the writepages for block allocation. This happens with delayed > allocation and there is limit to how many credits we can request > from the journal layer. So we call write_cache_pages multiple > times with wbc->nr_to_write set to the maximum possible value > limitted by the max journal credits available. > > Add a new mode to writeback that enables us to handle this > behaviour. If mapping->writeback_index is not set we use > wbc->range_start to find the start index and then at the end > of write_cache_pages we store the index in writeback_index. Next > call to write_cache_pages will start writeout from writeback_index. > Also we limit writing to the specified wbc->range_end. > > > Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@xxxxxxxxxxxxxxxxxx> > --- > > include/linux/writeback.h | 1 + > mm/page-writeback.c | 10 +++++++++- > 2 files changed, 10 insertions(+), 1 deletions(-) I like it. I'm only not sure whether there cannot be two users of write_cache_pages() operating on the same mapping at the same time. Because then they could alter writeback_index under each other and that would probably result in unpleasant behavior. I think there can be two parallel calls for example from sync_single_inode() and sync_page_range(). In that case we'd need something like writeback_index inside wbc (or maybe just alter range_start automatically when range_cont is set?) so that parallel callers do no influence each other. Honza > diff --git a/include/linux/writeback.h b/include/linux/writeback.h > index f462439..0d8573e 100644 > --- a/include/linux/writeback.h > +++ b/include/linux/writeback.h > @@ -63,6 +63,7 @@ struct writeback_control { > unsigned for_writepages:1; /* This is a writepages() call */ > unsigned range_cyclic:1; /* range_start is cyclic */ > unsigned more_io:1; /* more io to be dispatched */ > + unsigned range_cont:1; > }; > > /* > diff --git a/mm/page-writeback.c b/mm/page-writeback.c > index 789b6ad..014a9f2 100644 > --- a/mm/page-writeback.c > +++ b/mm/page-writeback.c > @@ -882,6 +882,12 @@ int write_cache_pages(struct address_space *mapping, > if (wbc->range_cyclic) { > index = mapping->writeback_index; /* Start from prev offset */ > end = -1; > + } else if (wbc->range_cont) { > + if (!mapping->writeback_index) > + index = wbc->range_start >> PAGE_CACHE_SHIFT; > + else > + index = mapping->writeback_index; > + end = wbc->range_end >> PAGE_CACHE_SHIFT; > } else { > index = wbc->range_start >> PAGE_CACHE_SHIFT; > end = wbc->range_end >> PAGE_CACHE_SHIFT; > @@ -954,7 +960,9 @@ int write_cache_pages(struct address_space *mapping, > index = 0; > goto retry; > } > - if (wbc->range_cyclic || (range_whole && wbc->nr_to_write > 0)) > + if (wbc->range_cyclic || > + (range_whole && wbc->nr_to_write > 0) || > + wbc->range_cont) > mapping->writeback_index = index; > return ret; > } -- Jan Kara <jack@xxxxxxx> SUSE Labs, CR -- To unsubscribe from this list: send the line "unsubscribe linux-ext4" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html